A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. It is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data and derives the best rules for your dataset, so it can quickly and automatically find similar records, even in very large databases.
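
To make the "entered slightly differently" case concrete, here is a toy sketch of why exact string comparison misses such records. It uses only Python's standard-library difflib; it is not dedupe's API and does not reproduce its learned rules. The sample names, the helper functions and the 0.8 threshold are all made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical contributor names, typed slightly differently each time.
records = {
    1: "Jonathan Smith",
    2: "Jonathon Smith",
    3: "Maria Garcia",
}


def similarity(a, b):
    """Return a ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_duplicates(records, threshold=0.8):
    """Compare every pair of records and keep the pairs above the threshold."""
    pairs = []
    for (id_a, name_a), (id_b, name_b) in combinations(records.items(), 2):
        if similarity(name_a, name_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Records 1 and 2 differ by one letter, so an exact match would miss them.
print(likely_duplicates(records))  # prints [(1, 2)]
```

A fixed threshold and an all-pairs comparison are the simplest possible choices here. Hand-picked thresholds are brittle and comparing every pair becomes impractical as the data grows, which is the kind of problem a trained, purpose-built library is meant to address.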

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.7.3.tar.gz (49.6 kB): Source

Built Distributions

  • dedupe-1.7.3-cp36-cp36m-manylinux1_x86_64.whl (76.0 kB): CPython 3.6m
  • dedupe-1.7.3-cp36-cp36m-manylinux1_i686.whl (72.7 kB): CPython 3.6m
  • dedupe-1.7.3-cp36-cp36m-macosx_10_11_x86_64.whl (50.9 kB): CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.7.3-cp35-cp35m-manylinux1_x86_64.whl (75.8 kB): CPython 3.5m
  • dedupe-1.7.3-cp35-cp35m-manylinux1_i686.whl (72.5 kB): CPython 3.5m
  • dedupe-1.7.3-cp34-cp34m-win_amd64.whl (51.5 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.7.3-cp34-cp34m-win32.whl (50.8 kB): CPython 3.4m, Windows x86
  • dedupe-1.7.3-cp34-cp34m-manylinux1_x86_64.whl (76.0 kB): CPython 3.4m
  • dedupe-1.7.3-cp34-cp34m-manylinux1_i686.whl (72.6 kB): CPython 3.4m
  • dedupe-1.7.3-cp27-cp27mu-manylinux1_x86_64.whl (73.7 kB): CPython 2.7mu
  • dedupe-1.7.3-cp27-cp27mu-manylinux1_i686.whl (71.0 kB): CPython 2.7mu
  • dedupe-1.7.3-cp27-cp27m-win_amd64.whl (51.6 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.7.3-cp27-cp27m-win32.whl (50.8 kB): CPython 2.7m, Windows x86
  • dedupe-1.7.3-cp27-cp27m-manylinux1_x86_64.whl (73.7 kB): CPython 2.7m
  • dedupe-1.7.3-cp27-cp27m-manylinux1_i686.whl (71.0 kB): CPython 2.7m
  • dedupe-1.7.3-cp27-cp27m-macosx_10_11_x86_64.whl (50.5 kB): CPython 2.7m, macOS 10.11+ x86-64
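
The wheel file names above follow the standard wheel naming convention, name-version-pythontag-abitag-platformtag.whl, with an optional build tag after the version. As a rough aid for picking the right file, the sketch below splits a name into those parts. The function parse_wheel_filename is a hypothetical helper written for this page, not part of dedupe or pip, and it only handles the simple dash-separated form.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: %r" % filename)

    parts = filename[:-len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components: %r" % filename)

    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,      # e.g. cp36 = CPython 3.6
        "abi": abi_tag,            # e.g. cp36m
        "platform": platform_tag,  # e.g. manylinux1_x86_64
    }


info = parse_wheel_filename("dedupe-1.7.3-cp36-cp36m-manylinux1_x86_64.whl")
print(info["python"], info["abi"], info["platform"])
```

In practice pip performs this selection itself; the sketch is only meant to show what each part of the file name encodes.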

File details

Details for the file dedupe-1.7.3.tar.gz.

File metadata

  • Download URL: dedupe-1.7.3.tar.gz
  • Upload date:
  • Size: 49.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.3.tar.gz
Algorithm Hash digest
SHA256 bd1399a17981e3ebcd8fd212f623772c8f05e41c4fb531beab80a9770b3abf4a
MD5 acc165f5216498a8d1e70621c56d741c
BLAKE2b-256 904d1e01973e2f6ed8ab46536209ab82dc1981fc89579c1dd36f958b1042e093

See more details on using hashes here.
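
As one way to use the digests listed on this page, the sketch below computes the SHA256 of a downloaded file with Python's standard-library hashlib and compares it with the published value. The helper names and the local path are assumptions made for illustration; the expected digest is the SHA256 listed above for dedupe-1.7.3.tar.gz.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location of the downloaded source distribution.
    path = "dedupe-1.7.3.tar.gz"
    expected = "bd1399a17981e3ebcd8fd212f623772c8f05e41c4fb531beab80a9770b3abf4a"
    if os.path.exists(path):
        print("hash matches:", matches_published_hash(path, expected))
    else:
        print("file not found:", path)
```

The same approach works for any of the wheels by swapping in the file name and its listed SHA256 value.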

File details

Details for the file dedupe-1.7.3-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.3-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a2c18b336059b15e36c1cf014cbab3fbe15f42e83b815753083bc4511d355cdf
MD5 5bc3a19e479924217ff31a005bc05c9a
BLAKE2b-256 ac4d1333b0fec70acdb0bed75c87cf69ce765e3659c06917681fcddd9a8833d6

File details

Details for the file dedupe-1.7.3-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.3-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 7c593f9847f525e6449a9ce96ac75d9fc0dc3901e76433da1d1000bded441fc3
MD5 6b56d9eba6c000f00562e8401bef7f6b
BLAKE2b-256 b938e2c103f08cecd130be2162f2ae2825af3463368aea2b75ae3a3369dabc11

File details

Details for the file dedupe-1.7.3-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.3-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 b19577ecad601e352727a0f64c26cc901f469f824a39ce58db49580495392775
MD5 94b64c2050913b53b3003c90d3c53b95
BLAKE2b-256 a6956973f2f79867ac57034d14adb36859a60969e96656acd17ae0a69aea45a0

File details

Details for the file dedupe-1.7.3-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.3-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 b364af3a8e893c556a358d8e61c08e3665a8c14b5b119c9ad1fc19f57df89fae
MD5 5e86bc12bb86e3b71f4656023eaf9918
BLAKE2b-256 1f757f85587e4b30163b4eab9b38eef722a8e10ff679f387afe8ef738a014122

File details

Details for the file dedupe-1.7.3-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.3-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 10e648b5c0418917b2ffcd4363992b4d603586c8bcb80ffbbb06138db70024ed
MD5 559d8728fced5a7bf5a44efebc626320
BLAKE2b-256 70b19887bcc80a6fd74532f94ada09cf55c71c9824f78e3ac99c580da893b14f

File details

Details for the file dedupe-1.7.3-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.3-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 bae8b55b46b621500dac4c07d3adae82397e0381409714f71ba30b80191cbe6d
MD5 595a5ac7064f3db856d7f9104fe3cf1e
BLAKE2b-256 5bfedb1c9e98fc2e9fa29955d297cce30438b9b2e86285c5abd09125658f6225

File details

Details for the file dedupe-1.7.3-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.3-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 e4f8d5cae1b45cf4b9e1e07f73675f5112a32fb238cc05e25a72f0f184e9f929
MD5 7ae23d173248c276fa761e64e8ec1ab5
BLAKE2b-256 273f5e19b2d3f7b7965989b0a670b6cb20f73b41f2d1285dccdc84f751e9a003

File details

Details for the file dedupe-1.7.3-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.3-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 038b0207c74f1a5b8edfe058bcb42eae28c7eb95cf4a7d47b9bc88ce916427f2
MD5 3eed7d65ece949848d1b717724dab234
BLAKE2b-256 43ce9db9d6c25b4afb4dc08b0ef46235562a3e17391c3a53688097536fd71bed

File details

Details for the file dedupe-1.7.3-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.3-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 0bc3e26b6d7e8a6ffb65a6c8358df49b129c7056646a720f452dcadc175dfdb6
MD5 7b904077e3d33e9068c8f5b327c4e592
BLAKE2b-256 91be34b91ea1d9ea7f1c1742116118775c1011286b8f9482e8771e96d5cfc078

File details

Details for the file dedupe-1.7.3-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.3-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6542dae4da9a64e64f1b3ac66ec8c0c9ba20203549d0c85b68cfd31d07d8d539
MD5 008e5de7c13c54837d5594d300a1b9ed
BLAKE2b-256 68067c8373c6635e493a877ea9cd495fe8a8ff51f6da49bef3c339647120aea3

File details

Details for the file dedupe-1.7.3-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.3-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 3a9c80c0a730985f38f60a36c706f5a5193f0eb3bdb913ccb1bba2fdb00b9017
MD5 4b85b0e29bfe5f724740120fd0f0ec03
BLAKE2b-256 99a334f42c4b67e7282366ca62d7286664d8c525f4c8220ff861a5e6cedb883e

File details

Details for the file dedupe-1.7.3-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.3-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 8b6d7a1c309465b7d72859b4ebf5b694fdae09bb5bf03b3d50e011bc2e8f9c57
MD5 d8f2cdd74a8b2b15eac73b422789da42
BLAKE2b-256 78ccc7e30214907d84fd41beaca1060b3e1d2ed4ba1c6aa9217fc03c8e49ce05

File details

Details for the file dedupe-1.7.3-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.3-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 4053618c1f0dfe6ef0ba1ae08b4a0fbe70c753ee34151f2c89b16f29ff9ebace
MD5 aff6ba3da76513fcbf93cb1496f3f592
BLAKE2b-256 4f39cd506af6778cf85fccfc7230fb32ccdfac3fafbd373f690e01fdd12a9f0a

File details

Details for the file dedupe-1.7.3-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.3-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 f85297a37ec5aa5b5e830e46c204abf598dbe5bfdfa8b0eacd826b3e235b3395
MD5 cc716d6a91805824172632b3dbe341c0
BLAKE2b-256 8329ac9082b643f9efa9a1a6d335e919873fa5d83a0741e4716adaed7487e5a8

File details

Details for the file dedupe-1.7.3-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.3-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 64a041d90667f918d1548bbdf1f54a21363ca905e60df30e396bc392fb1fd0dc
MD5 31f63e96f87b6eac33120c1ee34f7806
BLAKE2b-256 397b1d3a73013c49f17b4758377d039aad1b0e310fa07b6085505751c793ac53

File details

Details for the file dedupe-1.7.3-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.3-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 a47478b25bec32cab45a0c7f7532a30e1be6c804b2ae9b3cd01b2b47143cf251
MD5 4a176f085986e9721cf30896382656ed
BLAKE2b-256 7498e1bc382ee314f82d5af1d15f5dfa7d8a5d967e1c0eeea9af5ab4b0de985c
