A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data. dedupe is the open source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
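To make the idea of "similar records" concrete, here is a toy sketch of fuzzy record matching. It does not use dedupe's API and does not reproduce how dedupe works: dedupe learns its rules from labeled training data, whereas this sketch uses only the standard library's `difflib` with a hand-picked threshold. The function names, the sample records, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a ratio in [0, 1] describing how alike two strings are."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def likely_duplicates(records, threshold=0.8):
    """Yield (id_a, id_b, score) for pairs whose mean field similarity meets the threshold.

    `records` maps a record id to a dict of field name -> string value.
    Every pair is compared, which is quadratic in the number of records,
    so this is only suitable for a small illustrative data set.
    """
    for (id_a, rec_a), (id_b, rec_b) in combinations(records.items(), 2):
        shared = rec_a.keys() & rec_b.keys()
        if not shared:
            continue
        score = sum(similarity(rec_a[f], rec_b[f]) for f in shared) / len(shared)
        if score >= threshold:
            yield id_a, id_b, score


# Hypothetical sample data: records 1 and 2 describe the same person.
records = {
    1: {"name": "Jonathan Smith", "address": "12 Oak Street"},
    2: {"name": "Jonathon Smith", "address": "12 Oak St."},
    3: {"name": "Maria Garcia", "address": "98 Pine Avenue"},
}

pairs = [(a, b) for a, b, _ in likely_duplicates(records)]
print(pairs)
```

A fixed threshold and exhaustive pairwise comparison are exactly the parts that stop working on large or messy data, which is the gap a trained, rule-learning approach is meant to close.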

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dedupe-1.6.11.tar.gz (48.6 kB)

Uploaded Source

Built Distributions

dedupe-1.6.11-cp36-cp36m-manylinux1_x86_64.whl (74.4 kB)

Uploaded CPython 3.6m

dedupe-1.6.11-cp36-cp36m-manylinux1_i686.whl (71.2 kB)

Uploaded CPython 3.6m

dedupe-1.6.11-cp36-cp36m-macosx_10_11_x86_64.whl (50.1 kB)

Uploaded CPython 3.6m macOS 10.11+ x86-64

dedupe-1.6.11-cp35-cp35m-manylinux1_x86_64.whl (74.2 kB)

Uploaded CPython 3.5m

dedupe-1.6.11-cp35-cp35m-manylinux1_i686.whl (70.9 kB)

Uploaded CPython 3.5m

dedupe-1.6.11-cp34-cp34m-win_amd64.whl (50.8 kB)

Uploaded CPython 3.4m Windows x86-64

dedupe-1.6.11-cp34-cp34m-win32.whl (50.1 kB)

Uploaded CPython 3.4m Windows x86

dedupe-1.6.11-cp34-cp34m-manylinux1_x86_64.whl (74.4 kB)

Uploaded CPython 3.4m

dedupe-1.6.11-cp34-cp34m-manylinux1_i686.whl (71.1 kB)

Uploaded CPython 3.4m

dedupe-1.6.11-cp27-cp27mu-manylinux1_x86_64.whl (72.1 kB)

Uploaded CPython 2.7mu

dedupe-1.6.11-cp27-cp27mu-manylinux1_i686.whl (69.4 kB)

Uploaded CPython 2.7mu

dedupe-1.6.11-cp27-cp27m-win_amd64.whl (50.9 kB)

Uploaded CPython 2.7m Windows x86-64

dedupe-1.6.11-cp27-cp27m-win32.whl (50.1 kB)

Uploaded CPython 2.7m Windows x86

dedupe-1.6.11-cp27-cp27m-manylinux1_x86_64.whl (72.1 kB)

Uploaded CPython 2.7m

dedupe-1.6.11-cp27-cp27m-manylinux1_i686.whl (69.4 kB)

Uploaded CPython 2.7m

dedupe-1.6.11-cp27-cp27m-macosx_10_11_x86_64.whl (49.7 kB)

Uploaded CPython 2.7m macOS 10.11+ x86-64
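When it is unclear which built distribution fits a platform, it can help to read the tags encoded in the wheel filename. The wheel naming convention is `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a filename on that pattern; the `WheelTags` tuple and the `parse_wheel_filename` helper are hypothetical names for illustration and are not part of dedupe or pip.

```python
from collections import namedtuple

# Hypothetical container for the dash-separated parts of a wheel filename.
WheelTags = namedtuple("WheelTags", "distribution version build python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel filename into its components.

    Dashes inside a distribution name are escaped as underscores in wheel
    filenames, so splitting on "-" yields either 5 parts (no build tag) or 6.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected number of components in %r" % filename)
    return WheelTags(name, version, build, python, abi, platform)


tags = parse_wheel_filename("dedupe-1.6.11-cp36-cp36m-manylinux1_x86_64.whl")
print(tags.python, tags.abi, tags.platform)
```

Here `cp36` is the Python tag (CPython 3.6), `cp36m` the ABI tag, and `manylinux1_x86_64` the platform tag. In practice pip performs this selection automatically; reading the tags by hand is mainly useful when downloading a file manually.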

File details

Details for the file dedupe-1.6.11.tar.gz.

File metadata

  • Download URL: dedupe-1.6.11.tar.gz
  • Upload date:
  • Size: 48.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.11.tar.gz
Algorithm Hash digest
SHA256 548cec91b25f102464395edef1bd0f576b9ed2c5d8c47d2734dcad40ee627d10
MD5 2226ce8317f33252096dca7022077e10
BLAKE2b-256 ea829fbbcb5c6904dab65feda518a33ab86deaa2b7c09c123aac864bae0f3e8c

See more details on using hashes here.
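As a minimal sketch of how a published digest can be checked, the snippet below computes the SHA256 of a downloaded file with the standard library's `hashlib` and compares it with the value listed on this page. The helper names `sha256_of_file` and `matches_published_hash` are assumptions for illustration, as is the idea that the file has already been downloaded to a local path.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in fixed-size chunks avoids loading the whole file into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True when the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution listed above, a call might look like `matches_published_hash("dedupe-1.6.11.tar.gz", "548cec91b25f102464395edef1bd0f576b9ed2c5d8c47d2734dcad40ee627d10")`, assuming the archive sits in the current directory. pip can also enforce hashes itself through its hash-checking mode, which is usually preferable to a manual check.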

File details

Details for the file dedupe-1.6.11-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.11-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c6393678746d2fa8efaab2623b6f9d5bcaca69ca6be640b906146f7d62867dea
MD5 1d5edbc75176dcf153f27e1efc5a4891
BLAKE2b-256 5f91aefea2d47a61681569988e34a33e0829f1b7a14a5feba46f660388e0ded0

File details

Details for the file dedupe-1.6.11-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.11-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 0fda89bae4b0f471a2efed1b80d5697c179488ff9e90a45578e8397b71bd6cc2
MD5 8e414b14fee565a4cbd4164069672936
BLAKE2b-256 c27f2099bc44cf9f8106c8f242db23514f4e63d79f902f3b9fe3571c3e9e63a4

File details

Details for the file dedupe-1.6.11-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.11-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 298119bf282d065ee5082e06e6b087d6a16d18e0d1126e1f7f08f2228f5e0c76
MD5 a5ec51089b76d880063c8b0dcecec3be
BLAKE2b-256 4df2748248c4ef912e670b8ea32275f7594588ad0bc7a8ec9791114015a0a017

File details

Details for the file dedupe-1.6.11-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.11-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 057c79a218d81487a76afb7a5a41c1fb3ecb768f6ea5cbed299ae0516682527f
MD5 c1f109a262727c44bc5b8425f7a5f528
BLAKE2b-256 92a6cea614442e4bd27a756f0eed0955d59799eb83c5b787d83acb11fc1be2fa

File details

Details for the file dedupe-1.6.11-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.11-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 5ea8699622c527cbaa947331e57ecef272a89ecc371ba4b103c399d580b943e1
MD5 92631bc25b7a5d325b06169372b0964d
BLAKE2b-256 19512d9698b73520ad23a0b87cf6a86e4f196f59f6fa9801256e7c146e874caf

File details

Details for the file dedupe-1.6.11-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.11-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 20298eaaf90b9261ed02b07a077060e1e18bf70a8e8a71a2d1bc94e6dc2d62ff
MD5 57eb100782445dc5adf3f7e7c638812b
BLAKE2b-256 0c2dc30982b51d54967760f2e16d15fe1f90319d457d4b81dad478d74a0c3807

File details

Details for the file dedupe-1.6.11-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.11-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 576e860e8133ae2042999dead8952ca8f2eba75b2c4a8279a627b60d6238f828
MD5 595d4a8bba7c8ee29fa60d89de252720
BLAKE2b-256 9eceb019560d6fc6c4fbc5f96f3b694d00d68c5ed3108ea8f1f0f1de2761f461

File details

Details for the file dedupe-1.6.11-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.11-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 fa2a77cc955b537e402b5121da6b0e518f11dd1c0a7371ba9f45ca23ec6da84e
MD5 51fc3e2affb7c1a6522d36e86df67e54
BLAKE2b-256 f9819c190ca76855e8849df58c2c8d13c2fb1675f9508d1b24dd6c7afe483dfa

File details

Details for the file dedupe-1.6.11-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.11-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 3fa7ce73236edefb71201de79c168add7cbf4b72f732a333afd6bcd42210babf
MD5 67e20937284b47cf7698abf8724bbebf
BLAKE2b-256 67acee37e94bab5b37217fc3ee9bb24f9311ba4b20e5f02a8ce2a33140418863

File details

Details for the file dedupe-1.6.11-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.11-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 d723b3777696f125e48c5ed0dfae7271e152b692a2d66cd3cd7842b0280dd16f
MD5 01409fc6c7ad1a9cf9b08c054801944d
BLAKE2b-256 1e7f5c8d0a303ed6c13ec06be11d92f934c9d156d3e03ae8cb2859b62c8099f5

File details

Details for the file dedupe-1.6.11-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.11-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 0c90a40872cfcfaa2b8289a27532389630580649598ca8a1953340515237119a
MD5 1494fb3fa6d149f8fd669005d0e31b00
BLAKE2b-256 901a2ef01bfe816ea33cd05b36ea69c0c5f21bc7d228944803ee1ae0f82b2f3d

File details

Details for the file dedupe-1.6.11-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.11-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 6113141afdda1c0fd8631c537dbe2ad5b6a37cc4ace68f4c949691df10473f28
MD5 d5e83501ea7bcd43d6f93f26d7e8d694
BLAKE2b-256 51715eee149a8b6e9806ff8ea1cf8a4453bb91cac95415ca2742c7d74f51c41f

File details

Details for the file dedupe-1.6.11-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.11-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 18e9884297f09bd4aa6ed26ce28977cc5e529d4b4261a794c96a259f5669b8dc
MD5 678a9920892abae7adde33ad60d234e8
BLAKE2b-256 134c76c62daf585ea88e4834feaa669ecf5a958447cfab6819127d8c84ef9729

File details

Details for the file dedupe-1.6.11-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.11-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 4eed534b6579ea821e990bf05e4caabbd92c6c9f7aeb2416641002a98940a07f
MD5 07d2349e4660be6fb29b4b3bfd680dcb
BLAKE2b-256 1ce273b275e650b9442ff02d425db36bb610e4cc38f9e79f71aa309e5953345b

File details

Details for the file dedupe-1.6.11-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.11-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 6a529465879cf80626d7f9b9ad5a41fba8c6336c5e787cbad667c761dd916e1d
MD5 e29e84c7657b83d226f206f11dbc63dd
BLAKE2b-256 c98c303ef9580d8376ee9aaa7f5fdcd5bd84b9da59632dcd0d337db8b1e5c430

File details

Details for the file dedupe-1.6.11-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.11-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 50b4d4ca8155bc6cab2cfcfdef0b6503baa4d1c1b768ae2d3d97a8d35bce3095
MD5 b5a62add985779023b68051ba3ebe92b
BLAKE2b-256 28046ac4f3abdfcb22840420e177dc3a9ddb4c2b4c3b75d323f7733339af67f2
