A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
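To make the idea of matching records that were "entered slightly differently" concrete, here is a toy sketch using only the Python standard library. It is not dedupe's API and not how dedupe works internally: dedupe learns its rules from training data, whereas this sketch uses a fixed, hand-picked similarity threshold. The function names, the `name` field, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(records, threshold=0.85):
    """Return index pairs whose 'name' fields are at least `threshold` similar.

    This compares every pair of records, so it only suits small inputs.
    The threshold is a hand-picked assumption, not a learned value.
    """
    pairs = []
    for (i, r1), (j, r2) in combinations(enumerate(records), 2):
        if similarity(r1["name"], r2["name"]) >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sample data: the first two names differ by one letter.
records = [
    {"name": "Jonathan Smith"},
    {"name": "Jonathon Smith"},
    {"name": "Maria Garcia"},
]
print(candidate_duplicates(records))  # prints [(0, 1)]
```

Comparing every pair grows quadratically with the number of records, which is one reason a library built for very large databases needs smarter rules than this brute-force loop.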

Important links:


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dedupe-1.6.7.tar.gz (47.3 kB view details)

Uploaded Source

Built Distributions

dedupe-1.6.7-cp36-cp36m-manylinux1_x86_64.whl (73.9 kB view details)

Uploaded CPython 3.6m

dedupe-1.6.7-cp36-cp36m-manylinux1_i686.whl (70.6 kB view details)

Uploaded CPython 3.6m

dedupe-1.6.7-cp36-cp36m-macosx_10_11_x86_64.whl (49.6 kB view details)

Uploaded CPython 3.6m macOS 10.11+ x86-64

dedupe-1.6.7-cp35-cp35m-manylinux1_x86_64.whl (73.7 kB view details)

Uploaded CPython 3.5m

dedupe-1.6.7-cp35-cp35m-manylinux1_i686.whl (70.4 kB view details)

Uploaded CPython 3.5m

dedupe-1.6.7-cp34-cp34m-win_amd64.whl (50.3 kB view details)

Uploaded CPython 3.4m Windows x86-64

dedupe-1.6.7-cp34-cp34m-win32.whl (49.6 kB view details)

Uploaded CPython 3.4m Windows x86

dedupe-1.6.7-cp34-cp34m-manylinux1_x86_64.whl (73.9 kB view details)

Uploaded CPython 3.4m

dedupe-1.6.7-cp34-cp34m-manylinux1_i686.whl (70.6 kB view details)

Uploaded CPython 3.4m

dedupe-1.6.7-cp27-cp27mu-manylinux1_x86_64.whl (71.6 kB view details)

Uploaded CPython 2.7mu

dedupe-1.6.7-cp27-cp27mu-manylinux1_i686.whl (68.9 kB view details)

Uploaded CPython 2.7mu

dedupe-1.6.7-cp27-cp27m-win_amd64.whl (50.4 kB view details)

Uploaded CPython 2.7m Windows x86-64

dedupe-1.6.7-cp27-cp27m-win32.whl (49.6 kB view details)

Uploaded CPython 2.7m Windows x86

dedupe-1.6.7-cp27-cp27m-manylinux1_x86_64.whl (71.5 kB view details)

Uploaded CPython 2.7m

dedupe-1.6.7-cp27-cp27m-manylinux1_i686.whl (68.9 kB view details)

Uploaded CPython 2.7m

dedupe-1.6.7-cp27-cp27m-macosx_10_11_x86_64.whl (49.2 kB view details)

Uploaded CPython 2.7m macOS 10.11+ x86-64
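The wheel filenames above follow the standard wheel naming scheme: distribution name, version, Python tag, ABI tag, and platform tag, separated by dashes. The sketch below splits a filename into those parts to help pick the right file for an interpreter. The helper name `parse_wheel_filename` and the `WheelTags` tuple are hypothetical, and the sketch assumes no optional build tag is present, which holds for every file listed here.

```python
from collections import namedtuple

# Hypothetical container for the five fields of a wheel filename.
WheelTags = namedtuple("WheelTags", "name version python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel filename into its five dash-separated fields.

    Assumes there is no optional build tag in the filename.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("expected 5 dash-separated fields, got %d" % len(parts))
    return WheelTags(*parts)


tags = parse_wheel_filename("dedupe-1.6.7-cp36-cp36m-manylinux1_x86_64.whl")
print(tags.python, tags.abi, tags.platform)  # prints cp36 cp36m manylinux1_x86_64
```

In practice pip performs this selection automatically; the sketch only shows how to read the tags by eye or in a script.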

File details

Details for the file dedupe-1.6.7.tar.gz.

File metadata

  • Download URL: dedupe-1.6.7.tar.gz
  • Upload date:
  • Size: 47.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.7.tar.gz
Algorithm Hash digest
SHA256 a6cafa1c0cd2ca87f45181672a68267455e731572443361139a1c91c55839d20
MD5 c7188ed17069102c056015bd826e5515
BLAKE2b-256 9247b130ae1df72d98895fcb639eb6f9013e70cb856281b69510fef85f343fc1

See more details on using hashes here.
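One way to use a published digest is to recompute it over the downloaded file and compare. The following is a minimal sketch with Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the commented usage assumes the archive has already been downloaded to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (assumes dedupe-1.6.7.tar.gz is in the current directory):
# matches_published_hash(
#     "dedupe-1.6.7.tar.gz",
#     "a6cafa1c0cd2ca87f45181672a68267455e731572443361139a1c91c55839d20",
# )
```

pip can also enforce digests itself through its hash-checking mode, where each requirement line carries a `--hash=sha256:...` option.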

File details

Details for the file dedupe-1.6.7-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.7-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0c5bc6df911233e5fd219556e6492c747fa5f12aecfe74ed9958632f541ee123
MD5 31e1d1518bfb0c3119d715d46437e54a
BLAKE2b-256 6e26a45a504e101cb9b1135d7387360c8d186b035291d64b7d8f233c522ad511

File details

Details for the file dedupe-1.6.7-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.7-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 091d3d55cd61c76cb70831c2d2c0b66c61ec45a62347edea971b8666b95ae827
MD5 1e7750727a9631bf4ba51953568a0818
BLAKE2b-256 1908eb4a7b292f001ccc8d08c8c6ffd6f3fa44d838cbba1f2b7d13b3a44aec51

File details

Details for the file dedupe-1.6.7-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.7-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 783cc87ce158df35c9b99a124caffc8e0dbb848dff9c00d370eea05e1a88c331
MD5 4ba3e90ad0c1107c38d69cf3fd7c8590
BLAKE2b-256 949e0191a0a1228224c9407e8b3c6e4344c127e04d2b2ad5e3588e991168065c

File details

Details for the file dedupe-1.6.7-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.7-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 9e1a6c0cd331bc8599dcef71467b17f1944d60d2ca54bbe653adad88934f78f1
MD5 780d213d910b64e3d85f1d8c7325a2cd
BLAKE2b-256 7468d045a9c2a83ba5c60dd3625804cc8fe8f83175e57d58fd5f86c95d055a5a

File details

Details for the file dedupe-1.6.7-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.7-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 0e00bcc9ad86542ebcd86873b91d13af046cc88bdbd6985d68809a2bb2704907
MD5 2569e2e16ce9ab45dfb716aab7a5dd25
BLAKE2b-256 15f35f5e58f7e784edf44f3751061714394d1c7fd425639b7a7faddd29c441e4

File details

Details for the file dedupe-1.6.7-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.7-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 0873cc738fc4be32ce99a6de53c84ce2bab5684dc1fceb99b7b035cc10d20ed0
MD5 79e154be13dcd68429b25f9f6874c899
BLAKE2b-256 dff89d083486cc29d2908608c0b5ccde5b81ba9cf06d9706639ee342e2539e67

File details

Details for the file dedupe-1.6.7-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.7-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 e33e2bcc925bc27ee260e5c71855981ef90cf217e0fd29509597f4f6082613c6
MD5 3e74d05d1a71445908b6d36db9a3fa31
BLAKE2b-256 16f8f34b2a36cfdf006282c4affa8dc8b1e6b68ca808af3b87f7b69dd52578dc

File details

Details for the file dedupe-1.6.7-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.7-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1b8d6d73d76b615cbddf0eee28dd5d99c5bcf79d0adc8e6086874af081dc3ff8
MD5 79e4e28f85e275e5b1dfc18b968791cd
BLAKE2b-256 d02a0be1314a7d1510ec2f1e99999711e9d329ebb8432e282f1e54e7a65e5d41

File details

Details for the file dedupe-1.6.7-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.7-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 11c59fa420041403cdc6de51e032121b3cb525707f683ff9857e507e3640c45a
MD5 9ae7a11d222cf08a91cb546a54f4cc75
BLAKE2b-256 0ea062fe21ae75bf349db2a1b891fbffd2336fa3c2331a93d94ba368998bcc91

File details

Details for the file dedupe-1.6.7-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.7-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 11086e0cf4377fa62b18dc412be2d7b6ef650b906b3b738b774c71b92ddce0b2
MD5 8c87b28acb6837d5918a59d90e7a1ac6
BLAKE2b-256 b603bc7bd04bd6facb18156c6501a41fa2e5f803d06af6f81a5ec70f9f0fe891

File details

Details for the file dedupe-1.6.7-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.7-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 cae81ebead1fb16a2210480e3e880b9f5a41853d1fcd97b5ac538e292ae96220
MD5 ee78742c39e149ce0ce8417373c7f60f
BLAKE2b-256 84b18d7b203ea5a6df3dc6a6fdcf19264bc91e5d0aeff0f39223cf64c86e24f8

File details

Details for the file dedupe-1.6.7-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.7-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 930b59a0522c5d0dd874c2349484a172ba39774cdc3659732d6341f1baa1e1f6
MD5 a52a6a9bfff3a0b18eeafd3a07441f74
BLAKE2b-256 78d9539a59ae0b9a0347a0419696a6df4d35f088ab10af59a0571b205551d32e

File details

Details for the file dedupe-1.6.7-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.7-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 dee24a41c4c376f09cd73fc45094e8db4d86a611f2fc5a1b015e4c5479635050
MD5 fdc243ff555e5b82389d1748a097268d
BLAKE2b-256 5bb31bdb5a1a78a4dbdf3dd025c8bfdd56037079bd73fcde2f1a783db7171f0b

File details

Details for the file dedupe-1.6.7-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.7-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 b34d132e816e96f93cf0b3fa47f1fd77194fc1cf5321df46281a47bc8c5f2ba8
MD5 8a753257636f15215678c0f9379cbfca
BLAKE2b-256 f0591e912dbece19d3caac7825e5d0bd4b1603fd4fa760981e5f411bbc8b56e0

File details

Details for the file dedupe-1.6.7-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.7-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 2f4a961bef4c16c18130771259b20b6fd9ed96f00fd0beb45e3d8fedfe108904
MD5 01afbd732a8bc9659c98b8bdaf0757ed
BLAKE2b-256 12ab54e107aecfb7b868c8925c4937f5e959e1aa627d0a1302cc3f8badba0057

File details

Details for the file dedupe-1.6.7-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.7-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 3c3beda02835749bbc345ce957dda6d9764e4a1de24c4aa870b35574f5eb824f
MD5 cf03902d460fe3cbe150eb081a49a5c3
BLAKE2b-256 4eccb223b1027f95afa93ea13c20982c54b3ff8016a2a6e33f1392ea989fd77a
