A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data. dedupe is the open source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
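To make the idea of "finding similar records" concrete, here is a minimal sketch that uses only the Python standard library. It is not dedupe's API or algorithm: dedupe learns its comparison rules from human-labeled examples, whereas this sketch assumes a fixed, hand-picked similarity threshold. The function names, the sample records, and the threshold value are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(rec_a, rec_b):
    """Average string similarity over the fields both records share."""
    fields = rec_a.keys() & rec_b.keys()
    scores = [
        SequenceMatcher(None, rec_a[f].lower(), rec_b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores) if scores else 0.0


def find_duplicates(records, threshold=0.8):
    """Return pairs of record ids whose similarity meets the threshold.

    The threshold is a hand-picked assumption; dedupe would instead
    learn how to weigh fields from labeled training pairs.
    """
    pairs = []
    for id_a, id_b in combinations(sorted(records), 2):
        if similarity(records[id_a], records[id_b]) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical sample data: records 1 and 2 describe the same person.
records = {
    1: {"name": "Jane Smith", "address": "123 Main Street"},
    2: {"name": "Jane Smyth", "address": "123 Main St"},
    3: {"name": "Robert Brown", "address": "9 Elm Avenue"},
}

print(find_duplicates(records))
```

Comparing every pair of records grows quadratically with the number of records, which is why a library aimed at very large databases has to narrow down the candidate pairs before comparing them.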


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dedupe-1.8.0.tar.gz (54.6 kB view details)

Uploaded Source

Built Distributions

dedupe-1.8.0-cp36-cp36m-manylinux1_x86_64.whl (78.6 kB view details)

Uploaded CPython 3.6m

dedupe-1.8.0-cp36-cp36m-manylinux1_i686.whl (74.9 kB view details)

Uploaded CPython 3.6m

dedupe-1.8.0-cp36-cp36m-macosx_10_11_x86_64.whl (52.3 kB view details)

Uploaded CPython 3.6m macOS 10.11+ x86-64

dedupe-1.8.0-cp35-cp35m-manylinux1_x86_64.whl (78.4 kB view details)

Uploaded CPython 3.5m

dedupe-1.8.0-cp35-cp35m-manylinux1_i686.whl (74.7 kB view details)

Uploaded CPython 3.5m

dedupe-1.8.0-cp34-cp34m-win_amd64.whl (53.0 kB view details)

Uploaded CPython 3.4m Windows x86-64

dedupe-1.8.0-cp34-cp34m-win32.whl (52.3 kB view details)

Uploaded CPython 3.4m Windows x86

dedupe-1.8.0-cp34-cp34m-manylinux1_x86_64.whl (78.5 kB view details)

Uploaded CPython 3.4m

dedupe-1.8.0-cp34-cp34m-manylinux1_i686.whl (74.9 kB view details)

Uploaded CPython 3.4m

dedupe-1.8.0-cp27-cp27mu-manylinux1_x86_64.whl (75.8 kB view details)

Uploaded CPython 2.7mu

dedupe-1.8.0-cp27-cp27mu-manylinux1_i686.whl (72.8 kB view details)

Uploaded CPython 2.7mu

dedupe-1.8.0-cp27-cp27m-win_amd64.whl (52.9 kB view details)

Uploaded CPython 2.7m Windows x86-64

dedupe-1.8.0-cp27-cp27m-win32.whl (52.1 kB view details)

Uploaded CPython 2.7m Windows x86

dedupe-1.8.0-cp27-cp27m-manylinux1_x86_64.whl (75.7 kB view details)

Uploaded CPython 2.7m

dedupe-1.8.0-cp27-cp27m-manylinux1_i686.whl (72.8 kB view details)

Uploaded CPython 2.7m

dedupe-1.8.0-cp27-cp27m-macosx_10_11_x86_64.whl (51.8 kB view details)

Uploaded CPython 2.7m macOS 10.11+ x86-64
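If you are unsure which built distribution fits your platform, the file name itself encodes the answer. The sketch below splits a wheel file name into its components, following the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper `parse_wheel_filename` and the `WheelTags` type are hypothetical names introduced for illustration; in practice, pip performs this matching for you.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return WheelTags(*parts)


tags = parse_wheel_filename("dedupe-1.8.0-cp36-cp36m-manylinux1_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

For the file above, the tags indicate CPython 3.6, the `cp36m` ABI, and 64-bit Linux (`manylinux1_x86_64`).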

File details

Details for the file dedupe-1.8.0.tar.gz.

File metadata

  • Download URL: dedupe-1.8.0.tar.gz
  • Upload date:
  • Size: 54.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.8.0.tar.gz
Algorithm Hash digest
SHA256 8a6df126cd4d9a2cd8271a0eefc975ba983cae3d8a791aec0b979632aa360026
MD5 07b02508c37afa8e4ab05ca5fb17345c
BLAKE2b-256 8b5a19d465794317f94522a6f2b65d96bfeaf63efa9f8b172c7b08bd38e220a6

See more details on using hashes here.
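As an illustration of how a published digest can be used, the following sketch computes the SHA256 digest of a downloaded file with Python's `hashlib` and compares it with the value listed above. The helper names `sha256_of` and `matches_digest` and the local file path are assumptions; pip's hash-checking mode is the more usual way to enforce this.

```python
import hashlib
import hmac

# SHA256 digest listed above for dedupe-1.8.0.tar.gz.
EXPECTED_SHA256 = "8a6df126cd4d9a2cd8271a0eefc975ba983cae3d8a791aec0b979632aa360026"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Check whether the file's SHA256 digest equals the expected value."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Usage, assuming the archive was downloaded to the current directory:
# matches_digest("dedupe-1.8.0.tar.gz", EXPECTED_SHA256)
```

Of the three algorithms listed, SHA256 is the one to prefer for verification; MD5 is no longer considered collision-resistant.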

File details

Details for the file dedupe-1.8.0-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 12785352dc84807078cf2b3a687e79189d3c580e998033aa907087d9900571cd
MD5 c77138a7f1f48ae1bbaf9da4f55774ee
BLAKE2b-256 5ae47950abad14f912ed3a83aa3b800e124bcc388e59476f13561093ed002a90

File details

Details for the file dedupe-1.8.0-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.0-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 d22e23ad206f70dd18fd7565d2df3e2d838886eaac1809a4f207fdc47f55c3c3
MD5 ec997cb87f2043af73b34bfe32508cd9
BLAKE2b-256 a86548a0d477cabfb52f5d457a71470f16ecba4a3a0cf0117178df9b723b4870

File details

Details for the file dedupe-1.8.0-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.8.0-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 6d09ff02c2f2f66b3e04da286f603f848a64963435ab392b59d39a9ca5cdd3e6
MD5 39a505431d3d64322eeb4dffea4cf87b
BLAKE2b-256 94386daff6f59855bfc70fb320cdfe070447566080094bf498d85fc16537f689

File details

Details for the file dedupe-1.8.0-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.0-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 cce68e12c4c0f4bfde744165d54bfa812c35140c5a543849631e7312e9d184b4
MD5 4d7f9294cedc52633410907431ce127f
BLAKE2b-256 974dc01450f0da7e38b287325e1b30d46cba531850efe400b21044f27ad96b40

File details

Details for the file dedupe-1.8.0-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.0-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b1d703dccd57c832a67d7141d86080a6f0e05f5f456c34f7bf131043e94062dc
MD5 444cc9884760d2fd3d6fe93625a59f6a
BLAKE2b-256 2433d372d93dddbb4ae11f6dba56c7e439a22792830278e5480bf5a9613ff62b

File details

Details for the file dedupe-1.8.0-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.8.0-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 aaf7b79b42bfa7f2ba8d4f3a6af4052ca52a1ffbf2d7a323df32b6d8a74aa52d
MD5 42bad77aad20bdbe1d0d19bd32ee4a60
BLAKE2b-256 4560b8aaaaf48f75a7ac4a38d5baab88223ee766e189dee0924ad0f88f02347d

File details

Details for the file dedupe-1.8.0-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.8.0-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 6f72457c486c5a50eac1717e6f5cb94264fd004aadd5ff2977fb00aab5580497
MD5 ee16a25c1523b7dad0b8eb9313e1072a
BLAKE2b-256 2dbcc6fa78d9bac3874472335c859202b5caf9eb5557091870984fcfbf86a5c5

File details

Details for the file dedupe-1.8.0-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.0-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 908b0bb539f94ef8bcf0799e6d219145e5f6dc9cf722b064f9bd152c65f5d0f5
MD5 ea6018f17047e271896a6ed547ac62e9
BLAKE2b-256 9e58f7b08d3e0559e89cf61c8fef89b15444936cc328270a9c711685249fe9dd

File details

Details for the file dedupe-1.8.0-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.0-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 c8d1cf539682f881783fb2588d91bda00cbda8230f8bcf8d53c8131e923b0e87
MD5 d71379e3fef5dd8bdbef9becd8b315c8
BLAKE2b-256 90eba2b6eb43828f288cadffc29195aa6f0b5c017d311f572b4b0929f04131d3

File details

Details for the file dedupe-1.8.0-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.0-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 bbd081cb61c2320eb85cc40196010d7d2aea45e13317dc70b2c4ab150d8ddfa9
MD5 1fb7cabdb7b781ea99301309d9eeab06
BLAKE2b-256 8a1085d34208da926392067fcf413e9739b74c9cfed15a7686d44e1ba616aa50

File details

Details for the file dedupe-1.8.0-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.0-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 8e578c73cb7c0bc6b3548702c604f06ccfbc5a76d36a4b224551bd13c40b6082
MD5 792f7559837fc3a5eec5fc7c6984bfd7
BLAKE2b-256 3f2db03571fe7932d55bb3fc0f8e849777599a2d470a6b9389efdff8883998d8

File details

Details for the file dedupe-1.8.0-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.8.0-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 64519bd16227cab450165f6148470310117a7db3b4a90fb343a82dd8f33ec4f0
MD5 dbc24365724e07d236e24bb64be0328f
BLAKE2b-256 2f38372027480001cff8e5c60ecd2bc17208f32df6b8b173ea967d5d5569d0d7

File details

Details for the file dedupe-1.8.0-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.8.0-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 1124eea6756279d0df8356cca4dcbcb39e2f6deb966c7c8b53465fe9731f0083
MD5 eaa6242a1df74c7aeb39d7406649f708
BLAKE2b-256 1641b28ac77fbb89c32d340f4cf7b5284153194c1880286f0721f3cf5a256196

File details

Details for the file dedupe-1.8.0-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.0-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 465c13f75430ccb2bf3520e9fa63d7428b12b4913b9aa472938fe6c6e1aee6ae
MD5 4e105bf041db64bf41618654785f8f68
BLAKE2b-256 a459472da6951caf7ab87d934b992d28c656cc124b51c1b89ec014eff4fa2f75

File details

Details for the file dedupe-1.8.0-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.0-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 2427c939c814ccb00d5a353df755c3935377841732b7b819efcd942260a91487
MD5 5be747f2b8017bee243a0d25e80ff4ab
BLAKE2b-256 563bd27dcd6272e22902aa5f0956719588e7bacd2f6f13cb3d80a9a005fb3623

File details

Details for the file dedupe-1.8.0-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.8.0-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 cc8f7a33fc9ea9cd946cbd55522471435c18f190bfe87a49952b4c1d569690f2
MD5 3f853d5a8923b4b58cac4606e0116bdb
BLAKE2b-256 daace4305f86e0c9c112c45c9a191b7a9282429ac95575a3626a17c8341cfc72
