A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data and derives the best matching rules for your dataset, so it can find similar records quickly and automatically, even in very large databases.
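To make the idea of "finding similar records" concrete, here is a toy sketch that uses only the Python standard library. It is not dedupe's algorithm or API: dedupe learns its comparison rules from labeled examples, whereas this sketch hard-codes an averaged `difflib` string similarity and a fixed threshold. The function names, the sample records, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def record_similarity(a, b, fields):
    """Average per-field string similarity in [0, 1] (illustrative only)."""
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def cluster_duplicates(records, fields, threshold=0.8):
    """Group record indices whose pairwise similarity meets the threshold."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair; this is quadratic and only suitable for tiny inputs.
    for i, j in combinations(range(len(records)), 2):
        if record_similarity(records[i], records[j], fields) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical sample data: the first two rows describe the same person.
records = [
    {"name": "Jane Smith", "address": "12 Oak St"},
    {"name": "Jane Smyth", "address": "12 Oak St"},
    {"name": "Robert Jones", "address": "900 Pine Ave"},
]

print(cluster_duplicates(records, ["name", "address"]))
```

Comparing every pair of records does not scale to large datasets, and a hand-picked threshold rarely transfers between datasets, which is the gap a learned approach is meant to fill.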


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.7.1.tar.gz (48.8 kB), Source

Built Distributions

  • dedupe-1.7.1-cp36-cp36m-manylinux1_x86_64.whl (74.7 kB), CPython 3.6m
  • dedupe-1.7.1-cp36-cp36m-manylinux1_i686.whl (71.4 kB), CPython 3.6m
  • dedupe-1.7.1-cp36-cp36m-macosx_10_11_x86_64.whl (50.3 kB), CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.7.1-cp35-cp35m-manylinux1_x86_64.whl (74.5 kB), CPython 3.5m
  • dedupe-1.7.1-cp35-cp35m-manylinux1_i686.whl (71.2 kB), CPython 3.5m
  • dedupe-1.7.1-cp34-cp34m-win_amd64.whl (51.1 kB), CPython 3.4m, Windows x86-64
  • dedupe-1.7.1-cp34-cp34m-win32.whl (50.3 kB), CPython 3.4m, Windows x86
  • dedupe-1.7.1-cp34-cp34m-manylinux1_x86_64.whl (74.6 kB), CPython 3.4m
  • dedupe-1.7.1-cp34-cp34m-manylinux1_i686.whl (71.4 kB), CPython 3.4m
  • dedupe-1.7.1-cp27-cp27mu-manylinux1_x86_64.whl (72.4 kB), CPython 2.7mu
  • dedupe-1.7.1-cp27-cp27mu-manylinux1_i686.whl (69.7 kB), CPython 2.7mu
  • dedupe-1.7.1-cp27-cp27m-win_amd64.whl (51.2 kB), CPython 2.7m, Windows x86-64
  • dedupe-1.7.1-cp27-cp27m-win32.whl (50.3 kB), CPython 2.7m, Windows x86
  • dedupe-1.7.1-cp27-cp27m-manylinux1_x86_64.whl (72.3 kB), CPython 2.7m
  • dedupe-1.7.1-cp27-cp27m-manylinux1_i686.whl (69.7 kB), CPython 2.7m
  • dedupe-1.7.1-cp27-cp27m-macosx_10_11_x86_64.whl (50.0 kB), CPython 2.7m, macOS 10.11+ x86-64

File details

Details for the file dedupe-1.7.1.tar.gz.

File metadata

  • Download URL: dedupe-1.7.1.tar.gz
  • Upload date:
  • Size: 48.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.1.tar.gz
Algorithm Hash digest
SHA256 a1306a12411f74c5ff3f034e44a16bd0f9dfcb1bc8a1596f7ccff8f2d1d18550
MD5 6bd7e9d1d33501bc4498f8ccac8c71c1
BLAKE2b-256 4dac16e96d446c8246692dbc3ef2ad990db16130b8f0243e4605a76bbc5be62a

See more details on using hashes here.

File details

Details for the file dedupe-1.7.1-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.1-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 2dea7a1d7c8954fd437f61d687668e8c9e4431cc5495037b94bf1cc4ac2441ee
MD5 a189a7234eeaedacf30d9e368ba09eac
BLAKE2b-256 53827f8a435a5942cc81c485230da47db4a608e13148db006846e9088e401f04

File details

Details for the file dedupe-1.7.1-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.1-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 98f1b943a16ae603f58468bca431af7833958d00f662a7ff0b89ecf2f91a71e3
MD5 d4d7b7995adde65ac2e8dfc33bd3f36d
BLAKE2b-256 9ac5801b8c14dc074a7f327ff7b1d2d7ee3f340d3311bf1cb08913200a7de9f0

File details

Details for the file dedupe-1.7.1-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.1-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 9717fa0a078e104fa7b06078f143997ad861d9f494671a4e2bb23a403942b702
MD5 9f01231ded20ed5450ddd702ea91059d
BLAKE2b-256 641963c797d36d640e11f9f6e3a126658d1ade2589655c7bf3a438815b4596b1

File details

Details for the file dedupe-1.7.1-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.1-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 57f141365091dd4f9165f362436c31f4e75110d00ff35e3de051520555b86f20
MD5 49524df36dbd9def3fff9496d5edfbb8
BLAKE2b-256 2a51ff45316e7e5933fa43ebf544340e3e4769199ab16bc132fa651494b44069

File details

Details for the file dedupe-1.7.1-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.1-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 7b8d97cb50d674d6f65143f6fe925097697eb56ab611c38a0d29ab171913da9c
MD5 494a76a060c928af9aac3901f5b3dae9
BLAKE2b-256 a17cb1646fa034b889882e4b66358c7315a880d24924f141075d4bf382d46feb

File details

Details for the file dedupe-1.7.1-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.1-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 b1b9aa665aa1bc4e1c5f0894df0b0beb9e137bd4f543aae56957c15a04f7a1a2
MD5 dc1144af95ce1225b9bb3da6e18bd50c
BLAKE2b-256 201a61fdb5d3ffa9d41be41ec17831979711ca57490211d5157fe71c4ad5cba9

File details

Details for the file dedupe-1.7.1-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.1-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 dd7b0f10fab2a6e0b22dce7d540839549afb28942912f097187396871f2afd68
MD5 912c3c582688e9aeb017251eab869263
BLAKE2b-256 2f4392822a50b6ac42eb30e5cbd19acb452cf826ae1cab9c53bbbb27f4dbccb2

File details

Details for the file dedupe-1.7.1-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.1-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 05a1d63c14f09b6f96e236db0ee79ca923479d3cff8693e8d35113dd75b6c0c7
MD5 0d22b745c40e468756143e13e8abc323
BLAKE2b-256 08ede5c011d667d35a5658672886e6254b93c91fff651ae67b261190537eb56b

File details

Details for the file dedupe-1.7.1-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.1-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 857e646e03d29d104fb44a79005b6b5107f27d36124eab7039c9aa559d308788
MD5 0a42c8fac08254f13e6a61776ca50c8e
BLAKE2b-256 21cd0b4cc9651dac9ded3ae9afd67188483a10d9ae0e58a6d14bce20ec3c9fd1

File details

Details for the file dedupe-1.7.1-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.1-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6b62e1ace77bbb296be5c2e672c7739754c42ff871607ff9a332fd2cf1122155
MD5 e09e618c6ce4d2b314eb98116dd5a5cf
BLAKE2b-256 af7d72c5b49f8f86f0ed35a5478b4f3c6938db5b4d675e470d44873f49f1c64f

File details

Details for the file dedupe-1.7.1-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.1-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 a7a36336152556b9e52c407f1630e0f87cbd32a8b644dde953f9055298f7e68b
MD5 e3e55f14eff08c38a7f97f27fac4ac1d
BLAKE2b-256 31c2322133eb179d24532d5a53744de3e9995c7190a40a5d85544f4953f384f4

File details

Details for the file dedupe-1.7.1-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.1-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 1b74daacc552ec0ef68ee13a7ddba9966ad42530628ca331435f7490e6e88691
MD5 e7109d069ffbab8a7791c1c99e30fc35
BLAKE2b-256 ab44e28b5ee9358322523c90dda9f6f4f38182abadbe8612a5daf50140eba24a

File details

Details for the file dedupe-1.7.1-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.1-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 82c58b47e132108e7a5661483942a3652a7f19e4167a29d8d08f04d14d8791a3
MD5 4fbc4a2db42ddbaf18e888956e57b50f
BLAKE2b-256 22391f51904bd313d7a263d68713681a02e9953e412030b247ae1e20c765010c

File details

Details for the file dedupe-1.7.1-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.1-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 563c505fedd99a65eda68af36a097c9c7d84294445527637ad80972e49e39a20
MD5 c255c7faec404deca5ec8e206ee9ad84
BLAKE2b-256 cf80edd95bb029a88606bdd850d5cf5203e228963755e771fa4a5432daefa0a1

File details

Details for the file dedupe-1.7.1-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.1-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 21f9ef61d1d89e875df34ac1d1cfb8dfa4228cf6ea9760b104c9e5a71ea8ce2b
MD5 8c311a74bf37ccbcc59670a6e5bf197e
BLAKE2b-256 52a8da1e3ff4350900e482f8a523340be00b1149465f8916886c7903cc111535

File details

Details for the file dedupe-1.7.1-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.1-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 e23882dfb266093b25b4a2b2e6f1d9ce0441f2d2fb4e4f9160c6249a32cb67aa
MD5 128731c0e2aa1fde10a2504ed9c615c4
BLAKE2b-256 66222a8ddcb83eeb34a8530f3eff406779eb4d84e3968bcbac125120c57fd4eb
