A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
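To make the idea of "similar records" concrete, here is a toy sketch that uses only the Python standard library. It is not dedupe's API or its algorithm: dedupe learns its comparison rules from labeled training data, whereas this sketch hard-codes a string-similarity threshold. The function names, the sample records, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def candidate_duplicates(records, fields, threshold=0.85):
    """Yield (id_a, id_b, score) for pairs whose mean field similarity
    meets the threshold. Compares every pair, so it only suits small inputs."""
    for (id_a, rec_a), (id_b, rec_b) in combinations(sorted(records.items()), 2):
        score = sum(similarity(rec_a[f], rec_b[f]) for f in fields) / len(fields)
        if score >= threshold:
            yield id_a, id_b, score


# Hypothetical sample data: records 1 and 2 describe the same person.
records = {
    1: {"name": "Jon Smith", "address": "12 Main St"},
    2: {"name": "John Smith", "address": "12 Main Street"},
    3: {"name": "Alice Jones", "address": "98 Oak Ave"},
}

pairs = list(candidate_duplicates(records, ["name", "address"]))
for id_a, id_b, score in pairs:
    print(id_a, id_b, round(score, 3))
```

Comparing every pair grows quadratically with the number of records, which is one reason a learned approach that narrows down candidate pairs matters for very large databases.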

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.6.14.tar.gz (48.6 kB): Source

Built Distributions

  • dedupe-1.6.14-cp36-cp36m-manylinux1_x86_64.whl (74.5 kB): CPython 3.6m
  • dedupe-1.6.14-cp36-cp36m-manylinux1_i686.whl (71.2 kB): CPython 3.6m
  • dedupe-1.6.14-cp36-cp36m-macosx_10_11_x86_64.whl (50.1 kB): CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.6.14-cp35-cp35m-manylinux1_x86_64.whl (74.3 kB): CPython 3.5m
  • dedupe-1.6.14-cp35-cp35m-manylinux1_i686.whl (71.0 kB): CPython 3.5m
  • dedupe-1.6.14-cp34-cp34m-win_amd64.whl (50.9 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.6.14-cp34-cp34m-win32.whl (50.1 kB): CPython 3.4m, Windows x86
  • dedupe-1.6.14-cp34-cp34m-manylinux1_x86_64.whl (74.4 kB): CPython 3.4m
  • dedupe-1.6.14-cp34-cp34m-manylinux1_i686.whl (71.2 kB): CPython 3.4m
  • dedupe-1.6.14-cp27-cp27mu-manylinux1_x86_64.whl (72.2 kB): CPython 2.7mu
  • dedupe-1.6.14-cp27-cp27mu-manylinux1_i686.whl (69.5 kB): CPython 2.7mu
  • dedupe-1.6.14-cp27-cp27m-win_amd64.whl (51.0 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.6.14-cp27-cp27m-win32.whl (50.1 kB): CPython 2.7m, Windows x86
  • dedupe-1.6.14-cp27-cp27m-manylinux1_x86_64.whl (72.1 kB): CPython 2.7m
  • dedupe-1.6.14-cp27-cp27m-manylinux1_i686.whl (69.5 kB): CPython 2.7m
  • dedupe-1.6.14-cp27-cp27m-macosx_10_11_x86_64.whl (49.8 kB): CPython 2.7m, macOS 10.11+ x86-64
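The wheel filenames listed here encode which interpreter and platform each file targets. As a rough sketch, the parts can be pulled out by splitting on hyphens, following the wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper name `parse_wheel_filename` is hypothetical, and the sketch does not handle compressed tag sets or validate the individual tags.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its hyphen-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: %r" % filename)

    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components in %r" % filename)

    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_filename("dedupe-1.6.14-cp36-cp36m-manylinux1_x86_64.whl")
print(info["python_tag"], info["abi_tag"], info["platform_tag"])
```

In practice pip performs this matching itself and picks a compatible wheel, falling back to the source distribution when none fits.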

File details

Details for the file dedupe-1.6.14.tar.gz.

File metadata

  • Download URL: dedupe-1.6.14.tar.gz
  • Upload date:
  • Size: 48.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.14.tar.gz
Algorithm Hash digest
SHA256 5ebb16714c8829eaa61e01293d9ca96c2004362117b98c7f7e57ee61d8a309a9
MD5 c0ec73ddb93f296e74172f636c21f207
BLAKE2b-256 28d6fab88ffafa9aa014080b0bfcac9f8d400b0e1d34d252ecb5b7981f536116

See more details on using hashes here.
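One way to use the published digests is to recompute the SHA256 of a downloaded file and compare it with the value listed for that file. The sketch below uses only `hashlib` and `hmac` from the standard library. The helper names and the local path in the usage comment are assumptions, and the file is read in chunks so large archives do not need to fit in memory.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# Usage, assuming the sdist was saved to the current directory:
# matches_published_hash(
#     "dedupe-1.6.14.tar.gz",
#     "5ebb16714c8829eaa61e01293d9ca96c2004362117b98c7f7e57ee61d8a309a9",
# )
```

SHA256 is the digest to prefer for this check; MD5 is listed for completeness but is not collision-resistant.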

File details

Details for the file dedupe-1.6.14-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.14-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 8c902dadc4d1b707d6c1ba799e5bb0a08dacac3409fc57ae1e9e1b49b2cf75fc
MD5 5d6bda7601a93a2fba3413f7d5f9c76b
BLAKE2b-256 2e074555259e53e46fc3c19bb30b3c1769e91d1de288d8f1d9019c5589bb3f46

File details

Details for the file dedupe-1.6.14-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.14-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 1ca1589cd5680d6ab85958175fa2e36c4f64c848dcd42d84dbd620c54d13835d
MD5 af64770837135898d79ad5605aec30e9
BLAKE2b-256 b54f7315db9c63822928369c3018c337c711f187ec5de31583ee2eb466f38811

File details

Details for the file dedupe-1.6.14-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.14-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 0b62afc28d069a47f8570808eaaa7ac203d57c3638879f4f2f708fb453eeed46
MD5 cda191db0c6e78f120653262fd7e719a
BLAKE2b-256 aac4c9d9de82927152444dc44c6937b67278db71f6427485e7187697de02544d

File details

Details for the file dedupe-1.6.14-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.14-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 22ce1471f0e051c6c96ed825a36dfb137f5550d268d2f7396c9f9b28e07d0173
MD5 0dc1b824dfc352dd6679c19f8efc53b1
BLAKE2b-256 4b241fdc8e78c727038b91d86f14c792ce2cfcde3635286cdcd751b1c90f4d83

File details

Details for the file dedupe-1.6.14-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.14-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 fbad47094802c86c6d9a28788ec9beb396d5a6fd3260dafbaf1cc50173351a84
MD5 0d8bd4ff92bce9f7f2658fea78f58058
BLAKE2b-256 a72dd40c20378cb6759af1eb4f8dab5304fdf1f986e4d3a2a7e4ccd32965cc2b

File details

Details for the file dedupe-1.6.14-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.14-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 101cf3f16f65ea10b4f4e1c2f5cde35e10e7662e9026b2c596c696959c767626
MD5 db70d0b2118c14abf9a9986e8f48acdc
BLAKE2b-256 e82d74eb9bf61ac3df84f9eb4d0058d99304ebbb8f9b3f6c76ea9f6be183ef00

File details

Details for the file dedupe-1.6.14-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.14-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 22a8589b319e9899cdf44240dbf12193cd14d40f1ff1c0bb741f7ee4bda5a3e6
MD5 02bef0a3a0bd8f3ccedc70ba11d2f498
BLAKE2b-256 817375054322a37ba670bda52a9ffcd820e924936f15621a96d40df91750fac7

File details

Details for the file dedupe-1.6.14-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.14-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c1b98049ffc04ecf408265e5fb3ac419d134da9c9767cf49b911822c140ea398
MD5 cf699794531d61eed130917823f725a2
BLAKE2b-256 9ba2506ee217b1b810178eee73b520a86c0776161216d7a6559195a0a7eca61a

File details

Details for the file dedupe-1.6.14-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.14-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 dd62f5fc81951d53c3de6e45da4caf918a96ef4354b0f19724ac90aed04f85f0
MD5 0a637db898273f3f77d0588bfafce499
BLAKE2b-256 c1a2f8ab0e044361c0081b56ea1213209fc3ae054d0d1132653d2004c41b7ba0

File details

Details for the file dedupe-1.6.14-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.14-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 80ce0a691f2d210d4ddf50a3212f09dd5098e0d347f9bef7febb5aef939b1c77
MD5 965568d6df1a4faa9c89992708ed353d
BLAKE2b-256 4f20ac4021795b4b379069363af6cdd8ee131e501824d9244bd746e9afcc4e84

File details

Details for the file dedupe-1.6.14-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.14-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 e7b9aaf5070870f7fe7f273e67981ce0a891aceef6c212291b517503406f7733
MD5 9217594e2aa91ad08633383e3189fcc2
BLAKE2b-256 e7d322309b8ba0035034958f4be7714d93bef33f0926e30c26c2be1c4f4fa914

File details

Details for the file dedupe-1.6.14-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.14-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 1859715cf755d8716754e6588c0e565958e4dcf281880ca157049ddb0399dc5f
MD5 142b1869704d006eedc7baccc690b573
BLAKE2b-256 1c3130241da7368b3767cad8db0d1519610abadbfaac5a2a26991761a72cef03

File details

Details for the file dedupe-1.6.14-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.14-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 2c7a5e3da15460c13c10790d0d2c8949c41e1858692e839042e26ad6ba3035c1
MD5 87f2e3bf6c994ad68ef02403a1de1c11
BLAKE2b-256 fb5756c3c014aa3102fc9fe5a861e482baedb24176cccdaad1faab34a79a99b4

File details

Details for the file dedupe-1.6.14-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.14-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 62a5012dd35d35305706d4e2b8b58b36f6ea210373481ed554d0e22aec4c96fa
MD5 82bd422ba08facc3dfc03e7ada94938f
BLAKE2b-256 9f5e442a4178649ce0fe935f5d7dc20204d8ed624d5fcbbdabf3a10e079c01c9

File details

Details for the file dedupe-1.6.14-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.14-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 112d448c44b957811d7a8913c3c3c27a47deaaf5dacdaf661c832f651b4166e3
MD5 8b4e6aed336e7bbb8e47a80277809207
BLAKE2b-256 0a7a685140ea4eb24409ec4af63fabc62865d5a15abec6cf4cff433a108b5e5b

File details

Details for the file dedupe-1.6.14-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.14-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 ceb5687f3ac7faa6106f77bd12d7e21b64d5cd2867487e5ba7e2d4a53928fc0a
MD5 41cbf07d9ee27a52dd8c34dd52368dcc
BLAKE2b-256 827d95de5ef552d542424023c755592ba02d667f980b04ab4493d0380f194e83
