A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in training data labeled by a person and comes up with the best rules for your dataset to quickly and automatically find similar records, even in very large databases.
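To make the underlying problem concrete, here is a minimal sketch of fuzzy record matching using only the Python standard library. It is not dedupe's API and it does not learn anything: a fixed similarity threshold stands in for the rules dedupe would derive from training data. The sample records, the function names, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records, invented for this illustration.
records = {
    1: {"name": "Jane Smith", "address": "123 Main St"},
    2: {"name": "Jane Smyth", "address": "123 Main Street"},
    3: {"name": "Robert Jones", "address": "9 Elm Ave"},
}


def similarity(a, b):
    """Average string similarity (0.0 to 1.0) across the fields of two records."""
    scores = [
        SequenceMatcher(None, a[field].lower(), b[field].lower()).ratio()
        for field in a
    ]
    return sum(scores) / len(scores)


def likely_duplicates(records, threshold=0.8):
    """Return pairs of record ids whose average similarity meets the threshold."""
    return [
        (id_a, id_b)
        for (id_a, rec_a), (id_b, rec_b) in combinations(records.items(), 2)
        if similarity(rec_a, rec_b) >= threshold
    ]


print(likely_duplicates(records))
```

Comparing every pair of records, as this sketch does, grows quadratically with the number of records, which is one reason a purpose-built library is preferable for large databases.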

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.6.15.tar.gz (48.7 kB), Source

Built Distributions

  • dedupe-1.6.15-cp36-cp36m-manylinux1_x86_64.whl (74.5 kB), CPython 3.6m
  • dedupe-1.6.15-cp36-cp36m-manylinux1_i686.whl (71.2 kB), CPython 3.6m
  • dedupe-1.6.15-cp36-cp36m-macosx_10_11_x86_64.whl (50.2 kB), CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.6.15-cp35-cp35m-manylinux1_x86_64.whl (74.3 kB), CPython 3.5m
  • dedupe-1.6.15-cp35-cp35m-manylinux1_i686.whl (71.0 kB), CPython 3.5m
  • dedupe-1.6.15-cp34-cp34m-win_amd64.whl (50.9 kB), CPython 3.4m, Windows x86-64
  • dedupe-1.6.15-cp34-cp34m-win32.whl (50.2 kB), CPython 3.4m, Windows x86
  • dedupe-1.6.15-cp34-cp34m-manylinux1_x86_64.whl (74.5 kB), CPython 3.4m
  • dedupe-1.6.15-cp34-cp34m-manylinux1_i686.whl (71.2 kB), CPython 3.4m
  • dedupe-1.6.15-cp27-cp27mu-manylinux1_x86_64.whl (72.2 kB), CPython 2.7mu
  • dedupe-1.6.15-cp27-cp27mu-manylinux1_i686.whl (69.5 kB), CPython 2.7mu
  • dedupe-1.6.15-cp27-cp27m-win_amd64.whl (51.0 kB), CPython 2.7m, Windows x86-64
  • dedupe-1.6.15-cp27-cp27m-win32.whl (50.2 kB), CPython 2.7m, Windows x86
  • dedupe-1.6.15-cp27-cp27m-manylinux1_x86_64.whl (72.1 kB), CPython 2.7m
  • dedupe-1.6.15-cp27-cp27m-manylinux1_i686.whl (69.5 kB), CPython 2.7m
  • dedupe-1.6.15-cp27-cp27m-macosx_10_11_x86_64.whl (49.8 kB), CPython 2.7m, macOS 10.11+ x86-64

File details

Details for the file dedupe-1.6.15.tar.gz.

File metadata

  • Download URL: dedupe-1.6.15.tar.gz
  • Upload date:
  • Size: 48.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.15.tar.gz
Algorithm Hash digest
SHA256 3572d933cc5e2b7c98a2978bfaad25d6c0aa604b9723d5ac84eb7b85898bbac3
MD5 358b89b9457c8f8cf6e4520f84b91f9c
BLAKE2b-256 ea39b54047baa9a4a6204188ae088839888216b6203d771c6f1406a422edb71f

See more details on using hashes here.
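As an illustration of how a published digest can be used, the following sketch computes the SHA256 of a downloaded file and compares it with the value listed above for dedupe-1.6.15.tar.gz. The function names and the assumption that the archive sits in the current directory are hypothetical; only the digest itself comes from this page.

```python
import hashlib
import os

# SHA256 digest listed on this page for dedupe-1.6.15.tar.gz.
EXPECTED_SHA256 = "3572d933cc5e2b7c98a2978bfaad25d6c0aa604b9723d5ac84eb7b85898bbac3"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """True if the file's SHA256 equals the published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location of the download; adjust to wherever the file was saved.
    archive = "dedupe-1.6.15.tar.gz"
    if os.path.exists(archive):
        print("hash matches:", matches_published_hash(archive, EXPECTED_SHA256))
    else:
        print("archive not found:", archive)
```

The same check applies to any of the wheels by swapping in the file name and the SHA256 value from its own section.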

File details

Details for the file dedupe-1.6.15-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.15-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0eb2551b860d2d83c9cf89f1a3cad13f4160b1f196a379ff775f60435115cff8
MD5 205d71015a63819377335dd427566f1e
BLAKE2b-256 3d6ebf7deda8b68a10ee5422aa4ebb88837fd8c7f700a972b12d5499811086ed

File details

Details for the file dedupe-1.6.15-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.15-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 02058d03addc2cee4430d5a0601b415af1b6fc59cccb22407a9a5496ef2165b7
MD5 c307a3f0c2b79a6918b3ed2e7c14e055
BLAKE2b-256 685b2250335533d146bf327a3470f24c5e5465864299c678f655f1d909ec6a28

File details

Details for the file dedupe-1.6.15-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.15-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 d88d4d5b5e353822acd47aa2d70410ca1c4a3a302c9a3dcc9b57349abf924d36
MD5 8eee52def2c121e9dae5c01768f137e2
BLAKE2b-256 1c56610a1d69fc2da44c9fdc159c7007ed7d949769ec8dbd434ad4985e1201c2

File details

Details for the file dedupe-1.6.15-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.15-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0240f5567589d14970b74eaaf100d8cbb52d89e9c6724d4988c596a9ed5004b7
MD5 c70d2bbaa7ddb357f0be0a84021db604
BLAKE2b-256 b33006616a7378b95b3f8b4c7d876fdb9d8e18a08660cf784bfbdb9f78a5f1b7

File details

Details for the file dedupe-1.6.15-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.15-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 fcfae280a9508ec6526999dab17f16ff9b9cf09d964ff1d13767a7e93afa5d59
MD5 7bf555aaa024a0ad39e98e9dd1039448
BLAKE2b-256 ed4000ecc93cbdb9ef10335419f1a2fe0901e5512c5849ea2dc4a3ab7cebcbcd

File details

Details for the file dedupe-1.6.15-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.15-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 258d426c990be3375c7a41d700c919e8e821e7f1aafc8e0548f2447e6d7e02d3
MD5 2d8343712fd79cbceb8864f32aca354f
BLAKE2b-256 6a6c9c7f51b9122fabe6a567c9430b7c003019dcdfd3dcdcb4e2a8024c1fa3b8

File details

Details for the file dedupe-1.6.15-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.15-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 7214a091138225edf21101f9e02beaf188e3c1a00645bf01702268884114c5ed
MD5 d9949c32ec6b5321a40c721784575c95
BLAKE2b-256 c01a9ab78af7c3009cc2b6c437d81e87372ecff1db119c1fbdaafbae8b308cc2

File details

Details for the file dedupe-1.6.15-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.15-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 b94f808b9f81193ae08805a64050498c3dbd86ed93b9193f4284c47b71af4b33
MD5 fc0218983fe9a86ba22f21edd2c2a93c
BLAKE2b-256 3415658eea22700faefdaaa39dcb2d00bb008ec7f26d382bf17e0de0f6c9dbe9

File details

Details for the file dedupe-1.6.15-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.15-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 2b961b8daaafebd30ba9347d3512493b01e374776aa8a9e1747fbc94d732e62a
MD5 be6572b36c4aa56d3f5647b6572550b2
BLAKE2b-256 6b028e79d03162528893703816f7db3b4ab69173cd60cb0c363429c6737e47ad

File details

Details for the file dedupe-1.6.15-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.15-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 67649d1d84d10565986a14da163fa2549637ddce364dad48ec1ef67d5e8421ba
MD5 27fde926b2c82fcc75cd1885e2e664f5
BLAKE2b-256 8f89b61d071197091a2e0b1970c6ca85d80d882ca514ddb6904adebb96bf425a

File details

Details for the file dedupe-1.6.15-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.15-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 458611d14d421a5de087eca0d1b1d53fda89a9a1b896a77b91525ac7ebff1321
MD5 6d78e0a89dadd6f27c90ce68053e09f6
BLAKE2b-256 6dc607ca2aab55b9cd1924393ab05b9e0fce21a79831c0c04487c6d66a8e2498

File details

Details for the file dedupe-1.6.15-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.15-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 fa9c4812069256424ff20ed22bc172c410f4b5817531890b6864c5a14c327936
MD5 697750cc4e9101920cb5aa681e9cd76a
BLAKE2b-256 2e202828901ca2b37e19eb1c0c705980f9a2e1a838978a5d87e4e411083d73b5

File details

Details for the file dedupe-1.6.15-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.15-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 4a037f3d839acf0adfa55491ff558d7de26d1398d5999f33fc49a38bcb3596f8
MD5 70d4a9b1844422543e73a224c091a3ea
BLAKE2b-256 3a5ba1c3a0564e35736e079c3cfcf3871e960491cbacf87d092c12faed875447

File details

Details for the file dedupe-1.6.15-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.15-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 3792da77bd61e2942ba6d57de114927c89f74c63fd4c1857efa937db41ec1099
MD5 db0741fc0550144972cc974a27f50127
BLAKE2b-256 4cdd84e40c13315c2910a2eda748b2220bca42a77b872a67b6b5b75c134315f0

File details

Details for the file dedupe-1.6.15-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.15-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 069bca91e925c36d1649096367dd4c8689f126eb34fe954fa3e5f03df9d16e2c
MD5 b41bdc4fd9bf04328c4dc38a626e2ed1
BLAKE2b-256 7bb9d71e4783e0f570a6bf20fe99d9ea8ca4f77262a1068f9f775f44ed5fe41a

File details

Details for the file dedupe-1.6.15-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.15-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 19c7e382320c228ac430d8aa2d9c0764c8aff8fcbca23ff132350785421d8fd3
MD5 ab1bf2ecd35aabc79a5482cba9143674
BLAKE2b-256 d5a5ed6393f195f84d99ff134984f265c24c0feee3af0dc5f70e0b33deba30be
