A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data and derives the best matching rules for your dataset, so it can quickly and automatically find similar records, even in very large databases.
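To make the idea of "similar records" concrete, here is a minimal sketch that uses only the Python standard library. It is not dedupe's algorithm or API: dedupe learns its rules from labeled examples, whereas this sketch compares every pair of records with a hand-picked threshold. The names `normalize`, `similarity`, `likely_duplicates`, the field names, and the value 0.85 are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(value):
    """Lowercase the value and collapse runs of whitespace."""
    return " ".join(value.lower().split())


def similarity(left, right):
    """Return a 0..1 string similarity score after normalization."""
    return SequenceMatcher(None, normalize(left), normalize(right)).ratio()


def likely_duplicates(records, fields, threshold=0.85):
    """Compare every pair of records and keep pairs scoring at or above threshold.

    The score is the plain average of per-field similarities. The threshold
    is a hypothetical, hand-picked value, not something learned from data.
    """
    pairs = []
    for (i, left), (j, right) in combinations(enumerate(records), 2):
        score = sum(similarity(left[f], right[f]) for f in fields) / len(fields)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [
        {"name": "John Smith", "address": "123 Main St"},
        {"name": "Jon Smith", "address": "123 main st."},
        {"name": "Mary Jones", "address": "9 Oak Avenue"},
    ]
    for i, j, score in likely_duplicates(sample, ["name", "address"]):
        print(f"records {i} and {j} look alike (score {score:.2f})")
```

Comparing every pair grows quadratically with the number of records, so this approach only suits small lists; it is meant to show the kind of fuzzy matching involved, not how to do it at scale.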

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.6.8.tar.gz (47.5 kB): Source

Built Distributions

  • dedupe-1.6.8-cp36-cp36m-manylinux1_x86_64.whl (74.1 kB): CPython 3.6m
  • dedupe-1.6.8-cp36-cp36m-manylinux1_i686.whl (70.8 kB): CPython 3.6m
  • dedupe-1.6.8-cp36-cp36m-macosx_10_11_x86_64.whl (49.8 kB): CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.6.8-cp35-cp35m-manylinux1_x86_64.whl (73.9 kB): CPython 3.5m
  • dedupe-1.6.8-cp35-cp35m-manylinux1_i686.whl (70.6 kB): CPython 3.5m
  • dedupe-1.6.8-cp34-cp34m-win_amd64.whl (50.5 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.6.8-cp34-cp34m-win32.whl (49.8 kB): CPython 3.4m, Windows x86
  • dedupe-1.6.8-cp34-cp34m-manylinux1_x86_64.whl (74.0 kB): CPython 3.4m
  • dedupe-1.6.8-cp34-cp34m-manylinux1_i686.whl (70.8 kB): CPython 3.4m
  • dedupe-1.6.8-cp27-cp27mu-manylinux1_x86_64.whl (71.8 kB): CPython 2.7mu
  • dedupe-1.6.8-cp27-cp27mu-manylinux1_i686.whl (69.1 kB): CPython 2.7mu
  • dedupe-1.6.8-cp27-cp27m-win_amd64.whl (50.6 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.6.8-cp27-cp27m-win32.whl (49.7 kB): CPython 2.7m, Windows x86
  • dedupe-1.6.8-cp27-cp27m-manylinux1_x86_64.whl (71.7 kB): CPython 2.7m
  • dedupe-1.6.8-cp27-cp27m-manylinux1_i686.whl (69.1 kB): CPython 2.7m
  • dedupe-1.6.8-cp27-cp27m-macosx_10_11_x86_64.whl (49.4 kB): CPython 2.7m, macOS 10.11+ x86-64

File details

Details for the file dedupe-1.6.8.tar.gz.

File metadata

  • Download URL: dedupe-1.6.8.tar.gz
  • Upload date:
  • Size: 47.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.8.tar.gz
Algorithm Hash digest
SHA256 52df1fe73ef32211379be33bd72ed4d169c881d61eb3121dc52c8dbf24ec2e51
MD5 2b6292db3aa6258d9b8df64334956b0e
BLAKE2b-256 c82322cbd4b0afa7aef943ee57c10f70b1f2e6cf3873c40b43f2379cdfcc5b43

See more details on using hashes here.
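As one way to use the digests listed here, the sketch below checks a downloaded file against its published SHA256 value using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_expected` and the local file path are assumptions for illustration; the expected digest is the SHA256 value listed for dedupe-1.6.8.tar.gz.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest listed for dedupe-1.6.8.tar.gz on this page.
EXPECTED_SHA256 = "52df1fe73ef32211379be33bd72ed4d169c881d61eb3121dc52c8dbf24ec2e51"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with the expected hex string, ignoring case."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


if __name__ == "__main__":
    # Hypothetical location: assumes the archive was saved to the current directory.
    archive = Path("dedupe-1.6.8.tar.gz")
    if archive.exists():
        print("hash OK" if matches_expected(archive, EXPECTED_SHA256) else "hash MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same check works for any of the wheel files by swapping in the file name and its SHA256 digest.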

File details

Details for the file dedupe-1.6.8-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.8-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 ea84dc982a8e92f16de40179731aec45b2ee97c573dce43e6caeb54523073311
MD5 ba617f73963cdd307dd08da48f1061a7
BLAKE2b-256 0a351ad31899715e3e2e10001b62f4a7f6dcb559059145996f4838a98d25d281

File details

Details for the file dedupe-1.6.8-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.8-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 14f7cb10f74f711c0c19c9e9118291de9ecfd8508f82586816946fad5f52a21f
MD5 d0ea8ed608f5883f5d5e6e1c7a8a527b
BLAKE2b-256 d9b5238d97bf14b71fbce71ce6a6ea458f29bdc3009770fbd227ef59095734b2

File details

Details for the file dedupe-1.6.8-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.8-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 a3ddca28e5306f929480f41fb1709bcba97278657952fdb2e419900fb5848f56
MD5 7f40184a0b2594a15ccb72253db27e0c
BLAKE2b-256 f6ff744b4b2cc8d25d521ed4e6889786cd8b98ab553e12dfa6f30b5c55b48076

File details

Details for the file dedupe-1.6.8-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.8-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7991badd75b7a52f7ea2e6f8dc94ad7c157f62d4f24ab05b823438ac5b649674
MD5 a81feb800473ddae083020e2e84b2325
BLAKE2b-256 5c1db60a4d5a7c6c6d2920aace3d43210da20327f44b9ae349a9d3cee5476e83

File details

Details for the file dedupe-1.6.8-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.8-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 f01f65cd6b49d055464f11d144852cc0df8602e63323b411c017fbe31fdfc09e
MD5 1ed5ec7c7d5154b1339953f3cf673adf
BLAKE2b-256 71a11c5463e351872ba52420c0ba7d7a80f61cc5234c1099802d8c5f981111b4

File details

Details for the file dedupe-1.6.8-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.8-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 e7fcf302728a87c72b4f4fe8118660c7931f7a4b35941ff5fc713f7e89a0a65c
MD5 2e542db599161d9a81c9af1b3b8b685c
BLAKE2b-256 8e31a3535eace537e94e13a99e5a97e9b01c34566ae5e34c3e4ab9ac76c912b6

File details

Details for the file dedupe-1.6.8-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.8-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 24c0c5f10d63cf176b84b1802e81f4d783c0fc8120e8fb4eecb059e9a4b8e1c3
MD5 7d740bb8f41401e122104a598a803f94
BLAKE2b-256 10d597ca5a76637b79cfd23a764ac0b75109c2b1b562e6e3e7233e5e240bfd55

File details

Details for the file dedupe-1.6.8-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.8-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 77e2de2c302be5eec5cec6545a91abda6bd7e8b9cc33b0c3ea930bae639ce310
MD5 32915c8414aac815fccc5707f85524fc
BLAKE2b-256 52ecdfb295b0a834640134e756f9465196f5262d145030a6af725bc541288c96

File details

Details for the file dedupe-1.6.8-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.8-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 72c314330039b085710c0776c3f0d330aa480048dec102b6a2a72d4fc4eac2c8
MD5 d4fe1c4c68b55d5b6be2af9d90737c4a
BLAKE2b-256 92afb61ac77a90fea5ec215c88457ad1ee93b5c5b7274e8bd34e2de3d6d8559b

File details

Details for the file dedupe-1.6.8-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.8-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6bbb0488368a938e2258f53252282833c8580fb186be9041b2ac2fac7e816ea6
MD5 55cfc0b7476df3b8cf78835e40169b90
BLAKE2b-256 7e1d10bd059d0261d478b33195304859b2c3b40199b8d098d0571bc354d4e91b

File details

Details for the file dedupe-1.6.8-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.8-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 6d3c54e3863faf610f5f3a3ee5651c95972c80f62c8ce277f1da566fbe34b11c
MD5 78573488e8221e189d6a4493d93dc68c
BLAKE2b-256 be5de22210d3812f90c4fbaeb588b65be40bbcaa8f284b9421122245574cf07f

File details

Details for the file dedupe-1.6.8-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.8-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 b2fb0679d7c645c1db37eb8f3680599d8f38cde39dad62259fc14c36e34aa933
MD5 8f97274731701112567462dfa295ef80
BLAKE2b-256 e82028a6f0839497552aff2eb8f671623f93f6d867270a91142fecd85570b83a

File details

Details for the file dedupe-1.6.8-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.8-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 1865bf25de5f307654298e906cd91c704d93abf32f118f890a83a36d72e94ac9
MD5 edd04148676eaba2144080c07703fc06
BLAKE2b-256 3f0100347c3cc679731d363b3c0c165c1273f16bc724107ab6c2d1edea8fc571

File details

Details for the file dedupe-1.6.8-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.8-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 31846783b86bd9e4ffbb1644b9666e2a674d8ceb4e66793715dbb6fe79a3b16d
MD5 1727619f19dc4a48cec1f6b73e0ed92f
BLAKE2b-256 3ad2d9264e0e303e59d5b3f431b1189d1088076c33432735b89e2e375aa48359

File details

Details for the file dedupe-1.6.8-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.8-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 ea5a5fc16d4a8f246bf8c5c6447a2608299a28bdc125415d0cd1400751d4e3fc
MD5 96f68e2e388699c3ee076d53fe41a40c
BLAKE2b-256 42bb6ee24afd848cad4cbc907bdafed36c8a935339db39d5186bf9a4d424f412

File details

Details for the file dedupe-1.6.8-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.8-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 1ca31705578a314c800a8706e00f46fe6cc9240566eba91a65785943bc00b237
MD5 e675dfb69ce8f050a60c11807ff96221
BLAKE2b-256 2a5f4537c3aa0d16880d46688a6aea3839d57f9943d331d9368d474360d390b1
