A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
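
To make the idea of matching records that were "entered slightly differently" concrete, here is a minimal sketch that uses only Python's standard library. It is not dedupe's API and not its learned model: it compares every pair of strings with `difflib.SequenceMatcher` against a hand-picked threshold, whereas dedupe derives its rules from human-labeled examples. The function name `find_similar_pairs`, the sample records, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_similar_pairs(records, threshold=0.85):
    """Return index pairs whose normalized strings are at least `threshold` similar.

    Illustrative only: a fixed threshold over all pairs, not a trained model.
    """
    # Lowercase and collapse whitespace so trivial differences do not matter.
    normalized = [" ".join(record.lower().split()) for record in records]
    pairs = []
    for (i, left), (j, right) in combinations(enumerate(normalized), 2):
        if SequenceMatcher(None, left, right).ratio() >= threshold:
            pairs.append((i, j))
    return pairs


records = ["Jon Smith", "John Smith", "Alice Jones"]
print(find_similar_pairs(records))  # [(0, 1)]
```

Comparing all pairs grows quadratically with the number of records, so a sketch like this is only practical for small lists.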

Project details

Version: 1.7.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.7.0.tar.gz (48.7 kB), Source

Built Distributions

  • dedupe-1.7.0-cp36-cp36m-manylinux1_x86_64.whl (74.5 kB), CPython 3.6m
  • dedupe-1.7.0-cp36-cp36m-manylinux1_i686.whl (71.3 kB), CPython 3.6m
  • dedupe-1.7.0-cp36-cp36m-macosx_10_11_x86_64.whl (50.2 kB), CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.7.0-cp35-cp35m-manylinux1_x86_64.whl (74.3 kB), CPython 3.5m
  • dedupe-1.7.0-cp35-cp35m-manylinux1_i686.whl (71.0 kB), CPython 3.5m
  • dedupe-1.7.0-cp34-cp34m-win_amd64.whl (50.9 kB), CPython 3.4m, Windows x86-64
  • dedupe-1.7.0-cp34-cp34m-win32.whl (50.2 kB), CPython 3.4m, Windows x86
  • dedupe-1.7.0-cp34-cp34m-manylinux1_x86_64.whl (74.5 kB), CPython 3.4m
  • dedupe-1.7.0-cp34-cp34m-manylinux1_i686.whl (71.2 kB), CPython 3.4m
  • dedupe-1.7.0-cp27-cp27mu-manylinux1_x86_64.whl (72.2 kB), CPython 2.7mu
  • dedupe-1.7.0-cp27-cp27mu-manylinux1_i686.whl (69.6 kB), CPython 2.7mu
  • dedupe-1.7.0-cp27-cp27m-win_amd64.whl (51.0 kB), CPython 2.7m, Windows x86-64
  • dedupe-1.7.0-cp27-cp27m-win32.whl (50.2 kB), CPython 2.7m, Windows x86
  • dedupe-1.7.0-cp27-cp27m-manylinux1_x86_64.whl (72.2 kB), CPython 2.7m
  • dedupe-1.7.0-cp27-cp27m-manylinux1_i686.whl (69.6 kB), CPython 2.7m
  • dedupe-1.7.0-cp27-cp27m-macosx_10_11_x86_64.whl (49.8 kB), CPython 2.7m, macOS 10.11+ x86-64
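
When choosing a file for your platform, it can help to read the tags embedded in each wheel filename. The sketch below splits a filename into its parts following the standard wheel naming convention (`name-version[-build]-python-abi-platform.whl`). The helper name `parse_wheel_filename` and the `WheelTags` tuple are hypothetical, introduced here only for illustration; installers such as pip perform this matching automatically.

```python
from collections import namedtuple

# Hypothetical container for the parts of a wheel filename.
WheelTags = namedtuple("WheelTags", "name version python_tag abi_tag platform_tag")


def parse_wheel_filename(filename):
    """Split a wheel filename into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename layout: %r" % filename)
    return WheelTags(*parts)


tags = parse_wheel_filename("dedupe-1.7.0-cp36-cp36m-manylinux1_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
# cp36 cp36m manylinux1_x86_64
```

Here `cp36` indicates CPython 3.6, `cp36m` is the ABI tag, and `manylinux1_x86_64` is the platform tag for 64-bit Linux.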

File details

Details for the file dedupe-1.7.0.tar.gz.

File metadata

  • Download URL: dedupe-1.7.0.tar.gz
  • Upload date:
  • Size: 48.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.0.tar.gz
Algorithm Hash digest
SHA256 48c20e0b886e903675cabddc3cf92853469aefbdc47814f9f342e427e5409185
MD5 db87a37e6012b3cfc906a0af1c4f633b
BLAKE2b-256 06c3c7f96459a930f63b991339fa8b9e93c029bc7288522b7a2b851c0977f107

See more details on using hashes here.
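
The published digests can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are assumptions for illustration, and the local file path is whatever location you downloaded the file to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading the source distribution, `matches_published_hash("dedupe-1.7.0.tar.gz", "48c20e0b886e903675cabddc3cf92853469aefbdc47814f9f342e427e5409185")` should return `True` if the file matches the SHA256 digest listed above.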

File details

Details for the file dedupe-1.7.0-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 de6e5e44d1c7c71be204023d034f8aeafe3ef599ee5f29910c3b017b5c269caf
MD5 39098ae0da7efcb6616e832a2b15a299
BLAKE2b-256 e000a0315ab4660284c5f996df325192b39268216976632015b6259b39079d62

File details

Details for the file dedupe-1.7.0-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.0-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b50374130ae67aa92c4975726488c49199dcc3def2159550d782374424001d26
MD5 b2fabe5edc3d255093672ed5cc769460
BLAKE2b-256 bfb9accee0801c56a2749499af1bad719e19634fb8597d3fd0aa6bc7f2c21406

File details

Details for the file dedupe-1.7.0-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.0-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 431bc40cbaba1af40ebcd11d887e31199c5db82713cf6a617054e55643da0315
MD5 38210badddd042048b17728d3e5d1fd5
BLAKE2b-256 13067332412b1819a83f4722e29eee36f501a13b16221b01f81aa3f7f71a99bc

File details

Details for the file dedupe-1.7.0-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.0-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 14867c705f71447a0ce7d3a8b0764c415ea0aa23d03033b21eba1eee8ec81ccd
MD5 a2e79d51add78b2a82c98e76c8102766
BLAKE2b-256 c0218df891069bba358bd9a5602dae3918c14a060b0f29af9d00266faad6661c

File details

Details for the file dedupe-1.7.0-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.0-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 3023b55629e69c2def61a18910765ba920651a7c47e9130ea9b0512dd90e34e1
MD5 3a56dbf1506487437d7b1a1caa622f5f
BLAKE2b-256 2842e499c228f7cea3a6b2d66e257385a883c1663bf9ae3a2da91fbb593b1c84

File details

Details for the file dedupe-1.7.0-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.0-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 47b550611a178964553e3a389dd185d0e75c1d0b4622f5ade99d4880e397bdb1
MD5 1b8a91e9368ede80b53530bd1525d998
BLAKE2b-256 be2396e7c706c241f461e9f2c568dc0f532e3f06dd0b695ec59e4f145f52a258

File details

Details for the file dedupe-1.7.0-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.0-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 34199ff18b8d3a262da90f56e535b7b2ca432369aafa1ff9f772e30b602ba48c
MD5 42e564fc1e78c4e5f281f128c9b6c1cb
BLAKE2b-256 2795bf4946a9b8534f1a54385d653a940e349d78e04b3a756c4d28ef63a1f0e4

File details

Details for the file dedupe-1.7.0-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.0-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 ed808d98e6a85bd5b7c362428ee8d6aea974f3d5c77744351f8748b07eff7e55
MD5 664db5f2a3ea69c608236eb70a7446da
BLAKE2b-256 412a24ec215820574d120fb6658bfa74221d8d36fdd71b4139ebd07ff62d61b7

File details

Details for the file dedupe-1.7.0-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.0-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 dfaa5855db83e500520e1b03a40ee3c285f86cc36520ae8857f0066c3481f030
MD5 122955097660fdff385aff22e3973a86
BLAKE2b-256 54bdc0ad286393bf858216efbbc38645b70c00e06255a42cffa5f5e24f1f9462

File details

Details for the file dedupe-1.7.0-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.0-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 92bf1ef22024d6813a5c3c3a069b1d9567fe1e49a6479982798aa78b53ee118c
MD5 d131044985a13f1346e4003f2335217a
BLAKE2b-256 85decd01f12c5077c2b0d8c3ab5ea12c8a835082313bc6db81cfc14cabf06343

File details

Details for the file dedupe-1.7.0-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.0-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 09586d2dd9cf4609cd5c7f23a34562adfe9c6d4acfb1f6af68b242c73d6b5070
MD5 163e31345157ab077f779c375797d23b
BLAKE2b-256 5f8869425fd6306ac197fb986b3e85a103750c6898612546e9984a593825a01a

File details

Details for the file dedupe-1.7.0-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.0-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 bcd35723689ca6aa79e26541b0e7e29ba01b382dbe39a91a6fc318e210e43fec
MD5 6dfd3e40953af54120e7d69d6841c7ba
BLAKE2b-256 27be58737ac6b90d89f1c4ffdd07d64267e25c6bf910830e26bc5b3b2a584d14

File details

Details for the file dedupe-1.7.0-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.0-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 9e2ddd9212420081c9f9d41f105290b1de6887b37c6aa9db83a920d4296943d1
MD5 b96969bd08b7bc796350e1d71b0e9c3b
BLAKE2b-256 4f119c191ec6cad898f84f7eae38d9ad7accc55a862d1d78f92008a357a6be66

File details

Details for the file dedupe-1.7.0-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.0-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 b37fa6e2e35fc4feb4884daa69c0e2baa7c9ce3996df4680cc6539cd036de4ac
MD5 276152de28c234ba6585c5bd795831d6
BLAKE2b-256 dca09f174e3a637633190f00aaf591a2fc76db223b1919f80eb5ef188cd0a9c4

File details

Details for the file dedupe-1.7.0-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.0-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 8c612a8b8ac4a272925fcf49b8871a560a94eb09694c6490339fc115c5044523
MD5 960d6da233f83f99bda775ee59b8b058
BLAKE2b-256 925c0ca2cb2aa0aeec2742f3c4ed7cb7ad428fb96e2a55ff7b70dcd76be0b68e

File details

Details for the file dedupe-1.7.0-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.0-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 b8653b143b75ecbbd01d6834f93864636163ac42c2d9447fe3ac9adddc4e3f32
MD5 99348b1045f8f53e854f07097021a104
BLAKE2b-256 f64dfd9fc05640388ad0fddebbc567e7391d6ec43821a358239f91845a8e49bb
