A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data the best rules for your dataset, then applies them to find similar records quickly and automatically, even in very large databases.
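To make the idea of "similar records" concrete, here is a minimal toy sketch. It is not dedupe's method and does not use dedupe's API: it swaps in plain string similarity from the standard library's difflib with a hand-picked threshold, whereas dedupe learns its rules from labeled examples. The field names, the sample records, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_duplicates(records, threshold=0.85):
    """Return index pairs whose name and address are, on average,
    at least `threshold` similar.

    Compares every pair of records, so this only suits small lists.
    """
    pairs = []
    for (i, first), (j, second) in combinations(enumerate(records), 2):
        score = (
            similarity(first["name"], second["name"])
            + similarity(first["address"], second["address"])
        ) / 2
        if score >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sample data: the first two rows describe the same person.
records = [
    {"name": "Jon Smith", "address": "123 Main St"},
    {"name": "John Smith", "address": "123 Main Street"},
    {"name": "Maria Garcia", "address": "9 Oak Ave"},
]

duplicates = likely_duplicates(records)
print(duplicates)
```

A fixed threshold like this is exactly what a trained model is meant to replace: the right fields, weights, and cutoffs differ from one dataset to the next.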

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dedupe-1.7.8.tar.gz (52.1 kB view details)

Uploaded Source

Built Distributions

dedupe-1.7.8-cp36-cp36m-manylinux1_x86_64.whl (76.9 kB view details)

Uploaded CPython 3.6m

dedupe-1.7.8-cp36-cp36m-manylinux1_i686.whl (73.6 kB view details)

Uploaded CPython 3.6m

dedupe-1.7.8-cp36-cp36m-macosx_10_11_x86_64.whl (51.8 kB view details)

Uploaded CPython 3.6m macOS 10.11+ x86-64

dedupe-1.7.8-cp35-cp35m-manylinux1_x86_64.whl (76.7 kB view details)

Uploaded CPython 3.5m

dedupe-1.7.8-cp35-cp35m-manylinux1_i686.whl (73.4 kB view details)

Uploaded CPython 3.5m

dedupe-1.7.8-cp34-cp34m-win_amd64.whl (52.4 kB view details)

Uploaded CPython 3.4m Windows x86-64

dedupe-1.7.8-cp34-cp34m-win32.whl (51.7 kB view details)

Uploaded CPython 3.4m Windows x86

dedupe-1.7.8-cp34-cp34m-manylinux1_x86_64.whl (76.9 kB view details)

Uploaded CPython 3.4m

dedupe-1.7.8-cp34-cp34m-manylinux1_i686.whl (73.5 kB view details)

Uploaded CPython 3.4m

dedupe-1.7.8-cp27-cp27mu-manylinux1_x86_64.whl (74.6 kB view details)

Uploaded CPython 2.7mu

dedupe-1.7.8-cp27-cp27mu-manylinux1_i686.whl (71.9 kB view details)

Uploaded CPython 2.7mu

dedupe-1.7.8-cp27-cp27m-win_amd64.whl (52.5 kB view details)

Uploaded CPython 2.7m Windows x86-64

dedupe-1.7.8-cp27-cp27m-win32.whl (51.7 kB view details)

Uploaded CPython 2.7m Windows x86

dedupe-1.7.8-cp27-cp27m-manylinux1_x86_64.whl (74.6 kB view details)

Uploaded CPython 2.7m

dedupe-1.7.8-cp27-cp27m-manylinux1_i686.whl (71.9 kB view details)

Uploaded CPython 2.7m

dedupe-1.7.8-cp27-cp27m-macosx_10_11_x86_64.whl (51.5 kB view details)

Uploaded CPython 2.7m macOS 10.11+ x86-64
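If it is unclear which built distribution fits a given platform, the filenames above can be read mechanically. The sketch below assumes the standard five-part wheel naming convention (name, version, Python tag, ABI tag, platform tag, with no optional build tag), which matches every wheel listed here. The function name and the returned keys are hypothetical.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated tags.

    Assumes the five-part form name-version-python-abi-platform.whl
    (no optional build tag).
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        "python": python_tag,      # e.g. cp36 = CPython 3.6
        "abi": abi_tag,            # e.g. cp36m
        "platform": platform_tag,  # e.g. manylinux1_x86_64
    }


info = parse_wheel_filename("dedupe-1.7.8-cp36-cp36m-manylinux1_x86_64.whl")
print(info)
```

In practice pip makes this selection automatically; the parser is only meant to show what each part of the filename encodes.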

File details

Details for the file dedupe-1.7.8.tar.gz.

File metadata

  • Download URL: dedupe-1.7.8.tar.gz
  • Upload date:
  • Size: 52.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.8.tar.gz
Algorithm Hash digest
SHA256 f00a9c39d5b1c582d6baf95bdf7e1787abb7c804e1b664b60c9385ba85a822f5
MD5 e4047beff6e0c0ef549419388595a375
BLAKE2b-256 b4ac7e30d5ece2b2be96b3c95f910b019869ef7490910eb9b1c3fd32ace66da7

See more details on using hashes here.
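As a minimal sketch of how a published digest can be checked, the following uses Python's standard hashlib module to compute the SHA256 of a downloaded file and compare it with the listed value. The helper names and the chunk size are assumptions; the local file path is whatever the download was saved as.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True if the file's SHA256 equals the published digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading the source distribution, `matches_expected("dedupe-1.7.8.tar.gz", "f00a9c39d5b1c582d6baf95bdf7e1787abb7c804e1b664b60c9385ba85a822f5")` should return True if the file is intact.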

File details

Details for the file dedupe-1.7.8-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.8-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 fb788e399ef45a174f23c1bde4d59f054300b514877d47951fc09c3ad5c2c4d6
MD5 2a75e5245432cd698f25105a2a320345
BLAKE2b-256 bd6de762685f3030ae00a9b85361f35850532f46c978ad5b62ede4c3320acb54

File details

Details for the file dedupe-1.7.8-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.8-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 3a38c6958c82e16d09c78fb068022b0c9aa1575d3c3a617acf057305b0054d3d
MD5 8dd28e5358cd446c6e379857415c8db5
BLAKE2b-256 976bcd689bbfa8eb68952f34ad416a86eb4fd6cecc47045e706ea06a1d6004a5

File details

Details for the file dedupe-1.7.8-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.8-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 895894d929b896ced0c6cf61692c9e8f554e68b6f67bcff81e4bfb8e2544eb7a
MD5 eb224b8ac9e7d6bd99ceeccf2e701b86
BLAKE2b-256 f0d81f8f49b3ea51c17e8b6696223d0c46e4e044f7f2c45647c58bad060e6563

File details

Details for the file dedupe-1.7.8-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.8-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 706de3e5bd590fc38856b7afbdfb887689493fa41945d06c835f70c252eb8ff5
MD5 24da3d0d3d2a2f3822a60b104c8e9a5e
BLAKE2b-256 3a5598fd89a04106d898eb84a2df4538a54fab462e0c8ff9b4571744227c06db

File details

Details for the file dedupe-1.7.8-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.8-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 33fe35354cf2136f1d804cec36b8d8003df8837722873989c6e44072111989ff
MD5 13d58167f1b5e52ba7300faa376bfbc9
BLAKE2b-256 9acf4e65ab27e10a9732aa0bdc7af8daeb3379797040b5c50c1cb976cd1a090d

File details

Details for the file dedupe-1.7.8-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.8-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 ce9d596c181756b7b822a174a040930fb084b3c71db7f0c0095a419405077568
MD5 c078baef823679a13c733bc8e4b07d97
BLAKE2b-256 6d88d2423415764ff2ee00074c9355f38a9b94bbe44c1cd3fd9657e95802ff2f

File details

Details for the file dedupe-1.7.8-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.8-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 cef2c84e740ec0ffd1a4fc98d722014463ecec3961774eec7404ef4387503e05
MD5 29f04667a55e0d112274a970965e3feb
BLAKE2b-256 d2badc4cb90695ca595b7b517b5a601f5a31d04a284b18fd376d45651c1d62c0

File details

Details for the file dedupe-1.7.8-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.8-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 fa29005d9369d72395ae94b9d4e5b6396096147d3a50f0e2ca6ffc5e860f8414
MD5 c80013380a6ceb48c5ff52fb61e68a70
BLAKE2b-256 552156032b55d1b59f00542d97e964524d4a0aeb76d5c8f34e0273521ff23472

File details

Details for the file dedupe-1.7.8-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.8-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 f6ea2cb0d9a9499a3b3f5283997a59236b96e2f7999ff3937399cb29dc104423
MD5 2fa62e775a04098e0c9553d9e91e1a0d
BLAKE2b-256 bcdcf80abbfa357300558bee092827138319026425f1979b4378a2615ae604c4

File details

Details for the file dedupe-1.7.8-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.8-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 91b7e2ae17920f41a8c9922bd279d6fca27acd524b0153edc0866642494c49e1
MD5 fb77085b6356d0d4ef8eb189ebeb51a9
BLAKE2b-256 a852361aaadee90a5fef8ad0215cd68a28fb2908b984c6c479ec671125d6b493

File details

Details for the file dedupe-1.7.8-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.8-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 fa26a9407e1216f29d27dac77952a651c61cce9a6d64be5fd8435331c95d56b3
MD5 ec35f12648ccc5602ae1b9a3dbe58013
BLAKE2b-256 3432e1b94404aa840afcdf55eb4ca8907db2a564f926830f2b6a7d21ab9b107c

File details

Details for the file dedupe-1.7.8-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.8-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 058c8a646f104313cade52b3855d5437dd49bb94ee2e4fc43170a662976e005c
MD5 83e4778fc17c8e1e3e53a6000462d451
BLAKE2b-256 053afefd7ce2230dfc986ead30d36b3e19799d8f375c3314d9b032bac70e7ac2

File details

Details for the file dedupe-1.7.8-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.8-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 aaefe419b3e322234627ea7396dd105f158603f6ae9049ac4994ac6ca4fe329a
MD5 5ca2b6e531cc62139e5b4618196d0bbb
BLAKE2b-256 cb84d2bb57da10447ae4137cbfd839f16487f451b930dd0159430b43667d4978

File details

Details for the file dedupe-1.7.8-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.8-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7c81ec70320efea321b413eea5e00488a449e19131f71db2449d748da65b5880
MD5 9a05f91ff83af90ffcae64caa346fdc8
BLAKE2b-256 69e4fe557e2bcdbb6c9e2c252b7dd3acf3c6c70f3dc03d960726bd2b99c6475f

File details

Details for the file dedupe-1.7.8-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.8-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b08b38ae8f7b550fd5e36fb8329a5e3fa0e5df38402f1c7a534ebbda05cb232e
MD5 e0fc001df24959cd9e8967bba004ab8d
BLAKE2b-256 c2f47ce9c95e29064c954919e35c888aa411455c7aaaab2838b9fb62221e138c

File details

Details for the file dedupe-1.7.8-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.8-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 dbbb981e28792134f302cb7f8ffb97c04bf99cbab8fafbf9335c7482e005b69f
MD5 452e4aeb8b1c35b0db1b7b8c17260ccc
BLAKE2b-256 040ded84f99288714323015596468ac46bef128f6af0180a38f0225f43293a4b
