A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data to find the best rules for your dataset, then applies them to find similar records quickly and automatically, even in very large databases.
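
The kind of fuzzy matching described above can be illustrated with a toy sketch. This is not dedupe's API or its learned model: it only uses Python's standard `difflib` to score pairs of records. The names `similarity` and `find_duplicates`, the sample records, and the fixed 0.8 threshold are all assumptions made for illustration, whereas dedupe learns its rules from training data instead of relying on a hand-picked cutoff.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(rec_a, rec_b):
    """Average case-insensitive string similarity over the fields both records share."""
    fields = sorted(set(rec_a) & set(rec_b))
    if not fields:
        return 0.0
    scores = [
        SequenceMatcher(None, rec_a[f].lower(), rec_b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def find_duplicates(records, threshold=0.8):
    """Return pairs of record ids whose similarity reaches the (hypothetical) threshold."""
    pairs = []
    # Compare every pair once; real tools avoid this quadratic scan on large data.
    for (id_a, rec_a), (id_b, rec_b) in combinations(sorted(records.items()), 2):
        if similarity(rec_a, rec_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Made-up sample data: records 1 and 2 are the same person entered differently.
records = {
    1: {"name": "Jonathan Smith", "address": "12 Main Street"},
    2: {"name": "Jonathon Smith", "address": "12 Main St."},
    3: {"name": "Maria Garcia", "address": "400 Oak Avenue"},
}

duplicates = find_duplicates(records)
print(duplicates)
```

Comparing every pair does not scale to large datasets, which is one reason a learned approach is useful.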

Download files

Source Distribution

  • dedupe-1.9.1.tar.gz (55.7 kB), Source

Built Distributions

  • dedupe-1.9.1-cp36-cp36m-manylinux1_x86_64.whl (79.7 kB), CPython 3.6m

  • dedupe-1.9.1-cp36-cp36m-manylinux1_i686.whl (76.1 kB), CPython 3.6m

  • dedupe-1.9.1-cp36-cp36m-macosx_10_12_x86_64.whl (53.3 kB), CPython 3.6m, macOS 10.12+ x86-64

  • dedupe-1.9.1-cp35-cp35m-manylinux1_x86_64.whl (79.5 kB), CPython 3.5m

  • dedupe-1.9.1-cp35-cp35m-manylinux1_i686.whl (75.9 kB), CPython 3.5m

  • dedupe-1.9.1-cp34-cp34m-win_amd64.whl (54.1 kB), CPython 3.4m, Windows x86-64

  • dedupe-1.9.1-cp34-cp34m-win32.whl (53.4 kB), CPython 3.4m, Windows x86

  • dedupe-1.9.1-cp34-cp34m-manylinux1_x86_64.whl (79.7 kB), CPython 3.4m

  • dedupe-1.9.1-cp34-cp34m-manylinux1_i686.whl (76.0 kB), CPython 3.4m

  • dedupe-1.9.1-cp27-cp27mu-manylinux1_x86_64.whl (76.9 kB), CPython 2.7mu

  • dedupe-1.9.1-cp27-cp27mu-manylinux1_i686.whl (73.9 kB), CPython 2.7mu

  • dedupe-1.9.1-cp27-cp27m-win_amd64.whl (54.0 kB), CPython 2.7m, Windows x86-64

  • dedupe-1.9.1-cp27-cp27m-win32.whl (53.2 kB), CPython 2.7m, Windows x86

  • dedupe-1.9.1-cp27-cp27m-manylinux1_x86_64.whl (76.9 kB), CPython 2.7m

  • dedupe-1.9.1-cp27-cp27m-manylinux1_i686.whl (73.9 kB), CPython 2.7m

  • dedupe-1.9.1-cp27-cp27m-macosx_10_12_intel.whl (58.7 kB), CPython 2.7m, macOS 10.12+ intel

File details

Details for the file dedupe-1.9.1.tar.gz.

File metadata

  • Download URL: dedupe-1.9.1.tar.gz
  • Upload date:
  • Size: 55.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.9.1.tar.gz
Algorithm Hash digest
SHA256 9bd2e0e21a924532b4c57b886389a5462682c16860237a53f5f67e3acae47b1a
MD5 1feb16c576cec72388f70a4a6bc21ae6
BLAKE2b-256 7c614985281e582e3b99daa94b87189e343398f89138e7fa4fd9c8d04f82dd1e

See more details on using hashes here.
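
As a minimal sketch of how the published digests can be used, the snippet below computes a file's SHA256 with Python's standard `hashlib` and compares it to an expected value. The helper names `sha256_of_file` and `matches_published_digest` and the chunk size are assumptions chosen for illustration. They are not part of dedupe or of any packaging tool.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True if the file's SHA256 equals the expected hex digest (case-insensitive)."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

Calling `matches_published_digest` with the path of a downloaded `dedupe-1.9.1.tar.gz` and the SHA256 value listed above should return `True` for an intact download.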

File details

Details for the file dedupe-1.9.1-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.1-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 afffad36c6ed4e5947489c69fa536689c23b94110773eebcd8deac85044fffef
MD5 012c53727144e8f6f6980a5b1d44e82b
BLAKE2b-256 e629065bdd5c46adccd67bbb294dddc956bbbb5e7c8244d5056754508e586c79

File details

Details for the file dedupe-1.9.1-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.1-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 6448eb83a52610eda3672cc05265ccb29b1c1b2cdd3e81288650094099d7411f
MD5 18f31e5ba0627543c929614e560d7262
BLAKE2b-256 f76d4d142cce5fad259bc9ca9353d5ebae6b4cc8342e4c96281b0a2d41a5b3f2

File details

Details for the file dedupe-1.9.1-cp36-cp36m-macosx_10_12_x86_64.whl.

File hashes

Hashes for dedupe-1.9.1-cp36-cp36m-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 44d11b9cca16c74ee20ed971ea984060cde330cd6a75a712e0e022f6d044dc30
MD5 d66c72e7c48d3006cb800982c725d565
BLAKE2b-256 9191700b5469e9e7d440e2ea79281e1ea1e48cabb7895855507ea423513d9569

File details

Details for the file dedupe-1.9.1-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.1-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7418bf5b94d79bbc3426abf9f7f54f35e294059ce545cd59e95b1533d66c2cf9
MD5 129aa7ce5e7c1ee37736d47dd2b3e3a6
BLAKE2b-256 0855396ad0bcaa2f0d5b6d3f46b58e8783fae2c4d149eebd409f8e7e096fb431

File details

Details for the file dedupe-1.9.1-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.1-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 6249683f4c0b68713713ea34379bcdca1d7c39a903a61894ffd841626e312a0c
MD5 23e4a80980a459a71e4c1be1baaaff58
BLAKE2b-256 294e542f6881f9251e4fc412a66e215715d9b36363a8d58aae26e7ccfc06f726

File details

Details for the file dedupe-1.9.1-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.9.1-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 fd98feae5cb34d876fa649eaded84c7f1b1ecf2bca43973710a269039f60187c
MD5 46a569191a42147644736909137a268c
BLAKE2b-256 501704bdb76d83e0941345578777394b9ffd2fe638d804357aa39db9102ee37e

File details

Details for the file dedupe-1.9.1-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.9.1-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 7278804db99fdf6f344dfd258145da3c7f9c70af65f1741337f8ed5e5e3f2c22
MD5 50b7f515e4c3354b1f0e7dd488978cf5
BLAKE2b-256 70923a2d9fc2b8ff0e7ad992b1c50ae8924ba01bfd6351b3a41efe8e397e7768

File details

Details for the file dedupe-1.9.1-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.1-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1d589b0b55135c92895e4de6261cead2cbb5d5598a075930c737420adb8020f0
MD5 d7e7fe7c0755b2faa84bd28b1a0362bc
BLAKE2b-256 64c50117c737451890490c7ffcc5537510a7d2b6b984e49b653d5ffb3f09e8b6

File details

Details for the file dedupe-1.9.1-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.1-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 9aa45f14253b7579df276d12cf6e3d4dd00397d5b1111a23e5c2f42fe912f26f
MD5 6c35ecf56b22bf878537f3dd8847f92c
BLAKE2b-256 be33b8780de5e2764c8515c13322ecf6e1276282d62ac0c87511420dcdaeaf56

File details

Details for the file dedupe-1.9.1-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.1-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7736ae0eb02d1db6ab2d21699628149ec5112931450a4eb80acba3da3bfd0bab
MD5 3936cb6e688a10a710dbb64315698f89
BLAKE2b-256 c5e73f35778a367163ff3c2549e39494f3c7e48bf71f326acaa20a9adc7df1ea

File details

Details for the file dedupe-1.9.1-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.1-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 f3bf6d3737349612eb57d5b4a1cc556bfae0ef10de5731ca521b7734bdfb7bb2
MD5 39f31d28c560b85a7a320c82e2de54af
BLAKE2b-256 7bfe8d8f1e0943d90b9211d4c504062d64241c62111a53b1aec25129e39e8b04

File details

Details for the file dedupe-1.9.1-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.9.1-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 967e3dc0df1930ab17922f1a53f316dcc68842d593ebf6e374be8905a8a58951
MD5 15fabbb1c37d8697950f3d3b4a934ed4
BLAKE2b-256 68c47cb56c52345bd85035198616f36a4a9495a2a4bb022d6fa7d24fe246562a

File details

Details for the file dedupe-1.9.1-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.9.1-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 fc91e6633484eebf723bd06bd8b0c501bbc8cd4ef461b6a716d21e9f8db0edc7
MD5 070fdc49bbcc42598a8217fff499c050
BLAKE2b-256 76dfcbcc3bb1e80d3603c1d05e3b82f9c7c630caceffd3d11cc7ed7d899a59fe

File details

Details for the file dedupe-1.9.1-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.1-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 9ab9d8d79588138a96bbab4966eb7656e697f131dd23b0c13dd0af92a61817fa
MD5 9ff44550116969284a76d48574eaba7e
BLAKE2b-256 45aa20783e8d2cba6c0edc75bf3b1d333dcdaaca7c569b4beff6bb5290c7ab78

File details

Details for the file dedupe-1.9.1-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.1-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 35733f5c1c8f182309813d8429c4396c07df7a09ee98fee80549fb2248604cca
MD5 b0f1c0ea877fe1a8006e8345915294a2
BLAKE2b-256 40e4b6bd1ec5cd9309a922b2709acb875dbbff286cbe2c44c15e7f14066167a8

File details

Details for the file dedupe-1.9.1-cp27-cp27m-macosx_10_12_intel.whl.

File hashes

Hashes for dedupe-1.9.1-cp27-cp27m-macosx_10_12_intel.whl
Algorithm Hash digest
SHA256 a22a987f2cf04a414005eed92b6f0c3c1b223e0b35dbf774174856dbdf93734e
MD5 8faa88fafb586cd0dcbdb1bb89494b46
BLAKE2b-256 45bef7ff2b5538943aca5e3b77ec07a7461f377704efc99f9a2fdb09a15943e7
