A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. It is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from training data labeled by a person and derives the best matching rules for your dataset, so it can find similar records quickly and automatically, even in very large databases.
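To make the idea of "similar records" concrete, here is a minimal sketch that uses only the Python standard library. It is not dedupe's API and not how dedupe works internally: dedupe learns its rules from labeled examples, whereas this toy compares every pair of records with difflib and a fixed threshold. The function names, the field names, and the threshold value are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_likely_duplicates(records, threshold=0.85):
    """Return pairs of record ids whose mean field similarity meets the threshold.

    Hypothetical helper: a fixed threshold stands in for the rules that
    dedupe would learn from training data.
    """
    pairs = []
    for (id_a, rec_a), (id_b, rec_b) in combinations(records.items(), 2):
        scores = [similarity(rec_a[f], rec_b[f]) for f in ("name", "address")]
        if sum(scores) / len(scores) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Made-up sample data: records 1 and 2 describe the same person,
# entered slightly differently.
records = {
    1: {"name": "Jon Smith", "address": "12 Main St"},
    2: {"name": "John Smith", "address": "12 Main Street"},
    3: {"name": "Maria Garcia", "address": "400 Oak Ave"},
}
duplicates = find_likely_duplicates(records)
print(duplicates)
```

Comparing every pair grows quadratically with the number of records, which is one reason a toy like this does not scale to large databases.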

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dedupe-1.7.5.tar.gz (50.4 kB view details)

Uploaded Source

Built Distributions

dedupe-1.7.5-cp36-cp36m-manylinux1_x86_64.whl (77.1 kB view details)

Uploaded CPython 3.6m

dedupe-1.7.5-cp36-cp36m-manylinux1_i686.whl (73.7 kB view details)

Uploaded CPython 3.6m

dedupe-1.7.5-cp36-cp36m-macosx_10_11_x86_64.whl (51.9 kB view details)

Uploaded CPython 3.6m macOS 10.11+ x86-64

dedupe-1.7.5-cp35-cp35m-manylinux1_x86_64.whl (76.9 kB view details)

Uploaded CPython 3.5m

dedupe-1.7.5-cp35-cp35m-manylinux1_i686.whl (73.5 kB view details)

Uploaded CPython 3.5m

dedupe-1.7.5-cp34-cp34m-win_amd64.whl (52.5 kB view details)

Uploaded CPython 3.4m Windows x86-64

dedupe-1.7.5-cp34-cp34m-win32.whl (51.8 kB view details)

Uploaded CPython 3.4m Windows x86

dedupe-1.7.5-cp34-cp34m-manylinux1_x86_64.whl (77.0 kB view details)

Uploaded CPython 3.4m

dedupe-1.7.5-cp34-cp34m-manylinux1_i686.whl (73.7 kB view details)

Uploaded CPython 3.4m

dedupe-1.7.5-cp27-cp27mu-manylinux1_x86_64.whl (74.8 kB view details)

Uploaded CPython 2.7mu

dedupe-1.7.5-cp27-cp27mu-manylinux1_i686.whl (72.0 kB view details)

Uploaded CPython 2.7mu

dedupe-1.7.5-cp27-cp27m-win_amd64.whl (52.7 kB view details)

Uploaded CPython 2.7m Windows x86-64

dedupe-1.7.5-cp27-cp27m-win32.whl (51.8 kB view details)

Uploaded CPython 2.7m Windows x86

dedupe-1.7.5-cp27-cp27m-manylinux1_x86_64.whl (74.7 kB view details)

Uploaded CPython 2.7m

dedupe-1.7.5-cp27-cp27m-manylinux1_i686.whl (72.0 kB view details)

Uploaded CPython 2.7m

dedupe-1.7.5-cp27-cp27m-macosx_10_11_x86_64.whl (51.6 kB view details)

Uploaded CPython 2.7m macOS 10.11+ x86-64

File details

Details for the file dedupe-1.7.5.tar.gz.

File metadata

  • Download URL: dedupe-1.7.5.tar.gz
  • Upload date:
  • Size: 50.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.5.tar.gz
Algorithm Hash digest
SHA256 6dc1065b4e42baebcabc57e4ebb58c7d1b7d125d737371f7f47137f724b18dfc
MD5 d5e1ec08fac2f48b8a1fe877eca3fa15
BLAKE2b-256 5a3c0769fcf1ef24cb0b2572dbfaa1fc9ba0e0809e62cedd0cf683722328952c

See more details on using hashes here.
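One way to use the digests listed on this page is to recompute the SHA256 of a downloaded file and compare it with the published value. The sketch below assumes the file has already been downloaded to a local path; the helper names sha256_of_file and matches_published_hash are hypothetical and not part of dedupe or any packaging tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be matches_published_hash("dedupe-1.7.5.tar.gz", "6dc1065b4e42baebcabc57e4ebb58c7d1b7d125d737371f7f47137f724b18dfc"), assuming the archive sits in the current directory. A False result means the file differs from the one that was published and should not be installed.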

File details

Details for the file dedupe-1.7.5-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.5-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 30274ee7774c27c5ad176296c8a1bd5f44840cc5ab6b7597c31b4b90a203aab8
MD5 47f40ca9c01b11bc911285b515bb2bd2
BLAKE2b-256 9c4c980aabf9269ab9bbe6b70ddbd222c6c3620db56429d5c29727dcbd902742

File details

Details for the file dedupe-1.7.5-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.5-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 ac799597ae193e341a6d69ec49d910d27fe6c0a8a69827b36fae7d3151842a6e
MD5 b1e595aac61b6fa4ef7b90adf3fa45f1
BLAKE2b-256 85171fc670a68ac3f4d8fae516baa7884255582ac57f3871782df0d37e3b350e

File details

Details for the file dedupe-1.7.5-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.5-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 d4a9856f12cb83a5c0f1a11d809f447f11e4a439824a3b8e75f0adc1fcf495db
MD5 c6ed59b4c8552d47b13d302c63d03c67
BLAKE2b-256 f01b14b5b451d5d3dcc9203d07efb305608d6a29aeb9e6a35dfb8c0ec7fe7157

File details

Details for the file dedupe-1.7.5-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.5-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 5c19d5d6ba76818a726074374104e1f9f2c372927bdedea985035af850eda4ec
MD5 4d2247c7cc944bab92a28d0de616105c
BLAKE2b-256 f0274efcd0688271c21ddff6d491097780ecbf98e0cc9d0fa60cf86f39664bd6

File details

Details for the file dedupe-1.7.5-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.5-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 d8c3934cc246655167748d0924a87624a78130ab46ef4f4a2a0eeb3145980ce9
MD5 ad0ada62e0aa2729f37e8e9017760a94
BLAKE2b-256 a46cfb070ccaa435c042a3d9d8e6d06fbfd200faae1e8ccb2dfc2c6a465307ec

File details

Details for the file dedupe-1.7.5-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.5-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 7d9ba68864448a23bd2873c84e56f681f835d23cd4baac2771d378484994a00f
MD5 56414647404ef1408b224c982b957e5e
BLAKE2b-256 2d4464c9d0bb4783865cf86d8f17057842b432c9b1fd5f6733a35ae9ccee85ad

File details

Details for the file dedupe-1.7.5-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.5-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 dcdfdd15605a519c544e14cbc3e7421cf3558dba57dd0560e90d05bea8a8e2a5
MD5 a77215310aafb650343d7442c4613366
BLAKE2b-256 773b053406b59eac07509d97fe489dda327ed888da5181b307935202d8d67183

File details

Details for the file dedupe-1.7.5-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.5-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a7fd9d65d5bf5a5b79c06612a9c6d1a73b9dfe43b7fae3816583bf79107ea16e
MD5 50d87aee4a1c2d04bc4b2a6b42241019
BLAKE2b-256 bafe8d2714a7fe8c2786fe7822e4d0d03118096ffc80413d4bad0cdbd1163e53

File details

Details for the file dedupe-1.7.5-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.5-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 1057c1c948e9bcd58f5de8c8d8cba5c1cb1da2caa84150c7c1ccdac52284752c
MD5 39e11caf26846688dbbf63c3635bf148
BLAKE2b-256 e558f451255a6f254f3e29f25c1e3c669c126ed1eadb4bcb38cc25803970b777

File details

Details for the file dedupe-1.7.5-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.5-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c8963b5b39f2acc94e2ed32e40560d05273e7f61aa1e8fa9ee60605677ff5a65
MD5 159cc6c96fa1fde0d82230adbcc52d8d
BLAKE2b-256 080506ec9c69b0fbeeee107a74d2264d524b630b319de234c517e5366d9cb614

File details

Details for the file dedupe-1.7.5-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.5-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 ddee15285f8236719c0e38a264df2fedaa3d103bdb4512183a029ad10bdd054b
MD5 318c22513e9a1fe086ccbed51097f407
BLAKE2b-256 2fe48e20ac63476263b9b7cc6b932fa78f9c7794025c12b6c936afef60682496

File details

Details for the file dedupe-1.7.5-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.5-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 388e8cd1a1a8b859aa2b64690871f0f86257dd777e03b66d15a980da3ab32cd4
MD5 f97248164725aaaecb9e414a9c87c5e4
BLAKE2b-256 b6cbba46b6d250e6c989c8c0d1568c84815422eea9819d2bba6c16e52b825d8f

File details

Details for the file dedupe-1.7.5-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.5-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 a01e941ada31425d21191bc399c85ef5d6e6859a76cf34971d202e91bc3d2d67
MD5 f79f676a21cd3e44ad1d9b5c5d60dd37
BLAKE2b-256 147daacaa6339c8f7a0f942583bd1244cb2532f94b035f1f58edd85f445198c9

File details

Details for the file dedupe-1.7.5-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.5-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 2535d3aa94b3820995874cec2c04dea2c64005c466e4a6a25d4fe48bc7bd7959
MD5 045756bcc34110cdccd783063835a7c7
BLAKE2b-256 0160d6bd46f3a3caa52c27bc3b717ee98b5af1e5435f81df15449890e5dfa929

File details

Details for the file dedupe-1.7.5-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.5-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 15d01428d13b68ead42f3b88db135fcd9d159acc1c81acb0cf343bff764801c9
MD5 f75e6eb81c695fda8cd6fde4d4646511
BLAKE2b-256 df62bfd65161b5c2c805bb79a152068d4f7c7edb1f76c1e91360cc07dbf8bbfe

File details

Details for the file dedupe-1.7.5-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.5-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 80c25bea634060652a6ffbaa1ef2c44a60a2897ca006043e47062d427c04ae7a
MD5 206f1226c05121a13807876752058087
BLAKE2b-256 9bf6b3ce9fa611e62267aa0e7cbebcc5416d07296f56c011ebe1e06b0312273e
