A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data the best rules for your dataset, then uses those rules to find similar records quickly and automatically, even in very large databases.
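To make the idea of matching records that were entered slightly differently concrete, here is a minimal sketch in Python. It is not dedupe's algorithm or API: dedupe learns its rules from labeled examples, whereas this sketch applies a fixed threshold to the standard library's difflib.SequenceMatcher. The function names, the sample records, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def likely_duplicates(records, threshold=0.85):
    """Return index pairs (i, j) whose name AND address both meet the threshold.

    This compares every pair of records, so it only suits small inputs.
    The threshold is a hand-picked assumption, not a learned value.
    """
    pairs = []
    for (i, r1), (j, r2) in combinations(enumerate(records), 2):
        same_name = similarity(r1["name"], r2["name"]) >= threshold
        same_address = similarity(r1["address"], r2["address"]) >= threshold
        if same_name and same_address:
            pairs.append((i, j))
    return pairs


# Hypothetical sample data: the first two rows differ by one letter in the name.
records = [
    {"name": "Jonathan Smith", "address": "12 Main Street"},
    {"name": "Jonathon Smith", "address": "12 Main Street"},
    {"name": "Maria Garcia", "address": "98 Oak Avenue"},
]

print(likely_duplicates(records))
```

Comparing every pair grows quadratically with the number of records, which is one reason a fixed-rule sketch like this does not scale to large databases.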

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dedupe-1.6.6.tar.gz (47.3 kB)

Uploaded Source

Built Distributions

dedupe-1.6.6-cp36-cp36m-manylinux1_x86_64.whl (73.9 kB)

Uploaded CPython 3.6m

dedupe-1.6.6-cp36-cp36m-manylinux1_i686.whl (70.6 kB)

Uploaded CPython 3.6m

dedupe-1.6.6-cp36-cp36m-macosx_10_11_x86_64.whl (49.6 kB)

Uploaded CPython 3.6m macOS 10.11+ x86-64

dedupe-1.6.6-cp35-cp35m-manylinux1_x86_64.whl (73.7 kB)

Uploaded CPython 3.5m

dedupe-1.6.6-cp35-cp35m-manylinux1_i686.whl (70.4 kB)

Uploaded CPython 3.5m

dedupe-1.6.6-cp34-cp34m-win_amd64.whl (50.3 kB)

Uploaded CPython 3.4m Windows x86-64

dedupe-1.6.6-cp34-cp34m-win32.whl (49.5 kB)

Uploaded CPython 3.4m Windows x86

dedupe-1.6.6-cp34-cp34m-manylinux1_x86_64.whl (73.9 kB)

Uploaded CPython 3.4m

dedupe-1.6.6-cp34-cp34m-manylinux1_i686.whl (70.6 kB)

Uploaded CPython 3.4m

dedupe-1.6.6-cp27-cp27mu-manylinux1_x86_64.whl (71.6 kB)

Uploaded CPython 2.7mu

dedupe-1.6.6-cp27-cp27mu-manylinux1_i686.whl (68.9 kB)

Uploaded CPython 2.7mu

dedupe-1.6.6-cp27-cp27m-win_amd64.whl (50.4 kB)

Uploaded CPython 2.7m Windows x86-64

dedupe-1.6.6-cp27-cp27m-win32.whl (49.5 kB)

Uploaded CPython 2.7m Windows x86

dedupe-1.6.6-cp27-cp27m-manylinux1_x86_64.whl (71.5 kB)

Uploaded CPython 2.7m

dedupe-1.6.6-cp27-cp27m-manylinux1_i686.whl (68.9 kB)

Uploaded CPython 2.7m

dedupe-1.6.6-cp27-cp27m-macosx_10_11_x86_64.whl (49.2 kB)

Uploaded CPython 2.7m macOS 10.11+ x86-64

File details

Details for the file dedupe-1.6.6.tar.gz.

File metadata

  • Download URL: dedupe-1.6.6.tar.gz
  • Upload date:
  • Size: 47.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.6.tar.gz
Algorithm Hash digest
SHA256 b37367432e923820c756d2722590ba99b93f0b4583f8861013973dc9a2bbc686
MD5 fc3da44ed86e0fa26b61fe27be1c1ef0
BLAKE2b-256 e6edc40826271acf1fe30403be98372fc5fdaa7e5775a9bb5aad69ca206f1fb7

See more details on using hashes here.
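The published digests can be used to check that a downloaded file is intact. The following is a minimal sketch using only Python's standard hashlib module. The function names and the local file path in the usage comment are hypothetical, and the file is assumed to have already been downloaded to the working directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the source distribution was saved under this name):
# matches_published_hash(
#     "dedupe-1.6.6.tar.gz",
#     "b37367432e923820c756d2722590ba99b93f0b4583f8861013973dc9a2bbc686",
# )
```

SHA256 is used here because it is the strongest of the listed algorithms that hashlib exposes under a single fixed name. A result of False means the file differs from the one that was published and should not be installed.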

File details

Details for the file dedupe-1.6.6-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.6-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7101519dc138025bd55d0076bc87c94f9cd51a299fd5ed218f0217ba6adefc6b
MD5 0d60cf908dd3ae97b45dc3fac1fd48ea
BLAKE2b-256 a7ab3adcc700bc6b99871fdb1e9acba2242eef109ec7c72d5dc24913e5c964bd

File details

Details for the file dedupe-1.6.6-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.6-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 a45fae5c5c5702bbe6fbde15509a61feca849986971fe0ad32075ad96401f522
MD5 307566e6a05a2253463e9622c067bce3
BLAKE2b-256 90012316c38451550a256b64875c2d0a7bae18c7a199f28fea7271bebaae7b68

File details

Details for the file dedupe-1.6.6-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.6-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 85c5cb2c1611729a86c04b5bb39df00e3070cb10801eb5e61af310f5fcb6cc79
MD5 a0d4613d1a5041940426bf7b3b68281c
BLAKE2b-256 935aa5f915b7c74a6252c45b2c89d6e341c196104024dd1239ca463e9fe8a8b9

File details

Details for the file dedupe-1.6.6-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.6-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 e85e418cff454ca8d734e52d4a53759bea4d30e986c4855a02f96ba1b282c08d
MD5 b623963e622f6fe7dc9e3d42de1efd3d
BLAKE2b-256 5a19748d97acb390424ce167cd9c9e82dff2a3a6873b1c0146ed73b0428d63c1

File details

Details for the file dedupe-1.6.6-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.6-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 d6d4dda8e1ad516c405e11b2c348c443d6e8d0a726592695394733ab8c78c125
MD5 09a0dbd81de578363ae26cd7f4b6a8a7
BLAKE2b-256 4985ec3dcce2f9024978bd3728d454093d2c22aecb06577b08d811289c6e72dc

File details

Details for the file dedupe-1.6.6-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.6-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 c0804cb39ffe5ee5ad277e9c919b69da127cca2f94605484b756f30f2601f79e
MD5 b091ca2b7876ac37dbccdaf2544cdc50
BLAKE2b-256 fede50a3cf8bbd395d567b105da6f25b6adda7a1d887b566083bb4cb5430d8dd

File details

Details for the file dedupe-1.6.6-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.6-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 996ae9b24b812e3c56fab198aab6dd5046ae5b6624c237f20915e8d0c9b828ee
MD5 616cb7fd9a109de106c7044f458807ce
BLAKE2b-256 7f8a317222e72380f1d05af53f9b7cabeea0c57ce554b34d72a4bfbd89b545e9

File details

Details for the file dedupe-1.6.6-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.6-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7c1c5593b1622319c442f7cd41ee969623ade08ed4dc18a1925a03d68806c8cb
MD5 65614a2cbc086f78ec581feef3b1bf1e
BLAKE2b-256 3abfe6163bf3aa53c56676478e8ed2cf6966f5af1dec52f2a0a576431a2227d8

File details

Details for the file dedupe-1.6.6-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.6-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 680d23f2ea261ad3b879161eec5edf4e8c2d703d87eefa7e447e6879914f472e
MD5 0626472cbfb543b569c265181e0ef030
BLAKE2b-256 4d39acfff374411239229a1d51ba4d544896d0abc69b128249fd364547960297

File details

Details for the file dedupe-1.6.6-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.6-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 e7bdd22113c6e51519f9a26d846bba3c095e873d618ba3e120011443888f6b95
MD5 0aa99e29d497b0210c9c0c6cbb5bc8b5
BLAKE2b-256 0f88b7421726a55cd016cd87e623c806749339a0613bafd15c5a3792458635b7

File details

Details for the file dedupe-1.6.6-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.6-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 0d8a3cc4d967bc17167df8ff4d5de418074fa04bc169485b46460586ce57ad73
MD5 6193491c348add5debda06cb0c03a8ae
BLAKE2b-256 1d6f401d6d35a31e9ab5f0fea59cc05aa2368e1bff0c8c3c2b061b049e839c05

File details

Details for the file dedupe-1.6.6-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.6-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 4fa75024bce7353ad7a5be6875b17a2da1532bae4f58d703ced9eed5835809ae
MD5 4eb4723d241581dab5e16769273fb209
BLAKE2b-256 79fd161b35be1ca70863a0510cefd1efc3ebe9dab1dd4d881873e3c2b2984a4c

File details

Details for the file dedupe-1.6.6-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.6-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 0a0d2cfad16db643f51c91c5890654003c76c517dcd7b35b97c1a6f80dbfad68
MD5 d1db526cff1548aa01425dc650ae03af
BLAKE2b-256 e3818819ee7cb531bc38bfcc8df62cf8ad809dc9c1ab503e26e69fbb6deb0dc8

File details

Details for the file dedupe-1.6.6-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.6-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 27d921b28d81a97c61fe134ff3544223fc4e41ed15248490d1e72f6e214f1cbc
MD5 4409808709e4f64659a6a7336535196b
BLAKE2b-256 509214874591e0b504e38d28f2c81062dbfbc2d7f50c3e7958210c50f329e949

File details

Details for the file dedupe-1.6.6-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.6-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 247b7a29760638234fdca8c59c9bac7d0d0d41b639ff5cfbdd5b92c1554d0b70
MD5 3539bfdef478957ad184149734d14fcf
BLAKE2b-256 75d80b67449f16efff99774398fa9bdf31a87e574be4688ab3a37fa41535d771

File details

Details for the file dedupe-1.6.6-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.6-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 ce8cdb7e244eea21b2b17289c1a5d8248258bb1094305817b59bd10df26297e9
MD5 66af8e20732ac499da4b6881978e6e2a
BLAKE2b-256 f84552d90b127d1b7f977a1fb87619d89404a5b514d98b69f74c246f21bcd09a
