A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. It is the open source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data the best rules for your dataset, then uses them to find similar records quickly and automatically, even in very large databases.
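
To make the idea of "similar records" concrete, here is a minimal sketch that uses only the Python standard library. It is not dedupe's API or its learned model: the `similarity` and `likely_duplicates` helpers, the sample records, and the fixed 0.8 threshold are all assumptions chosen for illustration, whereas dedupe derives its comparison rules from labeled examples.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 string similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def likely_duplicates(records, threshold=0.8):
    """Return pairs of record ids whose averaged field similarity meets the threshold.

    Assumes every record has the same fields. The threshold is hand-picked
    for this example, not learned from training data.
    """
    pairs = []
    for (id_a, rec_a), (id_b, rec_b) in combinations(sorted(records.items()), 2):
        scores = [similarity(rec_a[field], rec_b[field]) for field in rec_a]
        if sum(scores) / len(scores) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical sample data: records 1 and 2 describe the same person.
records = {
    1: {"name": "Jon Smith", "address": "12 Main Street"},
    2: {"name": "John Smith", "address": "12 Main St."},
    3: {"name": "Maria Garcia", "address": "400 Oak Avenue"},
}

print(likely_duplicates(records))
```

Comparing every pair of records, as this sketch does, grows quadratically with the number of records, so it only suits small inputs.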

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.6.16.tar.gz (48.7 kB): Source

Built Distributions

  • dedupe-1.6.16-cp36-cp36m-manylinux1_x86_64.whl (74.5 kB): CPython 3.6m
  • dedupe-1.6.16-cp36-cp36m-manylinux1_i686.whl (71.2 kB): CPython 3.6m
  • dedupe-1.6.16-cp36-cp36m-macosx_10_11_x86_64.whl (50.2 kB): CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.6.16-cp35-cp35m-manylinux1_x86_64.whl (74.3 kB): CPython 3.5m
  • dedupe-1.6.16-cp35-cp35m-manylinux1_i686.whl (71.0 kB): CPython 3.5m
  • dedupe-1.6.16-cp34-cp34m-win_amd64.whl (50.9 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.6.16-cp34-cp34m-win32.whl (50.2 kB): CPython 3.4m, Windows x86
  • dedupe-1.6.16-cp34-cp34m-manylinux1_x86_64.whl (74.5 kB): CPython 3.4m
  • dedupe-1.6.16-cp34-cp34m-manylinux1_i686.whl (71.2 kB): CPython 3.4m
  • dedupe-1.6.16-cp27-cp27mu-manylinux1_x86_64.whl (72.2 kB): CPython 2.7mu
  • dedupe-1.6.16-cp27-cp27mu-manylinux1_i686.whl (69.5 kB): CPython 2.7mu
  • dedupe-1.6.16-cp27-cp27m-win_amd64.whl (51.0 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.6.16-cp27-cp27m-win32.whl (50.2 kB): CPython 2.7m, Windows x86
  • dedupe-1.6.16-cp27-cp27m-manylinux1_x86_64.whl (72.1 kB): CPython 2.7m
  • dedupe-1.6.16-cp27-cp27m-manylinux1_i686.whl (69.5 kB): CPython 2.7m
  • dedupe-1.6.16-cp27-cp27m-macosx_10_11_x86_64.whl (49.8 kB): CPython 2.7m, macOS 10.11+ x86-64
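
Wheel filenames encode which interpreter and platform each file targets, following the `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl` convention from PEP 427. The sketch below splits a filename into those parts to help pick the right file. `parse_wheel_filename` and `WheelTags` are hypothetical names, and the helper assumes there is no optional build tag, which holds for the files listed above.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "distribution version python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel filename (without a build tag) into its five components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected number of components in %r" % filename)
    return WheelTags(*parts)


tags = parse_wheel_filename("dedupe-1.6.16-cp36-cp36m-manylinux1_x86_64.whl")
print(tags.python, tags.abi, tags.platform)
```

In practice pip performs this matching itself and falls back to the source distribution when no wheel fits the running interpreter.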

File details

Details for the file dedupe-1.6.16.tar.gz.

File metadata

  • Download URL: dedupe-1.6.16.tar.gz
  • Upload date:
  • Size: 48.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.16.tar.gz
Algorithm Hash digest
SHA256 3192577dbfc55883fc31c3d48cb67a6e69cacc72104182d9ab5de8499d6317e1
MD5 0bab5a852d9c832422ef44a93eecc420
BLAKE2b-256 47a4d60dca42d1884360ccdf62ebc3368949cc7292304518094c84e1cc134706

See more details on using hashes here.
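
One way to use these digests is to recompute the SHA256 of a downloaded file and compare it with the published value. The following is a minimal sketch built on Python's standard `hashlib`. `sha256_of_file` and `verify_download` are illustrative helper names, and the local path in the commented usage line is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Return True when the file's SHA256 matches the published hex digest."""
    return sha256_of_file(path) == expected_sha256.strip().lower()


# Published SHA256 for dedupe-1.6.16.tar.gz, copied from the listing above.
EXPECTED = "3192577dbfc55883fc31c3d48cb67a6e69cacc72104182d9ab5de8499d6317e1"

# Usage, assuming the archive sits in the current directory:
# verify_download("dedupe-1.6.16.tar.gz", EXPECTED)
```

SHA256 is the digest to prefer here. The MD5 value is listed alongside it but is not suitable for detecting deliberate tampering.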

File details

Details for the file dedupe-1.6.16-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.16-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 9718de6e996bcca8534f1234c36bb1390e5e70fcd289eead8dcdbd9da4df1460
MD5 309cbedc1a54f1b650e7fa88b5df90fb
BLAKE2b-256 9e52d04bc986c193b0e2cf04d467ad85248af02c50e67d121f0325bebb7759cc

File details

Details for the file dedupe-1.6.16-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.16-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 05a42ee8a82048a92783f40adaf39fa8f20ab5728fa35eb8d714077d9e2bc231
MD5 b8e7cc1b8a0f56f54866b3d1dd003095
BLAKE2b-256 97d0ee871efbdeea3cb1b936525d09e9fd50dd9cd32aca20ed8addb10fe46ce9

File details

Details for the file dedupe-1.6.16-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.16-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 d98636c431e81f6c8627e1b718c50a8c8a34b9fd81e363c9c605ac23a57c1420
MD5 b3c67eff66bfb30d4ffcc47075dea30b
BLAKE2b-256 81ef3f192a725c264137206abe12b4ed8aee2eb76263e041ef177dfd7441a446

File details

Details for the file dedupe-1.6.16-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.16-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 2ff22cad4fd79998a5c6614e2c9754db474f676b9a566ec9e0cf5564b11cdbee
MD5 5a26f6e6c34ea1cc37d5f1cea2bc5eae
BLAKE2b-256 4dbe93ff9547edd6eb3202922a3ec4237b05b75e2374785d18b671582b0e6606

File details

Details for the file dedupe-1.6.16-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.16-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 868288b72cd8c219b10272ae17719549cce37250fc36e9e9eb12c603758bc166
MD5 1870ef026c009abc723392d9b9f518b2
BLAKE2b-256 e14ec677d2839c7097dd415f72a48a6c25438f096c22249c77c34f70709c4015

File details

Details for the file dedupe-1.6.16-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.16-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 11bcd04115d3e348d7eee462e47929d9c013c7cc6ec8eea7084f8d003c828dbf
MD5 8b55f8f5259af98dabb1e8c225d30188
BLAKE2b-256 4923948add8772b5e31ed2abd047b6d73de62f2c513dc02d91f01dceb5a151f9

File details

Details for the file dedupe-1.6.16-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.16-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 c5894c83458bb2c49feb80dca4f9ff881a306c0b4ba82342710de0aedf3ccd88
MD5 047e54e2402252eb3c996fd0bee9acaa
BLAKE2b-256 c47bafa4b647451b1d9bbc9c0e102e8ddaf981a58c4e51e6a43cb611f9291fc5

File details

Details for the file dedupe-1.6.16-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.16-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 5140f08bd8071da49ab12d29c05f7cc1ba22ed35d1364b244f45c72f2946b051
MD5 061b9423fa8c8f0ad95619874524e291
BLAKE2b-256 e10ceca09d628c80dc25e11b3b9b66caa9db838b8c25d3ef49f8038953f0ff9d

File details

Details for the file dedupe-1.6.16-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.16-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 c40423b62108b87bb42368cd100a17b19ddc4a8610ebcb6dfa12bbaf95f1a81c
MD5 276e53aef5f3cc6527467a99933b04b7
BLAKE2b-256 339ad03648e82fce0f1bb1817c73ca3be5758e34b8415c2ef2517f7d578046ed

File details

Details for the file dedupe-1.6.16-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.16-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 41e2d5e72e913e0a89b4db8c641aedfc2d4ba9b865c025d56da325a7ad68203a
MD5 329e2e015a96f82e73a9055c53ca5078
BLAKE2b-256 c907cd75c0627c89ba388966ddbd3b13a7a5dabed54d73ccccf81aa87a5267cd

File details

Details for the file dedupe-1.6.16-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.16-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 aa165f9f889d3c58808013683b52ce7e9a6bcdd9592a9885722a6c2b2640811a
MD5 c6828df194d76aa8a7bcb67bb9fbcecc
BLAKE2b-256 3bc09e02cc9d28b3eab9ddc577136d81603c4e00d175d4a70843f0987990fe2e

File details

Details for the file dedupe-1.6.16-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.16-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 5d5db81767b9a8121a34ff773526b3e00b2cd0c29288246205dc200aecbfe3b5
MD5 a6118c8cf57173d9e6ded7366d854a81
BLAKE2b-256 06004eee814700defe03979788c918a30cdcb586bc4592511c393ccc38976fd3

File details

Details for the file dedupe-1.6.16-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.16-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 12bcae3b0b3dd7e63215509d6a0f28e98420c133d596f922d5ca8e9bc0b907cc
MD5 a28b93696781471183360ca4d0b768f6
BLAKE2b-256 2c6fc15b9a454768e48b9dfc4b29664f400442b99396eee78a8ad37609132236

File details

Details for the file dedupe-1.6.16-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.16-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 624f62b4e9094cce4a067b158ef4a4e3d7945769cbbe2d8d0f082f64382f0c8a
MD5 97526ddc55eb9059ea3b8769c2a9d67f
BLAKE2b-256 10521bc3b057757754af66b4afe9918bf59700b51fe76b38ba06ce7ce00eccfa

File details

Details for the file dedupe-1.6.16-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.16-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 967a214107098d8a6981e6d8df14b1a351c11971ce0a0aceafe19a97fa5c28d3
MD5 2ce128d6af7ea6a332621e63b5e83899
BLAKE2b-256 1796e53f248915572c7d2d5efd0a45e47af6b33ad7a2d2839fa8a08db54dd49e

File details

Details for the file dedupe-1.6.16-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.16-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 f3e03fce05836470f91df3d266916c5023b4130d56a2705ad764066f10780fc4
MD5 d53622d4c05df738166b0e08b57f051f
BLAKE2b-256 1a73c3bf37584698e36edd664750736432e34c553380bcfe2016ee30b01e3a0a
