A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. It is the open source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human-labeled training data and learns the best rules for your dataset, so that it can find similar records quickly and automatically, even in very large databases.
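To make the idea of "finding similar records" concrete, here is a minimal sketch that uses only the Python standard library. It is not dedupe's algorithm or API: dedupe learns its matching rules from labeled examples, whereas this toy version assumes a hand-picked similarity threshold and compares every pair of records, which would not scale to large datasets. The function names, the field names (`name`, `address`), and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 string similarity ratio, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def find_duplicates(records, threshold=0.85):
    """Group record ids whose 'name' and 'address' fields are both similar.

    `records` maps a record id to a dict with 'name' and 'address' keys.
    The threshold is a hand-picked assumption, not a learned value.
    """
    # Union-find parent table: every record starts in its own cluster.
    parent = {rid: rid for rid in records}

    def find(rid):
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]  # path halving
            rid = parent[rid]
        return rid

    # Naive all-pairs comparison; fine for a demo, quadratic in general.
    for a, b in combinations(records, 2):
        name_score = similarity(records[a]["name"], records[b]["name"])
        addr_score = similarity(records[a]["address"], records[b]["address"])
        if min(name_score, addr_score) >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for rid in records:
        clusters.setdefault(find(rid), []).append(rid)
    # Only clusters with more than one member contain duplicates.
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)


# Hypothetical sample data: records 1 and 2 differ only by small typos.
records = {
    1: {"name": "Jon Smith", "address": "123 Main St"},
    2: {"name": "John Smith", "address": "123 Main St."},
    3: {"name": "Alice Jones", "address": "9 Elm Ave"},
}
duplicates = find_duplicates(records)
print(duplicates)
```

With the sample data above, records 1 and 2 land in the same cluster and record 3 stays on its own. Picking the threshold and the fields to compare by hand is exactly the step that a learned approach is meant to replace.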

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.9.0.tar.gz (55.7 kB): Source

Built Distributions

  • dedupe-1.9.0-cp36-cp36m-manylinux1_x86_64.whl (79.7 kB): CPython 3.6m
  • dedupe-1.9.0-cp36-cp36m-manylinux1_i686.whl (76.1 kB): CPython 3.6m
  • dedupe-1.9.0-cp36-cp36m-macosx_10_12_x86_64.whl (53.2 kB): CPython 3.6m, macOS 10.12+ x86-64
  • dedupe-1.9.0-cp35-cp35m-manylinux1_x86_64.whl (79.5 kB): CPython 3.5m
  • dedupe-1.9.0-cp35-cp35m-manylinux1_i686.whl (75.9 kB): CPython 3.5m
  • dedupe-1.9.0-cp34-cp34m-win_amd64.whl (54.1 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.9.0-cp34-cp34m-win32.whl (53.4 kB): CPython 3.4m, Windows x86
  • dedupe-1.9.0-cp34-cp34m-manylinux1_x86_64.whl (79.7 kB): CPython 3.4m
  • dedupe-1.9.0-cp34-cp34m-manylinux1_i686.whl (76.0 kB): CPython 3.4m
  • dedupe-1.9.0-cp27-cp27mu-manylinux1_x86_64.whl (76.9 kB): CPython 2.7mu
  • dedupe-1.9.0-cp27-cp27mu-manylinux1_i686.whl (73.9 kB): CPython 2.7mu
  • dedupe-1.9.0-cp27-cp27m-win_amd64.whl (54.0 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.9.0-cp27-cp27m-win32.whl (53.1 kB): CPython 2.7m, Windows x86
  • dedupe-1.9.0-cp27-cp27m-manylinux1_x86_64.whl (76.9 kB): CPython 2.7m
  • dedupe-1.9.0-cp27-cp27m-manylinux1_i686.whl (73.9 kB): CPython 2.7m
  • dedupe-1.9.0-cp27-cp27m-macosx_10_12_intel.whl (58.7 kB): CPython 2.7m, macOS 10.12+ intel

File details

Details for the file dedupe-1.9.0.tar.gz.

File metadata

  • Download URL: dedupe-1.9.0.tar.gz
  • Upload date:
  • Size: 55.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.9.0.tar.gz
Algorithm Hash digest
SHA256 422a5069018aab0a5f3bdbb8643369ba16eaf99afcf4ed212de553ece1456c27
MD5 bafe3097ee9796b1bdde96c605359a91
BLAKE2b-256 082178d26bda5c12d70c86a4eed3a2ddc1feeac05d023fa8ffc8aecb2b36a05c

See more details on using hashes here.
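As an illustration of how a published digest can be used, the sketch below computes the SHA256 of a local file with Python's `hashlib` and compares it with the value listed above for dedupe-1.9.0.tar.gz. The helper names and the assumption that the archive sits in the current directory are hypothetical; this is one simple way to check a download, not a description of how pip or PyPI verify files.

```python
import hashlib
import hmac

# SHA256 digest listed on this page for dedupe-1.9.0.tar.gz.
EXPECTED_SHA256 = "422a5069018aab0a5f3bdbb8643369ba16eaf99afcf4ed212de553ece1456c27"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# Example call, assuming the archive was downloaded to the current directory:
#     matches_published_hash("dedupe-1.9.0.tar.gz", EXPECTED_SHA256)
```

Reading in chunks keeps memory use flat for large files. Of the three algorithms listed, SHA256 is the usual choice for this kind of check, since MD5 is no longer considered collision resistant.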

File details

Details for the file dedupe-1.9.0-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 ee7f419ac86ab9ecc876c55b65a79a026623056ba09317d6d9cc42db475b1a9d
MD5 94627c721c5a655ad60f27846bd11ce1
BLAKE2b-256 0739e510c4a0745af2a0dd4d3de72958f1639b682be3c28807491c52b98669c3

File details

Details for the file dedupe-1.9.0-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.0-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 8573df6abe3bbed6e7a152951fe110285dd964d720f9878ce44a3490098a1997
MD5 12a4d3b8bc511c35822a6eac1a8ac13d
BLAKE2b-256 75c29e2ca8a7f9352ae546fa8ddd0f2a86ebf1058e18b1e8b9748c438154ba8d

File details

Details for the file dedupe-1.9.0-cp36-cp36m-macosx_10_12_x86_64.whl.

File hashes

Hashes for dedupe-1.9.0-cp36-cp36m-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 19184cd18619e8257f365ca597c0dd9ad17b6f6057cb5121fca4238d65ae393b
MD5 70e5d95f2daf52509fc081e2b514a12b
BLAKE2b-256 00bbe0b2a5bc94caebfc9ae62c20c6f3f3723e08e2541d36fc1f905df6c22e18

File details

Details for the file dedupe-1.9.0-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.0-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 88b86d5b49905aca1465c3defae5e62adaa57aad5e9a7e8c9cc2e5a13ab52b50
MD5 229a2e00656edf18f0c8d16160edc318
BLAKE2b-256 7c50dace7b678ad95f8f4b81052d635082d80489e59d303df84e60fd4c592d20

File details

Details for the file dedupe-1.9.0-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.0-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 e29cf03bc2ee32485d5cb568bd9015e2ebb585f6c64aa7f1ea5abd34a007b733
MD5 e840f010c47c3a51c83fc5bd09cdca4d
BLAKE2b-256 936efdb705f0a310fd67a047c5c08f6ed710bf6b5b6c558d6978731e9f0930da

File details

Details for the file dedupe-1.9.0-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.9.0-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 e7a207a10dfe7d07bb3bd1090a5595ffb1d7bda303db982e540af18a162502b2
MD5 c53d388490c5c32cbe841c27cf53785f
BLAKE2b-256 56466fbe1ff5259ba9b1779377a70b6f8472663327ab3ef439eec30e2d646623

File details

Details for the file dedupe-1.9.0-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.9.0-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 7e67b5d653585ce331cad23b8626232f4c600f97845916652f9e10be0062f54d
MD5 c2e1be6d4d9a531cb9809ba89c8793d1
BLAKE2b-256 ca9285ad9bb7f18a3399924f47def774cb1aa9264306c7e4c0cb3dcc180f261a

File details

Details for the file dedupe-1.9.0-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.0-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 ce52c6933b8993e47d3aa76840e5dc30952a6942efb63e35214338e15b6ae5d9
MD5 40d7ed199897b01c784b6b0ad28ec2f0
BLAKE2b-256 adde9530fa742fa6bed6f084240fb5c5f38a9b535a6fa5cd70051e9d7db0b4f4

File details

Details for the file dedupe-1.9.0-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.0-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 a59d25273a1b4032151585f29d50dcce1bd14b0d3adf52b59b014a1a49c6fc0f
MD5 afbf88b78277f07e0792009957abdce0
BLAKE2b-256 7043ae990feef3d8388c3d195d3f409df0a6ce00779cb79cf6ee6949887c7662

File details

Details for the file dedupe-1.9.0-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.0-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 db3ba2e9efad19d18ad1fe3bbde0f90a37905ac819c113a3391975fa458878fe
MD5 d4dfd5dbe2711825c529674c0c72d94e
BLAKE2b-256 18ab8012749b0173e61e7ec82eb05e1c5e0800360a94fe93ae4ecf3782af19c4

File details

Details for the file dedupe-1.9.0-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.0-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 7f0b74088e6523d5a626866998a9b2d6b7b32d33b3ac50e518941c1949417090
MD5 ebf8d5e3d2f70993a488124ac710a0e8
BLAKE2b-256 f8e5417d90af94b0b3917b90965a61e5dd53dfc9e171c7f0fa5f9b3c5fdb323e

File details

Details for the file dedupe-1.9.0-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.9.0-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 58620cabf00ffa01607b0443ae3b97bba6d591851dc364d7b11b26c54866ba88
MD5 646962b3c5fad81f6f24fdfb3a18e2bd
BLAKE2b-256 877d945dc4d5870f56cf83ca2049a9c5ce919afb0249c7087ea4ea626d66cc81

File details

Details for the file dedupe-1.9.0-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.9.0-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 80d799ec423b4d0967caeb914e723955595d97ba9133e8b6db047496d4719ada
MD5 12a2fbd017cff1d0035bc0b873be28ce
BLAKE2b-256 4920c1ef023dc7d62a8bdda530ff9d667e802646f6dbef09c3ca27a03c660f6b

File details

Details for the file dedupe-1.9.0-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.9.0-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 8f44cae71a171b4101e2604800ea67d4fba2d6cf17e45de376d64e01b92c4928
MD5 072b7d1ce01c2a09e66cde3a1a5b1b7f
BLAKE2b-256 83a43a323de5f0a82c69743ec9a80ec3d9c17a02e976167cdd3500539bc10589

File details

Details for the file dedupe-1.9.0-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.9.0-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 5af923e644d0865f6d7c5bc35a91a9ebce9b1ce2f338f064c289423501f6c56a
MD5 d54d2ceb8ee6803a99078fc30ef9d4d8
BLAKE2b-256 0bc1c12c5c07f4a628c0baf77211649679f5e548d7dcbf09d3fb04e34d398c26

File details

Details for the file dedupe-1.9.0-cp27-cp27m-macosx_10_12_intel.whl.

File hashes

Hashes for dedupe-1.9.0-cp27-cp27m-macosx_10_12_intel.whl
Algorithm Hash digest
SHA256 2b5d1c8beafe5e1264a34bf9b5473046e452f19e4648819d769ebe5eb3917b80
MD5 3786d1fa58acc7f2232754628c85fc77
BLAKE2b-256 b4a16641206ba87c44c590fe23e6c5de75d9d1f008e31a8af1761138db6ee682
