A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
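
To make the problem concrete, here is a minimal sketch of flagging records whose fields were entered slightly differently. It is not dedupe's API and not its method: dedupe learns its matching rules from human-labeled examples, whereas this sketch uses only the standard library's `difflib` with a fixed, hand-picked threshold. The records and the `similarity` and `likely_duplicates` helpers are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records; the names and addresses are made up for illustration.
records = {
    1: {"name": "Robert Smith", "address": "12 Main Street"},
    2: {"name": "Robert Smyth", "address": "12 Main St"},
    3: {"name": "Alice Jones", "address": "98 Oak Avenue"},
}


def similarity(a, b):
    """Average string similarity across the fields of two records (0.0 to 1.0)."""
    scores = [
        SequenceMatcher(None, a[field].lower(), b[field].lower()).ratio()
        for field in a
    ]
    return sum(scores) / len(scores)


def likely_duplicates(records, threshold=0.8):
    """Return ID pairs whose similarity meets a fixed threshold (an assumption)."""
    return [
        (i, j)
        for i, j in combinations(sorted(records), 2)
        if similarity(records[i], records[j]) >= threshold
    ]


pairs = likely_duplicates(records)
print(pairs)
```

Comparing every pair grows quadratically with the number of records, and a single fixed threshold rarely suits every field. Those are the two pain points that learned rules are meant to address.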

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.7.4.tar.gz (49.6 kB): Source

Built Distributions

  • dedupe-1.7.4-cp36-cp36m-manylinux1_x86_64.whl (76.0 kB): CPython 3.6m
  • dedupe-1.7.4-cp36-cp36m-manylinux1_i686.whl (72.7 kB): CPython 3.6m
  • dedupe-1.7.4-cp36-cp36m-macosx_10_11_x86_64.whl (50.9 kB): CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.7.4-cp35-cp35m-manylinux1_x86_64.whl (75.8 kB): CPython 3.5m
  • dedupe-1.7.4-cp35-cp35m-manylinux1_i686.whl (72.5 kB): CPython 3.5m
  • dedupe-1.7.4-cp34-cp34m-win_amd64.whl (51.5 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.7.4-cp34-cp34m-win32.whl (50.8 kB): CPython 3.4m, Windows x86
  • dedupe-1.7.4-cp34-cp34m-manylinux1_x86_64.whl (76.0 kB): CPython 3.4m
  • dedupe-1.7.4-cp34-cp34m-manylinux1_i686.whl (72.6 kB): CPython 3.4m
  • dedupe-1.7.4-cp27-cp27mu-manylinux1_x86_64.whl (73.7 kB): CPython 2.7mu
  • dedupe-1.7.4-cp27-cp27mu-manylinux1_i686.whl (71.0 kB): CPython 2.7mu
  • dedupe-1.7.4-cp27-cp27m-win_amd64.whl (51.6 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.7.4-cp27-cp27m-win32.whl (50.8 kB): CPython 2.7m, Windows x86
  • dedupe-1.7.4-cp27-cp27m-manylinux1_x86_64.whl (73.7 kB): CPython 2.7m
  • dedupe-1.7.4-cp27-cp27m-manylinux1_i686.whl (71.0 kB): CPython 2.7m
  • dedupe-1.7.4-cp27-cp27m-macosx_10_11_x86_64.whl (50.5 kB): CPython 2.7m, macOS 10.11+ x86-64
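
If it is unclear which built distribution matches a given platform, the wheel filename itself encodes the answer. The sketch below splits a filename into its parts, following the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper `parse_wheel_filename` is hypothetical and written for illustration; in practice pip selects a compatible wheel automatically.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its naming-convention components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: %r" % filename)

    parts = filename[: -len(suffix)].split("-")
    # Five parts normally; six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename layout: %r" % filename)

    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build_tag": parts[2] if len(parts) == 6 else None,
        "python_tag": python_tag,      # e.g. cp36 means CPython 3.6
        "abi_tag": abi_tag,            # e.g. cp36m
        "platform_tag": platform_tag,  # e.g. manylinux1_x86_64
    }


info = parse_wheel_filename("dedupe-1.7.4-cp36-cp36m-manylinux1_x86_64.whl")
print(info["python_tag"], info["abi_tag"], info["platform_tag"])
```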

File details

Details for the file dedupe-1.7.4.tar.gz.

File metadata

  • Download URL: dedupe-1.7.4.tar.gz
  • Upload date:
  • Size: 49.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.4.tar.gz
Algorithm Hash digest
SHA256 4a3bc7b47f9ead408604ac41abc0b5c0a6ce691aede40742643930533fdd152c
MD5 f1178d6898b1b9550acd2d8dba0c2578
BLAKE2b-256 4cb467726ca47507b0e5635d03b39668192a666c8511e1ac05727c69e368acf3

See more details on using hashes here.
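
One way to use the published digests is to recompute the SHA256 of a downloaded file and compare it with the value listed above. The following is a minimal sketch using only the standard library's `hashlib`. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local path passed to them is assumed to be wherever the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved in the current directory, `matches_published_hash("dedupe-1.7.4.tar.gz", "4a3bc7b47f9ead408604ac41abc0b5c0a6ce691aede40742643930533fdd152c")` should return `True` for an intact download. pip can also enforce hashes itself through its hash-checking mode.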

File details

Details for the file dedupe-1.7.4-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.4-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 fb3e71f4cf542e1be915c852b01402b5e3be9148cddefb80b55cf22a59566971
MD5 25613247ab747a244f6ee8e2bbf6a72d
BLAKE2b-256 5eadfc32744cd9f4f8020f62d167240c14f347a5dc82866def87ba8756849808

File details

Details for the file dedupe-1.7.4-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.4-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 54fdaad4180c34d2300a2aaece6bbb6794eda7f09ddbbd136e064d582b285238
MD5 ab71a2997291736a4f166688336f4e00
BLAKE2b-256 ac4bbd1b2a6ada13cd2d9c4864a0373e00a4d20eb6e1e0bccbe7e838a6cf2c6f

File details

Details for the file dedupe-1.7.4-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.4-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 13d84ccb798bfaf73dfaac22a7757fbe028bbb30b0fb502a48f1c7d25c59aeae
MD5 ed572a07857cb6bfe673f306b6fb0f6f
BLAKE2b-256 a9c54ad65482dd816da70773eccb27b0f2617f42a9cbc1f73e61b0490f484fea

File details

Details for the file dedupe-1.7.4-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.4-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 60bfb7916c38ec1b5ff90de524e86cf92239d3200e20c0c2a6be785e3f609edf
MD5 f451d7f547e546628447cf413b2dff40
BLAKE2b-256 ee55bf11a7fcabd3cb2099935dc1d0e7d9a00814820a347a1088e89d8bc30b44

File details

Details for the file dedupe-1.7.4-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.4-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 a878c2e760ba966abc0d0ef0b1888fbd34d477ed25888ab01e299a548e87342a
MD5 fd4b17e6e061c9519d2bd9a0a3451d27
BLAKE2b-256 ae9d621e1c76008f84d967b269d34b2cd053f39431d187a2cbca8747ad336bf4

File details

Details for the file dedupe-1.7.4-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.4-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 b7fc4ee4de4fd8f4c59bbcba9f2e6171f79206e9e651c4383561eac8e3892682
MD5 d0b52444085fe176b6ab70b3b7d9c3dd
BLAKE2b-256 27cb50269da6a16956f14e7a14b7b951edb8e745f92a025c0169999cc0c865a4

File details

Details for the file dedupe-1.7.4-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.4-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 f248ad4135cc303a3a9b5aff08340ee2f15c8e31e8d3a439a9e572cd57bfc9da
MD5 118d09fab116a796e20022dc32a7290e
BLAKE2b-256 752b9d7668fb1fbbcf3002c01f3186b60254d75ad78fb8d4f088338fdf484295

File details

Details for the file dedupe-1.7.4-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.4-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 b082c2edcab7120b0bd902f95778b3ba153331aeed573fa528f013a43402d294
MD5 12bf32431363631b5131e3e8030986a3
BLAKE2b-256 925f69e0417e73aab279e87a4dce2af835caf149b4291f92999d0ddad6e1ac8c

File details

Details for the file dedupe-1.7.4-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.4-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b65f947171a7be67a588c9caf702a3b59c7e95bc3cb98ab64a03dc9d43465963
MD5 f64f3e7ebd86736c97d43e4ddd6bb00a
BLAKE2b-256 3efd5955d2d33fcfbbc093ff6816284637fee60b414bca4dfcff8db195e6dc8d

File details

Details for the file dedupe-1.7.4-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.4-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 e70d6a7e0d808a50c380d694465e61c2711dd8d88a65f9870a9d051d420ff342
MD5 e0e1559828cd9accf713d21dd43a3140
BLAKE2b-256 c192b3aa9d6caf343a02ba913c55c54b8e9540cba9fdb5f07f0f37c46b93afa6

File details

Details for the file dedupe-1.7.4-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.4-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 6a6f505713ef1d678d64fc8bc859a94eb29b56be00aff0185ad5fc474a0a6e79
MD5 12eaf1dafc43269be40e5f3236323d6b
BLAKE2b-256 beed3abde754c963423b3423feb3bb4188e51be0b6428ecfccf2baa8f51916b7

File details

Details for the file dedupe-1.7.4-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.4-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 f5fa9963cd627ec5f36102064b9fdee6dd71c5c51e23a3291da8e676a921c2ce
MD5 7a69b3550cf029f09c48b20ba7ca7898
BLAKE2b-256 25fd07c0f19ad5d0ac5931d97c105a4a01463c41c1663b2305b668b0ced67ccc

File details

Details for the file dedupe-1.7.4-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.4-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 033c17960fb859dd6e0d04c00fe118ae517e789d00bd05fbfab52145572aed8a
MD5 784aaf0fa6fd65292fbfd273929ae56e
BLAKE2b-256 0ef2321b67d54c6a2256f774c84c8e0a1d3a6a437dcfb0cf5987981280b00b5b

File details

Details for the file dedupe-1.7.4-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.4-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 eba60a46cb04c342dce3913afde1fe9cdb50cb7937575cd85b849ea9fb6688f2
MD5 c707ac1bf519223ef33a9bfe57df16f8
BLAKE2b-256 5c1496502801a33d7f116b10ef3e653198c4c741efb893248ebcdbbdb1113e0e

File details

Details for the file dedupe-1.7.4-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.4-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 def05b0e736f24fb8fb5b20b8f4980b97fb8201f54abfcbc1a22f8fa063e9576
MD5 8962b9f6e828ad59fcafe72e887ae2da
BLAKE2b-256 c7314ed44f65b89f76bfdf67ce517e5fb1796722b3f9c916c6fb26fef4a2d8c9

File details

Details for the file dedupe-1.7.4-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.4-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 fcddc5e4fd6b5729c438e88a574157f1e65f03f0e142eef9d0d4ad829e854ee5
MD5 0b4b171936c5b785910617c2c2f76e40
BLAKE2b-256 f8142511f32b8a9c18abe8b7f903d02883eae0c4c5ade2a354e65469621e0f08
