A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
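To make the problem concrete, here is a toy sketch of flagging near-duplicate records with a fixed string-similarity threshold. It is not dedupe's API or algorithm: dedupe learns its matching rules from labeled training data, whereas this sketch uses the standard library's `difflib` and a hand-picked threshold. The records, the `similarity` and `candidate_duplicates` helpers, and the value `0.8` are all illustrative assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Made-up records for illustration only.
records = {
    1: "Jonathan Smith, 12 Main Street",
    2: "Jonathon Smith, 12 Main St.",
    3: "Maria Garcia, 4 Elm Road",
}


def similarity(a, b):
    """Ratio in [0, 1]; higher means the two strings are more alike."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(rows, threshold=0.8):
    """Return (id, id, score) for every pair scoring at or above the threshold."""
    pairs = []
    for (id_a, a), (id_b, b) in combinations(sorted(rows.items()), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


for id_a, id_b, score in candidate_duplicates(records):
    print(id_a, id_b, round(score, 2))
```

Comparing every pair grows quadratically with the number of records, and a single threshold has to be tuned by hand for each dataset, which is the kind of manual work a trained approach is meant to avoid.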

This version

1.6.3

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.6.3.tar.gz (47.2 kB): Source

Built Distributions

  • dedupe-1.6.3-cp35-cp35m-manylinux1_x86_64.whl (73.6 kB): CPython 3.5m
  • dedupe-1.6.3-cp35-cp35m-manylinux1_i686.whl (70.3 kB): CPython 3.5m
  • dedupe-1.6.3-cp35-cp35m-macosx_10_11_x86_64.whl (49.5 kB): CPython 3.5m, macOS 10.11+ x86-64
  • dedupe-1.6.3-cp34-cp34m-win_amd64.whl (50.2 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.6.3-cp34-cp34m-win32.whl (49.5 kB): CPython 3.4m, Windows x86
  • dedupe-1.6.3-cp34-cp34m-manylinux1_x86_64.whl (73.8 kB): CPython 3.4m
  • dedupe-1.6.3-cp34-cp34m-manylinux1_i686.whl (70.5 kB): CPython 3.4m
  • dedupe-1.6.3-cp27-cp27mu-manylinux1_x86_64.whl (71.5 kB): CPython 2.7mu
  • dedupe-1.6.3-cp27-cp27mu-manylinux1_i686.whl (68.8 kB): CPython 2.7mu
  • dedupe-1.6.3-cp27-cp27m-win_amd64.whl (50.3 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.6.3-cp27-cp27m-win32.whl (49.5 kB): CPython 2.7m, Windows x86
  • dedupe-1.6.3-cp27-cp27m-manylinux1_x86_64.whl (71.5 kB): CPython 2.7m
  • dedupe-1.6.3-cp27-cp27m-manylinux1_i686.whl (68.9 kB): CPython 2.7m
  • dedupe-1.6.3-cp27-cp27m-macosx_10_11_x86_64.whl (49.1 kB): CPython 2.7m, macOS 10.11+ x86-64
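The wheel filenames above follow the standard wheel naming pattern of name, version, Python tag, ABI tag, and platform tag, separated by dashes. As a rough sketch of how to read them, the hypothetical helper `split_wheel_filename` below splits a filename into those five fields; it assumes the optional build tag is absent, which holds for every file listed here.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "name version python abi platform")


def split_wheel_filename(filename):
    """Split a wheel filename into its five dash-separated fields.

    Assumes there is no optional build tag between version and Python tag.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("expected 5 fields (no build tag): %r" % filename)
    return WheelTags(*parts)


tags = split_wheel_filename("dedupe-1.6.3-cp35-cp35m-manylinux1_x86_64.whl")
print(tags.python, tags.abi, tags.platform)  # cp35 cp35m manylinux1_x86_64
```

Platform tags such as `macosx_10_11_x86_64` use underscores internally, so splitting on dashes keeps them in one piece.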

File details

Details for the file dedupe-1.6.3.tar.gz.

File metadata

  • Download URL: dedupe-1.6.3.tar.gz
  • Upload date:
  • Size: 47.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.3.tar.gz
Algorithm Hash digest
SHA256 2b788e7b8446b85b1c8b497f5adc6020a941bea722ebfe05cfd371b30cbc7b5e
MD5 4188a368252d8d17b2fab0bec23188ae
BLAKE2b-256 6f18a3223accfa82a007b69cd01c0c7055f2f5d92b3e88208b227e575d7f5033

See more details on using hashes here.
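The published digests can be used to check a downloaded file before installing it. Below is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of` and `matches_published_hash`, and the assumption that the archive sits in the current directory, are illustrative and not part of dedupe or PyPI tooling.

```python
import hashlib
import hmac

# SHA256 digest published above for dedupe-1.6.3.tar.gz
EXPECTED_SHA256 = "2b788e7b8446b85b1c8b497f5adc6020a941bea722ebfe05cfd371b30cbc7b5e"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 equals the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

With the archive downloaded locally, `matches_published_hash("dedupe-1.6.3.tar.gz")` should return `True` for an intact file. Reading in chunks keeps memory use flat regardless of file size.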

File details

Details for the file dedupe-1.6.3-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.3-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 436965e66e512db02668836e7ec1397a417d884ab3c42102ee14798b298a407d
MD5 f23cd24934aee18ac2bfae6927bf2212
BLAKE2b-256 88fc8d68fdbb4f4f763a62a626a81de30200b940fda4ea4c8a7daad01f9491b9

File details

Details for the file dedupe-1.6.3-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.3-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 24b71d87de3970795d35f1ac8448de57d77c121b324374753afa6d91f3db79ec
MD5 d71c99b3d11628267c9900ee5a23113d
BLAKE2b-256 05a57ab411d6b2aa1cfd00b72f5c3aeb0c8283fb7d25c425f352d12d85550a2c

File details

Details for the file dedupe-1.6.3-cp35-cp35m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.3-cp35-cp35m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 a1964e1d305c567871a85117b3c0361bb8a0102e5585e830035bbdda78728268
MD5 766b83b23ea098e20479d39c84aadd00
BLAKE2b-256 6345c4b40209797b088f1cf91e50cd611da14e443cfb87129360127a3a5eaccd

File details

Details for the file dedupe-1.6.3-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.3-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 5fba3b9a6f9e05cce2a2d38dcfb77207072878707b72098fc7a696d8a855ad3e
MD5 35f495be8e47db062964e9e4fb1d1443
BLAKE2b-256 4acae05ad21be52bf772d33c08bc04a185e0d1180bfd7a61248bbe8b3e608c1a

File details

Details for the file dedupe-1.6.3-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.3-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 b66771c6c49645f0ca8f0019cbb82c6e1547f936efcd7946be5563fb7a808f5c
MD5 adca467581677f7964ae429aa1cb645b
BLAKE2b-256 6698ecbb0b050ea8f33d884fea8fe2a5fa99b338c7e5653918cd0e3f35f0cb42

File details

Details for the file dedupe-1.6.3-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.3-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 db85d03813882588e2f6a441b19aea289021de904469f0389114983d858e3320
MD5 69dfe539cb8c95fce6c3e290d7d5490e
BLAKE2b-256 811a48f03975657100d488edd23445243505444a40c8b0f5238825810b8e28cc

File details

Details for the file dedupe-1.6.3-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.3-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 f4684bd293d53f53201b00636aa1baeea80d12ee30dc2f1f727513c64acc38b2
MD5 98bb0cdc3311a1acf9b93cc5a83dcc0e
BLAKE2b-256 3f80d41cb9584967f5d4ce52e856410ae2393e363ec9ed3f2b91adc3218b44af

File details

Details for the file dedupe-1.6.3-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.3-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 4e7ce979712d7be6695f460f430a75f799f59ebb40eb7c58019d2a7639d563ad
MD5 9f09093a5a3d9390ad6cede2d7fe6b54
BLAKE2b-256 83201dd8db531b6fcdae0f907d2216501fafaa0c43b9b922536b0d7826cda8fd

File details

Details for the file dedupe-1.6.3-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.3-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 0b146caf5e2a56ad7f0996049fc406fe05ab1767dae00fdabb07cc83330e5d3e
MD5 278e062f10d53c41b00aeede0f9942a4
BLAKE2b-256 d43e2686b894250729031c9462e89f111eb069da065b51e59b999e2f83ec9569

File details

Details for the file dedupe-1.6.3-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.3-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 6d2f87766d3a153f2cb05eb1876123ea543f010ff492440112bfcce3a4f55fdb
MD5 220fed7c6e6e2a25ca9684078f6a2030
BLAKE2b-256 a9c47043cbb1e6b6b745018cf00fe2c6da590c058bacb16e00d518e37f4932f9

File details

Details for the file dedupe-1.6.3-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.3-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 40797bc3902fe291717181196ecd449ef0bc32b4418c6b6ec20a0f57f7639425
MD5 b8aba3495b30d6b89aa2ccda69b8f684
BLAKE2b-256 a69ed3c457da79bd1feab6ac4cd19865b435e2bd25f4a053a305510f861021ea

File details

Details for the file dedupe-1.6.3-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.3-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a03eac11b482933b00b6b3e34bc5d721dc6915ff7146170e357ad00b95a16201
MD5 4affe387e9df9f20443ade5c1a5f40ab
BLAKE2b-256 1919d0abc9930b92d312009c930bdb7e984979d80ee4ab512db48c771783c53f

File details

Details for the file dedupe-1.6.3-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.3-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 3bd1cdbf189a48877b2eb34f79d559c22386402d8416f872c69e1217cd59003c
MD5 a125aa3791780dcfb0201ea3bfbe1486
BLAKE2b-256 601d610c103d3c85248c836fcac0558bae290701c49ef4188aac6d8ab518aeb2

File details

Details for the file dedupe-1.6.3-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.3-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 c166fab46f1e0c1393b47b11746c15739b5e8d1310847a2b46dfdeb3cd92f9da
MD5 47f34cd12d8ced5199071c15e87169b2
BLAKE2b-256 2b2f226140696e99676e85c7b958bd687d99247cb27885fbe624436969fd10df
