A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to a list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from training examples labeled by a person and derives the best matching rules for your dataset, so it can find similar records quickly and automatically, even in very large databases.
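
To make the idea of matching records that were "entered slightly differently" concrete, here is a toy sketch using only Python's standard library. It is not dedupe's API or its algorithm: dedupe learns its rules from labeled examples, whereas this stand-in scores pairs with difflib.SequenceMatcher against a hand-picked threshold. The field names (name, address), the function names, and the 0.85 cutoff are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity score for two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_duplicates(records, threshold=0.85):
    """Return (i, j, score) for every pair of records scoring at or above threshold.

    Each record is a dict with hypothetical "name" and "address" keys.
    The threshold is an arbitrary illustrative value, not a learned one.
    """
    pairs = []
    for (i, left), (j, right) in combinations(enumerate(records), 2):
        left_text = left["name"] + " " + left["address"]
        right_text = right["name"] + " " + right["address"]
        score = similarity(left_text, right_text)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Made-up sample data: the first two rows describe the same person.
records = [
    {"name": "Jane Smith", "address": "12 Main St"},
    {"name": "Jane Smyth", "address": "12 Main St."},
    {"name": "Robert Jones", "address": "400 Oak Avenue"},
]

for i, j, score in likely_duplicates(records):
    print(i, j, round(score, 2))
```

Comparing every pair is quadratic in the number of records, so this approach only suits small lists; it is meant to show what a similarity rule looks like, not how to process a large database.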

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dedupe-1.6.1.tar.gz (47.2 kB)

Uploaded Source

Built Distributions

dedupe-1.6.1-cp35-cp35m-manylinux1_x86_64.whl (73.5 kB)

Uploaded CPython 3.5m

dedupe-1.6.1-cp35-cp35m-manylinux1_i686.whl (70.2 kB)

Uploaded CPython 3.5m

dedupe-1.6.1-cp35-cp35m-macosx_10_11_x86_64.whl (49.4 kB)

Uploaded CPython 3.5m macOS 10.11+ x86-64

dedupe-1.6.1-cp34-cp34m-win_amd64.whl (50.1 kB)

Uploaded CPython 3.4m Windows x86-64

dedupe-1.6.1-cp34-cp34m-win32.whl (49.4 kB)

Uploaded CPython 3.4m Windows x86

dedupe-1.6.1-cp34-cp34m-manylinux1_x86_64.whl (73.7 kB)

Uploaded CPython 3.4m

dedupe-1.6.1-cp34-cp34m-manylinux1_i686.whl (70.4 kB)

Uploaded CPython 3.4m

dedupe-1.6.1-cp27-cp27mu-manylinux1_x86_64.whl (71.4 kB)

Uploaded CPython 2.7mu

dedupe-1.6.1-cp27-cp27mu-manylinux1_i686.whl (68.7 kB)

Uploaded CPython 2.7mu

dedupe-1.6.1-cp27-cp27m-win_amd64.whl (50.2 kB)

Uploaded CPython 2.7m Windows x86-64

dedupe-1.6.1-cp27-cp27m-win32.whl (49.3 kB)

Uploaded CPython 2.7m Windows x86

dedupe-1.6.1-cp27-cp27m-manylinux1_x86_64.whl (71.3 kB)

Uploaded CPython 2.7m

dedupe-1.6.1-cp27-cp27m-manylinux1_i686.whl (68.7 kB)

Uploaded CPython 2.7m

dedupe-1.6.1-cp27-cp27m-macosx_10_11_x86_64.whl (49.0 kB)

Uploaded CPython 2.7m macOS 10.11+ x86-64
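
The wheel filenames above encode the interpreter, ABI, and platform each file targets, following the standard wheel naming convention of name, version, Python tag, ABI tag, and platform tag separated by hyphens. A small sketch of splitting a filename into those parts may help when choosing a file. The names parse_wheel_filename and WheelTags are hypothetical helpers written for this illustration, not part of dedupe or pip.

```python
from collections import namedtuple

# Hypothetical container for the five standard components of a wheel filename.
WheelTags = namedtuple(
    "WheelTags", "name version python_tag abi_tag platform_tag"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename layout: %r" % filename)
    return WheelTags(*parts)


tags = parse_wheel_filename("dedupe-1.6.1-cp35-cp35m-manylinux1_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

In practice pip performs this matching itself and picks a compatible wheel, falling back to the source distribution when none fits.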

File details

Details for the file dedupe-1.6.1.tar.gz.

File metadata

  • Download URL: dedupe-1.6.1.tar.gz
  • Upload date:
  • Size: 47.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.6.1.tar.gz
Algorithm Hash digest
SHA256 b2676abf98c08d3fc41c857304c2b937c5dfab5be6fd64f00175aa2b5614274f
MD5 a9d941179360e6a3fc199d3503f4bec3
BLAKE2b-256 171e9d43d35edf0839fb1f36b40446426a0bb0675fce5cb02cf82caac0814141

See more details on using hashes here.
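
As a minimal sketch of how a published digest can be used, the following computes the SHA256 of a downloaded file with Python's standard hashlib module and compares it with an expected value. The function names and the chunk size are assumptions made for this example, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after saving the source distribution locally, calling matches_published_digest("dedupe-1.6.1.tar.gz", "b2676abf98c08d3fc41c857304c2b937c5dfab5be6fd64f00175aa2b5614274f") should return True if the download is intact. SHA256 is the digest to prefer here; MD5 is listed but is not suitable for security checks.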

File details

Details for the file dedupe-1.6.1-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.1-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a542f6be8bdf6ade95d42603385c407a896d1aac3a835e2a6355b7e0d0b444ac
MD5 94afbf26e9b9b3ea3b5908bcb5818251
BLAKE2b-256 087b9b2c9488bd43489b26c36cfe67e514043c7c00f5405c981c3972f4309591

File details

Details for the file dedupe-1.6.1-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.1-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 bd76232560cc61d0c23050655190ccb3443550dbc9e2cd1f517fa4fbd876a952
MD5 03709b34912eeda944a348026064701e
BLAKE2b-256 ea9b642daff02f84475e58df32079dde77766dfe1590d5dc8b2936a8f5c6995b

File details

Details for the file dedupe-1.6.1-cp35-cp35m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.1-cp35-cp35m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 cf8b9b5171e5da1468cf7081904e8c06cf1cf2d60bd3f323df5e6be9043de753
MD5 a7ccb99928ce75d7bd5eb89e520dfa05
BLAKE2b-256 a24617e956ddef1e3f3ad894191cb25ae8a422d27245eeb37b00c7dfdce58aba

File details

Details for the file dedupe-1.6.1-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.1-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 8eaa1bfdd65e229934711d84f3f95dd1c44d2ed9fe3edb024c573b41c3338ed1
MD5 d2cd6ff63d24dc2904fb0a2d1f1e12b8
BLAKE2b-256 cc1124025c5ef08f66f7a5e0be506d433f5571cc69a4efc6df9775d305e70fe5

File details

Details for the file dedupe-1.6.1-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.6.1-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 3fc9745cda3fd3f80895533f925071f6ada41d461be11c0de61cb5b185d282e8
MD5 f6e67237c7c6689eef08d13881af6b4f
BLAKE2b-256 8fa2a36e03f63d57052537832b9a61524f84d11dcf346d22ae03b94d94a11115

File details

Details for the file dedupe-1.6.1-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.1-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 e98e4865d9cebf38f07d0ab52516106746c4ef511ebd44498427152b54067875
MD5 b2a0b3e054492b879c7de28cb4147213
BLAKE2b-256 9ff7cea61d7483138b9c4cb6906253f4aacae1e5c838b2167ec0c5335092565c

File details

Details for the file dedupe-1.6.1-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.1-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 3f1126e41f6ec3fa7e36db948c811a5d869eb9c712440748d9343906e60f79e7
MD5 e95471f5727352d1fbf499262e9c10d0
BLAKE2b-256 4bf0561f1617fd1335f24257946325cf17c156dbfd826adec87180d439402b28

File details

Details for the file dedupe-1.6.1-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.1-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 39c4a04f5e24f47cba15ab83c4163780f77a67a294587c0d312a40698c7418ed
MD5 5e3244ac1b207fa0a284a8ae026a0d2c
BLAKE2b-256 150277b0fa4e04e1529fd7159b0849634086a7b637fd729d50aba680fd4e2f93

File details

Details for the file dedupe-1.6.1-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.1-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 fa950186d449248f43194c6021bc65530fbe90899a434cbb8b9ebe34d5942829
MD5 f088c176fe5477efcc848b270012d78e
BLAKE2b-256 2f8fefc5fe802eb16d6a782e7a347fd1e4b8ac9bda551b541d70c76f02ed5986

File details

Details for the file dedupe-1.6.1-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.6.1-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 44c3949a9e7b22dfeb0aef43c7e121d7e94942451f5950bc4af2f6c0381858f4
MD5 f38f13319441b41efa69a5e884432f8d
BLAKE2b-256 77405e67ed72e4f2389ec17f8a0b043a047971144471d2aaf6faac9997d4c089

File details

Details for the file dedupe-1.6.1-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.6.1-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 416c0c513b655e539886f90c6d11c49d181ffe4281e1566a5da0838c9ecb9d10
MD5 1d11e481a3c5f14566874b44a1b3d6f8
BLAKE2b-256 23143770563bfedca549afe711b0a1f3f8d1a9f52b5cbd2345ac06d4c83a8f43

File details

Details for the file dedupe-1.6.1-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.6.1-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 ed5c38f14f8dd7c5d560f74c6d2c89b11e6b64899134d57d8f99116d8a1d6357
MD5 34d60ad17b600585529411306b529714
BLAKE2b-256 958c1a4df9e9604ded682f72d909ae97865b62f2c16e87e4688d1f5199a68788

File details

Details for the file dedupe-1.6.1-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.6.1-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 4fadca8f0f1c8825e1cd3f331b8ceda165aba9f9d4e645bb89b63b77275500a2
MD5 eb2edbfd28a86c3dd05f518373d96510
BLAKE2b-256 11f2cc3f9fd4de19a2ab3c5976d1f395974783caf49ef0853113a9e6b9d1abb6

File details

Details for the file dedupe-1.6.1-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.6.1-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 0b8135762ec74f5a5e24de3472d3e51f7c1829f2f438c285063758766238c793
MD5 106bdc98bff73fa8360d5f7031ae0b29
BLAKE2b-256 1697004f9ec75d26d2aa3dc20d9c005f9f8c4e79693972bb9e3f9fff5dde36f5
