A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human-labeled training data and learns the best rules for your dataset, so that it can find similar records quickly and automatically, even in very large databases.
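To make the idea of "similar records" concrete, here is a minimal sketch that flags likely duplicate names using only the Python standard library. It is not dedupe's API or its algorithm: dedupe learns its rules from training data, whereas this sketch uses a fixed, hand-picked threshold. The function names, the `name` field, the sample records, and the `0.85` threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def candidate_duplicates(records, threshold=0.85):
    """Yield (id_a, id_b, score) for record pairs whose names look alike.

    `records` maps a record id to a dict with a "name" key (hypothetical schema).
    The threshold is hand-picked here; dedupe would learn its rules instead.
    """
    for (id_a, rec_a), (id_b, rec_b) in combinations(sorted(records.items()), 2):
        score = similarity(rec_a["name"], rec_b["name"])
        if score >= threshold:
            yield id_a, id_b, round(score, 2)


# Illustrative data only: two spellings of one name and an unrelated name.
records = {
    1: {"name": "Jonathan Smith"},
    2: {"name": "Jonathon Smith"},
    3: {"name": "Maria Garcia"},
}
pairs = list(candidate_duplicates(records))
```

With this sample data, only records 1 and 2 are paired. Comparing every pair grows quadratically with the number of records, so a brute-force approach like this one does not extend to very large databases.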

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.8.2.tar.gz (54.4 kB): source

Built Distributions

  • dedupe-1.8.2-cp36-cp36m-manylinux1_x86_64.whl (78.4 kB): CPython 3.6m
  • dedupe-1.8.2-cp36-cp36m-manylinux1_i686.whl (74.7 kB): CPython 3.6m
  • dedupe-1.8.2-cp36-cp36m-macosx_10_12_x86_64.whl (51.9 kB): CPython 3.6m, macOS 10.12+ x86-64
  • dedupe-1.8.2-cp35-cp35m-manylinux1_x86_64.whl (78.2 kB): CPython 3.5m
  • dedupe-1.8.2-cp35-cp35m-manylinux1_i686.whl (74.5 kB): CPython 3.5m
  • dedupe-1.8.2-cp34-cp34m-win_amd64.whl (52.7 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.8.2-cp34-cp34m-win32.whl (52.1 kB): CPython 3.4m, Windows x86
  • dedupe-1.8.2-cp34-cp34m-manylinux1_x86_64.whl (78.3 kB): CPython 3.4m
  • dedupe-1.8.2-cp34-cp34m-manylinux1_i686.whl (74.7 kB): CPython 3.4m
  • dedupe-1.8.2-cp27-cp27mu-manylinux1_x86_64.whl (75.5 kB): CPython 2.7mu
  • dedupe-1.8.2-cp27-cp27mu-manylinux1_i686.whl (72.6 kB): CPython 2.7mu
  • dedupe-1.8.2-cp27-cp27m-win_amd64.whl (52.6 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.8.2-cp27-cp27m-win32.whl (51.8 kB): CPython 2.7m, Windows x86
  • dedupe-1.8.2-cp27-cp27m-manylinux1_x86_64.whl (75.5 kB): CPython 2.7m
  • dedupe-1.8.2-cp27-cp27m-manylinux1_i686.whl (72.6 kB): CPython 2.7m
  • dedupe-1.8.2-cp27-cp27m-macosx_10_12_intel.whl (57.3 kB): CPython 2.7m, macOS 10.12+ intel
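
The built distribution filenames above follow the standard wheel naming convention: distribution name, version, an optional build tag, Python tag, ABI tag, and platform tag, separated by hyphens. As a rough sketch of how to read them, the helper below splits a filename into those parts. The names `WheelTags` and `parse_wheel_filename` are hypothetical and not part of dedupe or pip.

```python
from collections import namedtuple

# Hypothetical container for the hyphen-separated parts of a wheel filename.
WheelTags = namedtuple("WheelTags", "name version build python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        # No build tag present, which is the case for every file listed here.
        name, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected number of fields in %r" % filename)
    return WheelTags(name, version, build, python, abi, platform)


tags = parse_wheel_filename("dedupe-1.8.2-cp36-cp36m-manylinux1_x86_64.whl")
```

Here `tags.python` is `cp36` (CPython 3.6), `tags.abi` is `cp36m`, and `tags.platform` is `manylinux1_x86_64`. An installer uses these tags to pick the file that matches the running interpreter and operating system.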

File details

Details for the file dedupe-1.8.2.tar.gz.

File metadata

  • Download URL: dedupe-1.8.2.tar.gz
  • Upload date:
  • Size: 54.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.8.2.tar.gz
Algorithm Hash digest
SHA256 7bb61fa1b59130ec2b9e4b98f108bbd46945331e24a1cc607373375072e791f4
MD5 bc66ec12c60174d970e3be88a1d4092a
BLAKE2b-256 fd0c68531b91e4787f0ff5847352ddcdef906e2496e15b55c87f3a67174867e0

See more details on using hashes here.
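
As an illustration of how a published digest can be used, the sketch below computes the SHA256 of a downloaded file with Python's `hashlib` and compares it to an expected hex string. The function names and the local file path are assumptions; the digest to compare against is whichever one is listed for the file that was downloaded.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, after downloading the source distribution to a hypothetical local path, `matches_published_hash("dedupe-1.8.2.tar.gz", "7bb61fa1b59130ec2b9e4b98f108bbd46945331e24a1cc607373375072e791f4")` should return `True` if the file is intact. pip can also enforce digests itself through its hash-checking mode.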

File details

Details for the file dedupe-1.8.2-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.2-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a35d2b1e9f38e9a291bdc5f01ae94baa7cf513aedf717b15ac7173bd54b48e70
MD5 e102057a7d7297a7a2b05d4aa01c91a0
BLAKE2b-256 ab71ba4fda42ef99f49ccfc485ef335b62d0190af7450e62976f73e8a51bf52a

File details

Details for the file dedupe-1.8.2-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.2-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 4f2bdf9d4d1301582778e1deeb35c1ce15e0bbc357990c29183e95a3a1c62449
MD5 7663872f9e34d60c5f375b25ca1a6097
BLAKE2b-256 f6e82ef5a8519b371ef5c039f3465e5b0187dcf126e57614ef62274a9ce2d659

File details

Details for the file dedupe-1.8.2-cp36-cp36m-macosx_10_12_x86_64.whl.

File hashes

Hashes for dedupe-1.8.2-cp36-cp36m-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 708f8a2665449cd75567e5662e84342c0ee33d29d155d648e53b59e03c947ab6
MD5 c833dbef0400db8af42db2662fe72a38
BLAKE2b-256 49e03866703ab16db7ec4c395c3be0c8a67b5a98323683dec4d1ee672b55a8f5

File details

Details for the file dedupe-1.8.2-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.2-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 e7669495e940978208b7ecb37e82b3b2e7b211af12d1e1ae642145cfa23aaaa8
MD5 a1e3cee23b27479a428bf854f7dea960
BLAKE2b-256 9a78239b285494caef50864e3c3418f052d742df5a7e1a9277708874fb894b85

File details

Details for the file dedupe-1.8.2-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.2-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 c82ff7f92df7af08e4588accf1f1432268add4d5fcbf69519cd7f0f09bbcfccf
MD5 b6827c10a8a676cc54ed7ccc7b25e4dd
BLAKE2b-256 f1b2849964f99308e4b0c38c04cdde17e97245ff0f47d71af3e4ed61e7ae6e5d

File details

Details for the file dedupe-1.8.2-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.8.2-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 cdbff8aa3b83d8555c8a15ab508aac1be7f98e033c4440587533bb1025e88040
MD5 9ea2e3dfdf9e9de2a9d8477309f0a9eb
BLAKE2b-256 9b489960ca1b56d357bcb9529f5b2509a1d6237b9c13a008e63b54aa4c5550d1

File details

Details for the file dedupe-1.8.2-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.8.2-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 bff7b59fa31b39db4e2e1840c732bb04c19ed9b81511f5cc7d5b6083c6e0d04e
MD5 bded1b04c7f7667912f448aa7fbf5879
BLAKE2b-256 05d0e12ad0e79e2dab867896fd2fe36994f8b5e1e6bbacb5a73c7ae074fdf150

File details

Details for the file dedupe-1.8.2-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.2-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6f9047c33f431032ecb77b8514a9896efa64a8edab87bdef3db4ad828f112fad
MD5 f1aaedeee4a31bc79cae53eec1e38e50
BLAKE2b-256 a4fb3f52e5576dc960fc9dfb93cf09b0b3ccd73b7b659799aafb713e4c26dcb6

File details

Details for the file dedupe-1.8.2-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.2-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 3814e77a8b3c48b2c88d1cd086c8443e6225e8a7949d55ec1e2738d1452afde0
MD5 69ee493a8e3277cdacead15df1aff562
BLAKE2b-256 53e10842c174ed79b1ad32f03067cb42db81c24ae29642079ab7f9e307c6a74f

File details

Details for the file dedupe-1.8.2-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.2-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 f4c50518a8f1e80a1b317e2715c9d75a3233d82743873551f5b315f8ecaee743
MD5 d2677e1ef101ee147e971844c01a9f8d
BLAKE2b-256 d1a120e9dcd2753ac1f9445f748c93edd55e5bcbd7f3b40ad6e338765c29fc2f

File details

Details for the file dedupe-1.8.2-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.2-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 be838fcb7db2a913a0b0f3b5a2e9ff6e859ef57430f7f71b4af4ea540279cb4d
MD5 536ebf1b998864e93b8fb37503a8f61c
BLAKE2b-256 76fb5df6f768747b5bc667780d1f3a5901a7dc18a68bd5ac319d45275d74cc9e

File details

Details for the file dedupe-1.8.2-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.8.2-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 a5c0db30d8285480c7bb82ee31f6ddaa2f60fce2fb9bdad5e03ee61e8092719f
MD5 02367453b949b245a719b2b209cee9af
BLAKE2b-256 44f6adcc4588d4e0a576a7b12cac544837d5794efa739995c6973f5db293e0fc

File details

Details for the file dedupe-1.8.2-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.8.2-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 5ac9b4dc6d8268f64861fa483fb14da1f32a07e6312ff5bdeb035ed39e9d049c
MD5 f39da2c69f9d7ac4739bd39e891fb696
BLAKE2b-256 bc2decc1561c13db4084b88240a26053fb0a3840f8c6227dbf9fe394fba03071

File details

Details for the file dedupe-1.8.2-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.8.2-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 792d96001732fac5878dd03ca143d73d8d6ce2cc3f135c8aadb92160e702a094
MD5 40dc4b58e828d0bc0ae8b041a513261d
BLAKE2b-256 633c6f0f9bc8bab2276945658e6cdbb28f3932049d741e0d7362673e2362a653

File details

Details for the file dedupe-1.8.2-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.8.2-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 5530bbefbcb609f3f67c2c0d327f4ce6b1f2504fb3825d18b65100d20ef7f79d
MD5 5e4abe4753b026bf290537fdef1c0e70
BLAKE2b-256 f72d3ad9212a8a30cc911c70f47841d309ae09f405d7f4e72d9d2dc184da2c0e

File details

Details for the file dedupe-1.8.2-cp27-cp27m-macosx_10_12_intel.whl.

File hashes

Hashes for dedupe-1.8.2-cp27-cp27m-macosx_10_12_intel.whl
Algorithm Hash digest
SHA256 c6a367fdec93890efa351bfba4d58c341ac2ee5c1ba5e6ec9e2dfedf507aa1cd
MD5 10a639c1085db28b8551e8716d1f9c85
BLAKE2b-256 bb4c8eb636b90544e589178f87028b1a9a0e27a8c0339e70969e032377961ae1
