A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list of customer information to another list of order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe learns from human-labeled training data and derives the best rules for your dataset, so it can find similar records quickly and automatically, even in very large databases.
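To make the idea of "similar records" concrete, here is a minimal sketch using only the Python standard library. It is not dedupe's API or its algorithm: dedupe learns its rules from labeled examples, whereas this sketch uses a fixed, hand-picked threshold and compares every pair of records. The sample records, the field names, and the helper functions `similarity` and `likely_duplicates` are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; the field names are illustrative only.
records = {
    1: {"name": "Jonathan Smith", "address": "123 Main St"},
    2: {"name": "Jonathon Smith", "address": "123 Main Street"},
    3: {"name": "Maria Garcia", "address": "9 Elm Ave"},
}


def similarity(a, b):
    """Average character-level similarity over the fields of record a (0.0 to 1.0)."""
    scores = [SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio() for f in a]
    return sum(scores) / len(scores)


def likely_duplicates(records, threshold=0.8):
    """Return ID pairs whose similarity meets a hand-picked threshold.

    This compares all pairs, so it only suits small inputs; it is an
    assumption of this sketch, not how dedupe scales to large datasets.
    """
    return [
        (i, j)
        for i, j in combinations(sorted(records), 2)
        if similarity(records[i], records[j]) >= threshold
    ]


pairs = likely_duplicates(records)
print(pairs)
```

Records 1 and 2 differ only by a spelling variant and an abbreviation, which is the kind of "entered slightly differently" case described above. A learned model replaces both the fixed threshold and the equal weighting of fields used here.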

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.7.9.tar.gz (54.3 kB): Source

Built Distributions

  • dedupe-1.7.9-cp36-cp36m-manylinux1_i686.whl (76.5 kB): CPython 3.6m
  • dedupe-1.7.9-cp36-cp36m-macosx_10_11_x86_64.whl (52.6 kB): CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.7.9-cp35-cp35m-manylinux1_i686.whl (76.2 kB): CPython 3.5m
  • dedupe-1.7.9-cp34-cp34m-win_amd64.whl (52.6 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.7.9-cp34-cp34m-win32.whl (51.9 kB): CPython 3.4m, Windows x86
  • dedupe-1.7.9-cp34-cp34m-manylinux1_i686.whl (73.7 kB): CPython 3.4m
  • dedupe-1.7.9-cp27-cp27mu-manylinux1_i686.whl (72.1 kB): CPython 2.7mu
  • dedupe-1.7.9-cp27-cp27m-win_amd64.whl (52.6 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.7.9-cp27-cp27m-win32.whl (51.7 kB): CPython 2.7m, Windows x86
  • dedupe-1.7.9-cp27-cp27m-manylinux1_i686.whl (72.1 kB): CPython 2.7m
  • dedupe-1.7.9-cp27-cp27m-macosx_10_11_x86_64.whl (51.5 kB): CPython 2.7m, macOS 10.11+ x86-64
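The wheel filenames above follow the standard wheel naming convention, in which the project name, version, Python tag, ABI tag, and platform tag are separated by dashes. The sketch below splits a filename into those parts to help pick the right file for a platform. The helper name `parse_wheel_filename` is hypothetical, and the function assumes the five-part form used on this page (no optional build tag).

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components.

    Assumes the five-part form name-version-python-abi-platform.whl,
    that is, no optional build tag and no dashes inside the project name.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename layout: " + filename)
    keys = ("name", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


info = parse_wheel_filename("dedupe-1.7.9-cp36-cp36m-macosx_10_11_x86_64.whl")
print(info)
```

In practice `pip install dedupe` selects a compatible file automatically, so parsing by hand is mainly useful when mirroring or auditing downloads.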

File details

Details for the file dedupe-1.7.9.tar.gz.

File metadata

  • Download URL: dedupe-1.7.9.tar.gz
  • Upload date:
  • Size: 54.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.9.tar.gz
Algorithm Hash digest
SHA256 cc25f1938d33566a04f2cccbf6f8755b78bb6084de7777683e834cf010bb4468
MD5 07d58d0de3e70f44bb85feca0a1d9a82
BLAKE2b-256 a6b092081a950424cfc7c7b7114995049422f64d95a204f4c728a603f692c160

See more details on using hashes here.
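As a sketch of how the published digests can be used, the following computes the SHA256 of a downloaded file with the standard-library `hashlib` module and compares it with the value listed above for dedupe-1.7.9.tar.gz. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for dedupe-1.7.9.tar.gz.
EXPECTED_SHA256 = "cc25f1938d33566a04f2cccbf6f8755b78bb6084de7777683e834cf010bb4468"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_published_hash("dedupe-1.7.9.tar.gz")` on a downloaded copy should return `True` if the file is intact. The same approach works for any wheel by passing its listed SHA256 as `expected`.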

File details

Details for the file dedupe-1.7.9-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.9-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 4292da9cb5df86e8a457993dab4d29d7cc04ac4e29e008b8fec6f6882f10188c
MD5 350e83f018a03fdfc8b1fcc04af52ea8
BLAKE2b-256 4de6c7fe1da97266c6a1a1f08513337135198e5525372014426584a47e6e265b

File details

Details for the file dedupe-1.7.9-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.9-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 b15162b06ec313b7231970fd4fc385a69dfad7c75cdae80f051d4821c65d6fc9
MD5 5c59a5d7e4b0df2d30d052ea4a16a030
BLAKE2b-256 97f2fee57990d0ac2bce78cb8d8a3ce6e7bb1340ffc49b51cce414efc168c3ee

File details

Details for the file dedupe-1.7.9-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.9-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b5d6b26062406a826ec94d6910893f7fd00d0a29e719733b754c7c2b68d5e7cd
MD5 6d0665a50c79d73f0e2dd56b42356293
BLAKE2b-256 1dd045687366fe52ec1c11548e9f7b20fe3720a846ad22f8885e66c2035636b4

File details

Details for the file dedupe-1.7.9-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.9-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 9dc8a3706d1f791b99bd39a8b8f6044b208dd6a17ca816c1d854fb3694da07ba
MD5 4c6a523d58912cd8e3f1aaf124116d56
BLAKE2b-256 eddc1da6db01b3209f8daf75b668e232f5321e3652070eb3c5da1984737c1140

File details

Details for the file dedupe-1.7.9-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.9-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 8503251f546fa79424352c49b632e85ed823b67deea775a6c13487986c2d0009
MD5 b2b39b47f076e5901f4603096d65e62f
BLAKE2b-256 8b6b84a3c0d79ff3c3f02d2e87c8016b3b62b94d7d18d58ae144a899b9cc062e

File details

Details for the file dedupe-1.7.9-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.9-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 03e14e53c3817508529ec2735f12d0f26e99e680e226968963a027d74f76b472
MD5 c1a9d229c7cf7363c754bb94af3c4236
BLAKE2b-256 78ce9e9c93f1f6c3c80702df3e75dc67bfa0d26b82f14c405b04a99081989e71

File details

Details for the file dedupe-1.7.9-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.9-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 e0b074d765a7486ebd07790ec09ddb531930fa3c903ffafa09bd3f552ef7ba78
MD5 8a56c97d32c0d7a09ff45439560517d0
BLAKE2b-256 7da4a61e8baa4d360fb824a05bd8dfe3d7ad1668b097400b226ecc6529ec0d71

File details

Details for the file dedupe-1.7.9-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.9-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 e560b31b4862995d66ecbb043d6989c56349112b7dc0949cda5cb91b4d634860
MD5 bfe70b07e64b465864f7d2c63a1a2192
BLAKE2b-256 c63ea622371a437faf6c5c2a8520e098f03651473c847eaf2988150e4238b864

File details

Details for the file dedupe-1.7.9-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.9-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 95df1b2b0e9044999d464fb71748306164e1f5f236f4ed8a611d7a7377e5298d
MD5 9948d8061dbc7c53e52b1d00d3920110
BLAKE2b-256 045ab3a4942e108820793842c34db58b5e89d69e999822de09f0bfa2da2ae4f2

File details

Details for the file dedupe-1.7.9-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.9-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 e09731f41301d9998a3e734b730e8d0208f7cbfd523b8734cb5c7925076ce898
MD5 168422eb1872fc3693be036a23e1b768
BLAKE2b-256 ce3298df7b1af78ad8543d57b1236b5ce3df4d4884eb253ca724a8d511c711d0

File details

Details for the file dedupe-1.7.9-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.9-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 cd87b13440919437a10339549307f6a828d2b13ba2bd5756979525c9031c1889
MD5 40cf095a1e71a41ebfb030e81799f9f4
BLAKE2b-256 d61867cecc556823d27ae71ffc4985a601e52ef2e42d6b908ceed85dc376e409
