A Python library for accurate and scalable data deduplication and entity resolution

Project description

dedupe is a library that uses machine learning to perform deduplication and entity resolution quickly on structured data. dedupe is the open-source engine for dedupe.io.

dedupe will help you:

  • remove duplicate entries from a spreadsheet of names and addresses

  • link a list with customer information to another with order history, even without unique customer IDs

  • take a database of campaign contributions and figure out which ones were made by the same person, even if the names were entered slightly differently for each record

dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.
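
To make the idea of "names entered slightly differently" concrete, here is a toy sketch using only the Python standard library. It is not dedupe's method or API: dedupe learns its matching rules from human-labeled training data, whereas this sketch assumes a fixed, hand-picked threshold and compares every pair of records. The function names, sample records, and threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def candidate_duplicates(records, fields, threshold=0.8):
    """Yield (id_a, id_b, score) for pairs whose mean field similarity
    meets the threshold. The threshold is a hand-picked assumption."""
    for (id_a, rec_a), (id_b, rec_b) in combinations(records.items(), 2):
        score = sum(similarity(rec_a[f], rec_b[f]) for f in fields) / len(fields)
        if score >= threshold:
            yield id_a, id_b, score


# Hypothetical sample data: records 1 and 2 describe the same person.
records = {
    1: {"name": "Jonathan Smith", "address": "12 Oak Street"},
    2: {"name": "Jonathon Smith", "address": "12 Oak St."},
    3: {"name": "Maria Garcia", "address": "400 Pine Avenue"},
}

pairs = list(candidate_duplicates(records, ["name", "address"]))
for id_a, id_b, score in pairs:
    print(f"records {id_a} and {id_b} look similar (score {score:.2f})")
```

Comparing every pair grows quadratically with the number of records, which is one reason a library that learns which records are worth comparing matters for very large databases.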


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dedupe-1.7.7.tar.gz (50.5 kB): Source

Built Distributions

  • dedupe-1.7.7-cp36-cp36m-manylinux1_x86_64.whl (77.1 kB): CPython 3.6m
  • dedupe-1.7.7-cp36-cp36m-manylinux1_i686.whl (73.7 kB): CPython 3.6m
  • dedupe-1.7.7-cp36-cp36m-macosx_10_11_x86_64.whl (52.0 kB): CPython 3.6m, macOS 10.11+ x86-64
  • dedupe-1.7.7-cp35-cp35m-manylinux1_x86_64.whl (76.9 kB): CPython 3.5m
  • dedupe-1.7.7-cp35-cp35m-manylinux1_i686.whl (73.5 kB): CPython 3.5m
  • dedupe-1.7.7-cp34-cp34m-win_amd64.whl (52.6 kB): CPython 3.4m, Windows x86-64
  • dedupe-1.7.7-cp34-cp34m-win32.whl (51.9 kB): CPython 3.4m, Windows x86
  • dedupe-1.7.7-cp34-cp34m-manylinux1_x86_64.whl (77.0 kB): CPython 3.4m
  • dedupe-1.7.7-cp34-cp34m-manylinux1_i686.whl (73.7 kB): CPython 3.4m
  • dedupe-1.7.7-cp27-cp27mu-manylinux1_x86_64.whl (74.8 kB): CPython 2.7mu
  • dedupe-1.7.7-cp27-cp27mu-manylinux1_i686.whl (72.1 kB): CPython 2.7mu
  • dedupe-1.7.7-cp27-cp27m-win_amd64.whl (52.7 kB): CPython 2.7m, Windows x86-64
  • dedupe-1.7.7-cp27-cp27m-win32.whl (51.8 kB): CPython 2.7m, Windows x86
  • dedupe-1.7.7-cp27-cp27m-manylinux1_x86_64.whl (74.8 kB): CPython 2.7m
  • dedupe-1.7.7-cp27-cp27m-manylinux1_i686.whl (72.1 kB): CPython 2.7m
  • dedupe-1.7.7-cp27-cp27m-macosx_10_11_x86_64.whl (51.6 kB): CPython 2.7m, macOS 10.11+ x86-64
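
If you are unsure which built distribution matches your interpreter, the wheel filename itself encodes the answer: it follows the pattern `{name}-{version}-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a filename into those parts. The helper name `parse_wheel_filename` is hypothetical, and the sketch assumes filenames without the optional build tag, which is true of every wheel listed here. In practice, pip makes this choice for you.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its five dash-separated components.

    Assumes no optional build tag is present (an assumption of this sketch).
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 components, got {len(parts)}: {filename!r}")
    return WheelTags(*parts)


tags = parse_wheel_filename("dedupe-1.7.7-cp36-cp36m-manylinux1_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

Here `cp36` means CPython 3.6, `cp36m` is the matching ABI, and `manylinux1_x86_64` is the target platform.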

File details

Details for the file dedupe-1.7.7.tar.gz.

File metadata

  • Download URL: dedupe-1.7.7.tar.gz
  • Upload date:
  • Size: 50.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dedupe-1.7.7.tar.gz
Algorithm Hash digest
SHA256 bf94e55fa982ab9642e52d409a1174f3529aafaa8da51d49da4e9ea243dea544
MD5 6f5f85c9e2c6bc72cca8092098cba532
BLAKE2b-256 556ef034cf6cd390a4073c3b71cc9303d0f058e72ecdd31301c0abcddf7550db

See more details on using hashes here.
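
As a minimal sketch of how a published digest can be checked, the snippet below computes the SHA256 of a downloaded file with Python's standard `hashlib` module and compares it with the expected value. The helper names are hypothetical, and the local path in the usage note assumes the file was saved in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("dedupe-1.7.7.tar.gz", "bf94e55fa982ab9642e52d409a1174f3529aafaa8da51d49da4e9ea243dea544")` should return `True` for an intact copy of the source distribution.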

File details

Details for the file dedupe-1.7.7-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.7-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 4fbeec311e966241196b02d8b5fafe73c574d81823dc056fa83848d2566e76a8
MD5 24c87d66faabd63a6e0ba51de73bb67b
BLAKE2b-256 3ce969f773e15ee778649e8da49033445f2ebb74cac2b06d14a0ba966a732ae5

File details

Details for the file dedupe-1.7.7-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.7-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 0f5a784f905e972717440e71f388b938c7aa41c495bad9e5db1baec748323146
MD5 a4a452bae07fa7095b2c67875f88de2a
BLAKE2b-256 98711da2c7c0be895d3eb15f0dd1027f337412420e5cc31b3b0f8059065210bc

File details

Details for the file dedupe-1.7.7-cp36-cp36m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.7-cp36-cp36m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 ebee4960f178adc4cea858380362ee5d7c499a8f2242bd7b60054ce0605bf245
MD5 9416c02f89b1c63d013e587034a15b8a
BLAKE2b-256 8a97c21a445088a63256321548d5c04d3ba18261c97e836050bd8b1ca32221c8

File details

Details for the file dedupe-1.7.7-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.7-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1fe8d39efd6d3fa41e65eae3f8783af0b7d07a451318e80f34dd59cd758468f7
MD5 e2fa51e56d91efa1fec96fef5d503504
BLAKE2b-256 fe068042c9a672a936abe6ecd5ef8152f41a24c3b5b77a457e7305c950d16721

File details

Details for the file dedupe-1.7.7-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.7-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 547afe26566301792fe194ad996c870a11c31ccff53255a7622bd49d8a717514
MD5 ff062c6a146d27547f17fc8ada5b979f
BLAKE2b-256 afba34f7a951246967278e8d83ca23fad8f7642366ab337e1cb4ada31b6351ea

File details

Details for the file dedupe-1.7.7-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.7-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 fc94469b41b853d2b02833ff21b96d0e726fe078b330da5ddff7e340f5a02ac5
MD5 028f2b4595ced75371f7a769a370783d
BLAKE2b-256 30bac84266530abb086a3df8c10f9fe7b092c00f98bd020bc973685d9f32c5ef

File details

Details for the file dedupe-1.7.7-cp34-cp34m-win32.whl.

File hashes

Hashes for dedupe-1.7.7-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 cde87004c317d840462878c3dca788a08f9ba92f9b87b4fc6c1331c4b31d438b
MD5 cdd5a147231e9db7c7e61453a1ab9ae8
BLAKE2b-256 87018903877fba4a88aa38a34518151d7f6e786d28af31e179df2a2c5ef67259

File details

Details for the file dedupe-1.7.7-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.7-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 5e410919269744ecedc4198a1656fa37710b4de88b1f2a8c1a82d3277994e1a5
MD5 bc03492ae85f612d736980f19b786c98
BLAKE2b-256 25f5b47e07a21a0b55b9952783fb8ac988fe644b709d7f00a1c45af7127415db

File details

Details for the file dedupe-1.7.7-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.7-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 8ab218ff114dbcf9e66dccc1aab4fea955cdce0135c76dccd9c7fcfa8fcc76d0
MD5 bd96fdbc2faeff53145b358c0e967726
BLAKE2b-256 5034731e339deaa84c81a4490e4bc4c93fe52162502b84ce4a42877826c1f4d5

File details

Details for the file dedupe-1.7.7-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.7-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 fc14ed452b3afa899c6aba561d13a1eb19dd3025a55527741c69afd436512e52
MD5 e524bb72c705f720ec0718a13f9eb67d
BLAKE2b-256 4cb549101de614d9d8a680e281b6108f7e63b3d5d2d14745205d6ffa5fd8008a

File details

Details for the file dedupe-1.7.7-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.7-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 2545523ddc5912b101299b9ed20696d656631b9284d81e2f8a5d921370b24d97
MD5 c8e73c6205e6b9d52ed807d21ed8582a
BLAKE2b-256 b5fd5f8b05cc47639e776e7aa8a516827dc49f69560635bbf0f3bf24c0882a40

File details

Details for the file dedupe-1.7.7-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for dedupe-1.7.7-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 8c80beadea1f1251dd3d9c7fcbde98edb0169f4c1421ae91c4b90103488ee0c5
MD5 8857d7aa7a7403c13ef3055ca2edce77
BLAKE2b-256 d8c38389bf3c5b10e9a9db1e053300d7f1119450a43afb83afc08fd7222abdce

File details

Details for the file dedupe-1.7.7-cp27-cp27m-win32.whl.

File hashes

Hashes for dedupe-1.7.7-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 708918d35d755370b570af41e08572d53d348f8964ad2f242fec07aae7191d02
MD5 df3ac68bca4412d4f58ede43d3c9af93
BLAKE2b-256 f6c7fc4a4bec650118df245873e38638bcb51756e89d04e4c94a66db1ff8f998

File details

Details for the file dedupe-1.7.7-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for dedupe-1.7.7-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6655842a4fe3fd803283d69d7ae08143ceb63fb3187032a4895a2f510273b379
MD5 42c57952539a4eb65ab93e2b6e021e00
BLAKE2b-256 656de94fb0db461316366d2328d92bb8c2a542b2769d143e20fad2f2155aa36e

File details

Details for the file dedupe-1.7.7-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for dedupe-1.7.7-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 bc1449e777f8918e00ea0245cd01c3c100f17c8c4d3f7763891ff5eb2c39a9f0
MD5 772797290702e4e952535fc46cb73a8c
BLAKE2b-256 11b7b2a2bc52766ceb834ea0924ad0e359b7b0ec793cdd279b4dee68870bf1a8

File details

Details for the file dedupe-1.7.7-cp27-cp27m-macosx_10_11_x86_64.whl.

File hashes

Hashes for dedupe-1.7.7-cp27-cp27m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 357a0010c857aafd8e2f5ddeabd4ba0469f4c3cc73e1eded2dd1875d3618b3ea
MD5 f0eb2252488199f5829a13f6619dc78d
BLAKE2b-256 31f33799a846e05d2296c4503f40bc9be52831b5787efa5b9556ae968571c0de
