A library to generate entity fingerprints.

fingerprints

This library helps with the generation of fingerprints for entity data. A fingerprint in this context is understood as a simplified entity identifier, derived from its name or address and used for cross-referencing of entities across different datasets.
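To make the idea of a "simplified entity identifier" concrete, here is a minimal sketch using only the standard library. It is not the library's implementation: `toy_fingerprint` is a hypothetical helper that lowercases the text, drops punctuation, removes duplicate tokens and sorts what remains. The real `fingerprints.generate` does more than this, for example handling titles and company types.

```python
import re
import unicodedata


def toy_fingerprint(text):
    """Hypothetical helper for illustration; not part of the fingerprints API."""
    if text is None:
        return None
    # Normalise Unicode compatibility forms and ignore letter case.
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace every run of non-word characters (punctuation, whitespace) with a space.
    text = re.sub(r"\W+", " ", text)
    # Drop duplicate tokens and sort them, so word order no longer matters.
    tokens = sorted(set(text.split()))
    return " ".join(tokens) or None
```

Because tokens are de-duplicated and sorted, "Sherlock Holmes" and "Holmes, Sherlock" yield the same key under this sketch, which is the property that makes a fingerprint usable for cross-referencing.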

Usage

import fingerprints

fp = fingerprints.generate('Mr. Sherlock Holmes')
assert fp == 'holmes sherlock'

fp = fingerprints.generate('Siemens Aktiengesellschaft')
assert fp == 'ag siemens'

fp = fingerprints.generate('New York, New York')
assert fp == 'new york'
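
Fingerprints are mainly useful as a key for finding records that refer to the same entity. The following sketch groups names whose keys collide. `group_by_key` is a hypothetical helper, not part of the library; it takes the key function as an argument, so `fingerprints.generate` can be passed in when the package is installed.

```python
from collections import defaultdict


def group_by_key(names, make_key):
    """Group names whose keys collide; names without a key are skipped.

    Hypothetical helper for illustration; `make_key` is any callable that
    maps a name to a string key (or None), such as fingerprints.generate.
    """
    groups = defaultdict(list)
    for name in names:
        key = make_key(name)
        if key is not None:
            groups[key].append(name)
    return dict(groups)
```

Calling `group_by_key(names, fingerprints.generate)` on the combined names of two datasets returns a dictionary in which every list with more than one entry is a candidate match worth reviewing.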

Company type names

A significant part of what fingerprints does is to recognize company legal form names. For example, fingerprints can simplify Общество с ограниченной ответственностью to ООО, or Aktiengesellschaft to AG. The required database is based on two different sources:

Wikipedia also maintains an index of types of business entity.
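
The replacement of legal forms can be pictured as a lookup from long names to short ones. The sketch below is an assumption-laden illustration, not the library's code: `TOY_TYPES` is a hypothetical two-entry table built from the examples above, whereas the library ships its own, much larger database, and `toy_replace_types` is a made-up name. The sketch also lowercases its output, which the library may or may not do at this stage.

```python
import re

# Hypothetical, tiny mapping for illustration only; the real database is far larger.
TOY_TYPES = {
    "aktiengesellschaft": "ag",
    "общество с ограниченной ответственностью": "ооо",
}

# Try longer forms first so multi-word names win over any shorter overlap.
_FORMS = sorted(TOY_TYPES, key=len, reverse=True)
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(f) for f in _FORMS) + r")\b")


def toy_replace_types(text):
    """Replace known long legal forms with their short form (lowercased)."""
    text = text.lower()
    return _PATTERN.sub(lambda match: TOY_TYPES[match.group(1)], text)
```

Matching on word boundaries keeps the replacement from firing inside longer words, and lowercasing first means the table only needs one spelling per form.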

See also

  • Clustering in Depth, part of the OpenRefine documentation discussing how to create collisions in data clustering.
  • probablepeople, parser for western names made by the brilliant folks at datamade.us.

