A library to generate entity fingerprints.

fingerprints

This library helps with the generation of fingerprints for entity data. A fingerprint in this context is a simplified entity identifier, derived from the entity's name or address and used to cross-reference entities across different datasets.

Usage

import fingerprints

fp = fingerprints.generate('Mr. Sherlock Holmes')
assert fp == 'holmes sherlock'

fp = fingerprints.generate('Siemens Aktiengesellschaft')
assert fp == 'ag siemens'

fp = fingerprints.generate('New York, New York')
assert fp == 'new york'
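
To illustrate how fingerprints support cross-referencing, here is a minimal sketch that groups names from two lists by a shared key. The `simple_fingerprint` function is a simplified stand-in written for this example, not the library's algorithm: it only strips accents and punctuation, lowercases, and sorts unique tokens, so it will not remove titles such as "Mr." or rewrite company types. The `cross_reference` helper and the sample lists are likewise assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def simple_fingerprint(name):
    """Simplified stand-in for a fingerprint function (not the library's algorithm)."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, then turn every run of non-word characters into a space.
    text = re.sub(r"[^\w]+", " ", text.lower()).replace("_", " ")
    # De-duplicate and sort tokens so word order and repetition do not matter.
    tokens = sorted(set(text.split()))
    return " ".join(tokens) or None


def cross_reference(left, right, key=simple_fingerprint):
    """Group names from two lists that share the same fingerprint."""
    index = defaultdict(lambda: ([], []))
    for name in left:
        fp = key(name)
        if fp is not None:
            index[fp][0].append(name)
    for name in right:
        fp = key(name)
        if fp is not None:
            index[fp][1].append(name)
    # Keep only fingerprints that were seen in both datasets.
    return {fp: pair for fp, pair in index.items() if pair[0] and pair[1]}


# Hypothetical sample data.
registry = ["Holmes, Sherlock", "Siemens AG", "New York, New York"]
watchlist = ["sherlock HOLMES", "Acme Ltd."]
matches = cross_reference(registry, watchlist)
print(matches)
```

Because `cross_reference` takes the key function as a parameter, `fingerprints.generate` could presumably be passed as `key` in place of the stand-in once the library is installed.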

Company type names

A significant part of what fingerprints does is to recognize company legal form names. For example, fingerprints will be able to simplify Общество с ограниченной ответственностью to ООО, or Aktiengesellschaft to AG. The required database is based on two different sources:

Wikipedia also maintains an index of types of business entity.
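
The idea of replacing long-form company types with their abbreviations can be sketched as follows. This is a rough illustration under stated assumptions, not the library's implementation: the `LEGAL_FORMS` mapping is a tiny hand-written table (the library ships its own, much larger database), and `replace_legal_forms` is a hypothetical helper name.

```python
import re

# Hypothetical, hand-written mapping for illustration only.
LEGAL_FORMS = {
    "aktiengesellschaft": "ag",
    "общество с ограниченной ответственностью": "ооо",
    "limited liability company": "llc",
}

# Try longer forms first so multi-word names win over shorter overlaps,
# and require that a match is not embedded inside a longer word.
_PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(
        re.escape(form) for form in sorted(LEGAL_FORMS, key=len, reverse=True)
    )
    + r")(?!\w)"
)


def replace_legal_forms(name):
    """Replace known long-form company types with their abbreviations."""
    # Lowercase and collapse whitespace so the lookup is case-insensitive.
    lowered = " ".join(name.lower().split())
    return _PATTERN.sub(lambda m: LEGAL_FORMS[m.group(1)], lowered)


print(replace_legal_forms("Siemens Aktiengesellschaft"))
```

Running the replacement before tokens are sorted matters in a sketch like this, since multi-word forms can only be matched while the words are still in their original order.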

See also

  • Clustering in Depth, part of the OpenRefine documentation discussing how to create collisions in data clustering.
  • probablepeople, a parser for Western names made by the brilliant folks at datamade.us.

Download files

Source Distribution

fingerprints-1.0.2.tar.gz (12.8 kB)

Built Distribution

fingerprints-1.0.2-py2.py3-none-any.whl (13.2 kB, Python 2 and Python 3)
