A library to generate entity fingerprints.

fingerprints

This library helps with the generation of fingerprints for entity data. A fingerprint in this context is understood as a simplified entity identifier, derived from its name or address and used for cross-referencing entities across different datasets.

Usage

import fingerprints

fp = fingerprints.generate('Mr. Sherlock Holmes')
assert fp == 'holmes sherlock'

fp = fingerprints.generate('Siemens Aktiengesellschaft')
assert fp == 'ag siemens'

fp = fingerprints.generate('New York, New York')
assert fp == 'new york'
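
Because equal fingerprints indicate likely matches, they can serve as a join key between datasets. The following is a minimal sketch of that idea, not part of the library: `toy_fingerprint` and `cross_reference` are hypothetical helpers, and `toy_fingerprint` is a crude stand-in that only lowercases, strips punctuation and sorts tokens. In real use one would presumably swap in `fingerprints.generate`.

```python
import re
import unicodedata
from collections import defaultdict


def toy_fingerprint(name):
    """Hypothetical stand-in for fingerprints.generate.

    Lowercases, drops punctuation and sorts the unique tokens.
    This is NOT the library's algorithm, only an illustration.
    """
    text = unicodedata.normalize("NFKC", name).lower()
    tokens = re.findall(r"\w+", text)
    return " ".join(sorted(set(tokens))) or None


def cross_reference(*datasets):
    """Group names from several datasets by their shared fingerprint."""
    groups = defaultdict(list)
    for dataset_id, names in enumerate(datasets):
        for name in names:
            key = toy_fingerprint(name)
            if key is not None:
                groups[key].append((dataset_id, name))
    # Keep only the keys that occur in more than one dataset.
    return {
        key: hits
        for key, hits in groups.items()
        if len({dataset_id for dataset_id, _ in hits}) > 1
    }


# Made-up sample records for illustration.
registry = ["Holmes, Sherlock", "Watson, John H."]
press = ["sherlock HOLMES", "Irene Adler"]
matches = cross_reference(registry, press)
```

Here `matches` holds a single key, `'holmes sherlock'`, pointing at one record from each list. The other names have no counterpart and are dropped.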

Company type names

A significant part of what fingerprints does is to recognize company legal form names. For example, fingerprints will be able to simplify Общество с ограниченной ответственностью to ООО, or Aktiengesellschaft to AG. The required database is based on two different sources:

Wikipedia also maintains an index of types of business entity.
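
To illustrate the kind of replacement described above, here is a minimal sketch that maps long legal form names to their abbreviations. It is an assumption-laden toy, not the library's implementation: `LEGAL_FORMS` contains only the two examples mentioned in this section, `replace_legal_forms` is a hypothetical helper, and the output is lowercased to match the fingerprints shown under Usage.

```python
import re

# Tiny illustrative mapping (an assumption for this sketch);
# the library relies on a much larger database of legal forms.
LEGAL_FORMS = {
    "aktiengesellschaft": "ag",
    "общество с ограниченной ответственностью": "ооо",
}

# Try longer phrases first so multi-word forms win over shorter overlaps.
_PATTERN = re.compile(
    "|".join(
        r"\b" + re.escape(form) + r"\b"
        for form in sorted(LEGAL_FORMS, key=len, reverse=True)
    )
)


def replace_legal_forms(name):
    """Lowercase the name, collapse whitespace, abbreviate known legal forms."""
    lowered = " ".join(name.lower().split())
    return _PATTERN.sub(lambda match: LEGAL_FORMS[match.group(0)], lowered)


short = replace_legal_forms("Siemens  Aktiengesellschaft")
# "Ромашка" is a made-up company name used only for illustration.
short_ru = replace_legal_forms("Общество с ограниченной ответственностью Ромашка")
```

Matching on word boundaries keeps the replacement from firing inside longer words. A real implementation would also need to handle spelling variants and punctuation within the legal form itself.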

See also

  • Clustering in Depth, part of the OpenRefine documentation discussing how to create collisions in data clustering.
  • probablepeople, a parser for Western names made by the brilliant folks at datamade.us.
