A library to generate entity fingerprints.

fingerprints

This library helps generate fingerprints for entity data. A fingerprint in this context is a simplified entity identifier, derived from its name or address and used for cross-referencing entities across different datasets.

Usage

import fingerprints

fp = fingerprints.generate('Mr. Sherlock Holmes')
assert fp == 'holmes sherlock'

fp = fingerprints.generate('Siemens Aktiengesellschaft')
assert fp == 'ag siemens'

fp = fingerprints.generate('New York, New York')
assert fp == 'new york'
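
The cross-referencing use case can be sketched as follows. This is a minimal, self-contained illustration: `toy_fingerprint` and `cross_reference` are hypothetical helpers written for this example, and `toy_fingerprint` is a simplified stand-in (lowercase, strip punctuation, de-duplicate and sort tokens), not the library's actual algorithm.

```python
import re
import unicodedata
from collections import defaultdict


def toy_fingerprint(name):
    """Hypothetical stand-in for fingerprints.generate.

    Lowercases the name, keeps only word tokens, then de-duplicates
    and sorts them. This is NOT the library's real algorithm.
    """
    text = unicodedata.normalize("NFKC", name).lower()
    tokens = re.findall(r"\w+", text)
    return " ".join(sorted(set(tokens))) or None


def cross_reference(left, right, key=toy_fingerprint):
    """Return (left_name, right_name) pairs that share a fingerprint."""
    # Index the right-hand dataset by fingerprint.
    index = defaultdict(list)
    for name in right:
        fp = key(name)
        if fp is not None:
            index[fp].append(name)

    # Look up every left-hand name in that index.
    matches = []
    for name in left:
        fp = key(name)
        for other in index.get(fp, []):
            matches.append((name, other))
    return matches


# Invented sample data for illustration only.
registry = ["Holmes, Sherlock", "Watson, John H."]
other_source = ["sherlock HOLMES", "Irene Adler"]
matches = cross_reference(registry, other_source)
print(matches)
```

To use the library instead of the stand-in, pass `key=fingerprints.generate`; the matching logic stays the same.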

Company type names

A significant part of what fingerprints does is to recognize company legal form names. For example, fingerprints will be able to simplify Общество с ограниченной ответственностью to ООО, or Aktiengesellschaft to AG. The required database is based on two different sources.

Wikipedia also maintains an index of types of business entity.
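
The idea of replacing long legal form names with their abbreviations can be illustrated with a small sketch. It assumes a hypothetical two-entry table (`LEGAL_FORMS`) and a hypothetical helper (`simplify_legal_form`); the library's own database and matching rules are more extensive and may work differently.

```python
import re

# Hypothetical two-entry table for illustration; not the library's database.
LEGAL_FORMS = {
    "Общество с ограниченной ответственностью": "ООО",
    "Aktiengesellschaft": "AG",
}

# Case-insensitive lookup from the long form to its abbreviation.
_LOOKUP = {name.casefold(): short for name, short in LEGAL_FORMS.items()}

# One alternation of all long forms, longest first, matched on word boundaries.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(
        re.escape(name) for name in sorted(LEGAL_FORMS, key=len, reverse=True)
    )
    + r")\b",
    re.IGNORECASE,
)


def simplify_legal_form(name):
    """Replace any known long legal form in `name` with its abbreviation."""
    return _PATTERN.sub(
        lambda m: _LOOKUP.get(m.group(0).casefold(), m.group(0)), name
    )


print(simplify_legal_form("Siemens Aktiengesellschaft"))
```

Sorting the alternatives longest first is a precaution for tables where one legal form name is a prefix of another; with only these two entries it makes no difference.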

See also

  • Clustering in Depth, part of the OpenRefine documentation discussing how to create collisions in data clustering.
  • probablepeople, parser for western names made by the brilliant folks at datamade.us.

Download files

Source Distribution

fingerprints-1.2.0.tar.gz (15.9 kB view details)

Uploaded Source

Built Distribution

fingerprints-1.2.0-py2.py3-none-any.whl (16.7 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file fingerprints-1.2.0.tar.gz.

File metadata

  • Download URL: fingerprints-1.2.0.tar.gz
  • Size: 15.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for fingerprints-1.2.0.tar.gz
Algorithm Hash digest
SHA256 c2522f66bc98f0afc61282edab65e57e39d8e4ec5dc6b0cf9fe2f1592ac57af3
MD5 ec24a14c653cdfa11fc617f6599fda52
BLAKE2b-256 26cb6bb585f97394d81770922b7de9ee3cc295cd90abe472ce78fb0bd9d5defb

File details

Details for the file fingerprints-1.2.0-py2.py3-none-any.whl.

File hashes

Hashes for fingerprints-1.2.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 7cf5a7bb8982b4172a70f490ba1d9495ee998be2b5aa10640119e379dcced044
MD5 e422f14588cd0295a34c8844413fca47
BLAKE2b-256 1f8c02227b2bbb9d250d626449a316134e145c0d0c42ac85ff79adb4439bfea5
