A library to generate entity fingerprints.

fingerprints

This library helps generate fingerprints for entity data. A fingerprint in this context is a simplified entity identifier, derived from the entity's name or address and used to cross-reference entities across different datasets.

Usage

import fingerprints

fp = fingerprints.generate('Mr. Sherlock Holmes')
assert fp == 'holmes sherlock'

fp = fingerprints.generate('Siemens Aktiengesellschaft')
assert fp == 'ag siemens'

fp = fingerprints.generate('New York, New York')
assert fp == 'new york'
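
To illustrate how such keys support cross-referencing, here is a minimal sketch that matches names from two lists whenever their keys collide. The names `toy_key` and `cross_reference` are hypothetical and not part of the library. `toy_key` is a deliberately naive stand-in (lower-case, strip punctuation, sort unique tokens) so that the example runs without any dependency; it is not how fingerprints computes its result.

```python
import re
from typing import Callable, Iterable, List, Optional, Tuple


def toy_key(name: str) -> Optional[str]:
    """Naive stand-in key (an assumption, not the library's algorithm)."""
    # Case-fold, keep only word tokens, then sort the unique tokens.
    tokens = re.findall(r"\w+", name.casefold())
    return " ".join(sorted(set(tokens))) or None


def cross_reference(
    left: Iterable[str],
    right: Iterable[str],
    key: Callable[[str], Optional[str]] = toy_key,
) -> List[Tuple[str, str]]:
    """Return (left_name, right_name) pairs whose keys are identical."""
    index = {}
    for name in right:
        k = key(name)
        if k:  # skip names that produce no usable key
            index.setdefault(k, []).append(name)

    matches = []
    for name in left:
        k = key(name)
        if k:
            matches.extend((name, other) for other in index.get(k, []))
    return matches


registry = ["Holmes, Sherlock", "Watson, John"]
leak = ["sherlock HOLMES", "Irene Adler"]
print(cross_reference(registry, leak))
```

With the library installed, passing `key=fingerprints.generate` should work in place of `toy_key`. Judging by the examples above, the library also strips titles such as "Mr." and abbreviates legal forms, which the toy key does not attempt.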

Company type names

A significant part of what fingerprints does is to recognize company legal form names. For example, fingerprints can simplify Общество с ограниченной ответственностью to ООО, or Aktiengesellschaft to AG. The required database is based on two different sources:

Wikipedia also maintains an index of types of business entity.
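
The idea of replacing long legal forms with short ones can be sketched as a lookup-driven substitution. The three-entry table and the function name `abbreviate_legal_forms` below are hypothetical and chosen for illustration only; the library's actual database is far larger and its matching logic may differ.

```python
import re

# Hypothetical, hand-picked mapping (long form -> short form), all lower-case.
LEGAL_FORMS = {
    "aktiengesellschaft": "ag",
    "общество с ограниченной ответственностью": "ооо",
    "limited liability company": "llc",
}

# Try longer forms first and match whole words only, so that a form
# embedded in a longer word is left untouched.
_ALTERNATIVES = "|".join(
    re.escape(form) for form in sorted(LEGAL_FORMS, key=len, reverse=True)
)
_PATTERN = re.compile(r"(?<!\w)(?:" + _ALTERNATIVES + r")(?!\w)")


def abbreviate_legal_forms(name: str) -> str:
    """Case-fold the name and replace known long legal forms with short ones."""
    lowered = name.casefold()
    return _PATTERN.sub(lambda match: LEGAL_FORMS[match.group(0)], lowered)


print(abbreviate_legal_forms("Siemens Aktiengesellschaft"))
print(abbreviate_legal_forms("Acme Limited Liability Company"))
```

Case-folding before the lookup keeps the table small, at the cost of returning a lower-cased name. That matches the lower-case fingerprints shown in the usage examples.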

See also

  • Clustering in Depth, part of the OpenRefine documentation discussing how to create collisions in data clustering.
  • probablepeople, parser for western names made by the brilliant folks at datamade.us.
