Parse romanized names & companies using advanced NLP methods

Project description

probablepeople is a Python library for parsing unstructured romanized name or company strings into labeled components, using conditional random fields.

From the Python interpreter:

>>> import probablepeople
>>> probablepeople.parse('Mr George "Gob" Bluth II')
[('Mr', 'PrefixMarital'),
 ('George', 'GivenName'),
 ('"Gob"', 'Nickname'),
 ('Bluth', 'Surname'),
 ('II', 'SuffixGenerational')]
>>> probablepeople.parse('Sitwell Housing Inc')
[('Sitwell', 'CorporationName'),
 ('Housing', 'CorporationName'),
 ('Inc', 'CorporationLegalType')]
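
The parse output above is a list of (token, label) pairs, so a multi-word component such as the corporation name arrives as separate tokens. A minimal sketch of collapsing that list into one string per label follows. The helper name `group_by_label` is hypothetical and not part of probablepeople; it works on the list shown above rather than calling the library.

```python
from collections import OrderedDict


def group_by_label(parsed):
    """Join all tokens that share a label, keeping first-seen label order.

    `parsed` is a list of (token, label) pairs in the shape shown above.
    This helper is an illustration only, not part of probablepeople.
    """
    grouped = OrderedDict()
    for token, label in parsed:
        grouped.setdefault(label, []).append(token)
    return OrderedDict(
        (label, " ".join(tokens)) for label, tokens in grouped.items()
    )


# Output copied from the 'Sitwell Housing Inc' example above.
parsed = [
    ("Sitwell", "CorporationName"),
    ("Housing", "CorporationName"),
    ("Inc", "CorporationLegalType"),
]

components = group_by_label(parsed)
print(components["CorporationName"])       # Sitwell Housing
print(components["CorporationLegalType"])  # Inc
```

Grouping by label discards the original token positions, so this suits lookups by component more than reconstructing the input string.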

Download files

Source Distribution

probablepeople-0.5.tar.gz (1.1 MB)

Uploaded Source

Built Distribution

probablepeople-0.5-py2-none-any.whl (1.1 MB)

Uploaded Python 2

File details

Details for the file probablepeople-0.5.tar.gz.

File hashes

Hashes for probablepeople-0.5.tar.gz
Algorithm    Hash digest
SHA256       82ee2d1bd40b6feb3e809551d956349a88e1db51d2195b5eaf0ff2b744a67de6
MD5          950111b62236636153fe369f7c1e31c4
BLAKE2b-256  ad16f2b40816c97dac6a16f6f3aa1977f30e449bfd070dde05e3d07da6e01224
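
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library `hashlib` module. The function names and the local file path are assumptions for illustration, not something this page prescribes.

```python
import hashlib

# SHA256 digest copied from the listing above for probablepeople-0.5.tar.gz.
EXPECTED_SHA256 = (
    "82ee2d1bd40b6feb3e809551d956349a88e1db51d2195b5eaf0ff2b744a67de6"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

Assuming the archive was saved in the current directory, `matches_expected("probablepeople-0.5.tar.gz")` should return True for an intact download. To check the wheel the same way, pass its SHA256 digest as `expected`.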

File details

Details for the file probablepeople-0.5-py2-none-any.whl.

File hashes

Hashes for probablepeople-0.5-py2-none-any.whl
Algorithm    Hash digest
SHA256       7580a618fd637b6c6d1351a759570f472089c4de402be1850d9b2446aff8b16c
MD5          cf5462e68f1de6693a14938b540e8169
BLAKE2b-256  10d4d1b962c2fa7aa32081c059f9afefdd93fbebf5b8559d00b124d025c95607
