Parse romanized names & companies using advanced NLP methods

Project description

probablepeople is a Python library that parses unstructured romanized name or company strings into labeled components using conditional random fields.

From the Python interpreter:

>>> import probablepeople
>>> probablepeople.parse('Mr George "Gob" Bluth II')
[('Mr', 'PrefixMarital'),
 ('George', 'GivenName'),
 ('"Gob"', 'Nickname'),
 ('Bluth', 'Surname'),
 ('II', 'SuffixGenerational')]
>>> probablepeople.parse('Sitwell Housing Inc')
[('Sitwell', 'CorporationName'),
 ('Housing', 'CorporationName'),
 ('Inc', 'CorporationLegalType')]
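
The parse output is a flat list of (token, label) pairs, so a multi-word component such as a two-word corporation name arrives as separate entries. A minimal sketch of regrouping them follows. The helper `group_by_label` is hypothetical (it is not part of probablepeople), and the input is copied from the session above rather than produced by a live call, so the snippet runs without the library installed.

```python
from collections import OrderedDict


def group_by_label(pairs):
    """Join tokens that share a label, keeping labels in first-seen order.

    `pairs` is a list of (token, label) tuples in the shape shown in the
    interpreter session. This helper is illustrative only and is not part
    of the probablepeople API.
    """
    grouped = OrderedDict()
    for token, label in pairs:
        grouped.setdefault(label, []).append(token)
    return OrderedDict(
        (label, " ".join(tokens)) for label, tokens in grouped.items()
    )


# Sample output copied from the session above. With the library installed,
# this list could instead come from probablepeople.parse('Sitwell Housing Inc').
parsed = [
    ("Sitwell", "CorporationName"),
    ("Housing", "CorporationName"),
    ("Inc", "CorporationLegalType"),
]

components = group_by_label(parsed)
for label, text in components.items():
    print(label, "=", text)
```

Note that this joins every token carrying the same label, even when the tokens are not adjacent in the input. That is an assumption that suits short name strings but may not suit every input.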

Download files

Source Distribution

probablepeople-0.4.1.tar.gz (676.3 kB)

Uploaded Source

Built Distribution

probablepeople-0.4.1-py2-none-any.whl (687.8 kB)

Uploaded Python 2

File details

Details for the file probablepeople-0.4.1.tar.gz.

File hashes

Hashes for probablepeople-0.4.1.tar.gz
Algorithm    Hash digest
SHA256       bcf8bd00b62064c5db3ad06edf3b8f7985f4404ed3a9c13bdea19da2d6aee744
MD5          750c08211b9c89de5a1720e2c2105ed2
BLAKE2b-256  92ead7a28c46b84baa213244c4f306e71db8c97ebb9b3751dcb6b9bd75954b8d
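
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library. The function names `sha256_of_file` and `matches_expected` are hypothetical, and the file path passed in is assumed to be wherever the archive was saved locally.

```python
import hashlib

# SHA256 digest for probablepeople-0.4.1.tar.gz, copied from the listing above.
EXPECTED_SHA256 = "bcf8bd00b62064c5db3ad06edf3b8f7985f4404ed3a9c13bdea19da2d6aee744"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("probablepeople-0.4.1.tar.gz")` would return True only if the file in the current directory has the listed digest.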


File details

Details for the file probablepeople-0.4.1-py2-none-any.whl.

File hashes

Hashes for probablepeople-0.4.1-py2-none-any.whl
Algorithm    Hash digest
SHA256       e1fd03570f27383af8621ade45645ef5145c3facc4926b596ef3738a748819de
MD5          ec4a4f6f1447580adc9eca5a47ff1d6d
BLAKE2b-256  43246fcb408d401a9dbee24e735130a3fe30b81e6439031d1a417b9c8ecb690a
