Parse romanized names & companies using advanced NLP methods

Project description

probablepeople is a Python library for parsing unstructured romanized name or company strings into components, using conditional random fields.

From the Python interpreter:

>>> import probablepeople
>>> probablepeople.parse('Mr George "Gob" Bluth II')
[('Mr', 'PrefixMarital'),
 ('George', 'GivenName'),
 ('"Gob"', 'Nickname'),
 ('Bluth', 'Surname'),
 ('II', 'SuffixGenerational')]
>>> probablepeople.parse('Sitwell Housing Inc')
[('Sitwell', 'CorporationName'),
 ('Housing', 'CorporationName'),
 ('Inc', 'CorporationLegalType')]
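
The parse output above is a flat list of (token, label) pairs, so multi-word components such as the corporation name arrive as separate entries. A minimal sketch of merging them by label follows. It works on the output copied from the session above rather than calling the library, and `group_by_label` is a hypothetical helper for illustration, not part of probablepeople.

```python
from collections import OrderedDict

# Results copied from the interpreter session above.
person_pairs = [
    ('Mr', 'PrefixMarital'),
    ('George', 'GivenName'),
    ('"Gob"', 'Nickname'),
    ('Bluth', 'Surname'),
    ('II', 'SuffixGenerational'),
]
company_pairs = [
    ('Sitwell', 'CorporationName'),
    ('Housing', 'CorporationName'),
    ('Inc', 'CorporationLegalType'),
]


def group_by_label(pairs):
    """Hypothetical helper: merge (token, label) pairs into label -> text.

    Tokens that share a label are joined with a space, and labels keep
    the order in which they first appear.
    """
    grouped = OrderedDict()
    for token, label in pairs:
        grouped.setdefault(label, []).append(token)
    return OrderedDict(
        (label, ' '.join(tokens)) for label, tokens in grouped.items()
    )


person = group_by_label(person_pairs)
company = group_by_label(company_pairs)
print(company['CorporationName'])
```

This sketch assumes that tokens with the same label belong to one component, which holds for the two examples shown but may not hold for strings that contain several people or companies.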

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

probablepeople-0.4.2.tar.gz (676.3 kB)

Uploaded Source

Built Distribution

probablepeople-0.4.2-py2-none-any.whl (687.8 kB)

Uploaded Python 2

File details

Details for the file probablepeople-0.4.2.tar.gz.

File hashes

Hashes for probablepeople-0.4.2.tar.gz

Algorithm    Hash digest
SHA256       3b53403082a7eb67c7658bbcc9fe53c921b317ca89aa7298615228c25d18922e
MD5          e64e343bd678d02980708fcd63ab3e62
BLAKE2b-256  175e0487383841a250a87c34ec89f1b4b90f9c6fff5980f7a97b3934e44e20b5

See more details on using hashes here.
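
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The following is a minimal sketch. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest listed above for probablepeople-0.4.2.tar.gz.
EXPECTED_SHA256 = (
    "3b53403082a7eb67c7658bbcc9fe53c921b317ca89aa7298615228c25d18922e"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Hypothetical helper: True if the file's digest equals `expected`."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("probablepeople-0.4.2.tar.gz")` would check a copy saved in the current directory. A result of `False` suggests the download is corrupted or is not the listed file.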

File details

Details for the file probablepeople-0.4.2-py2-none-any.whl.

File hashes

Hashes for probablepeople-0.4.2-py2-none-any.whl

Algorithm    Hash digest
SHA256       8b3bc3446dc9af2d6e2698e7beb83a33addcdf3be231f76d202f0001706a91df
MD5          9593196df4a8199027397ee29a99594a
BLAKE2b-256  e7113262dca1fdad172356a2c6a7f1c1c5133b961a65a475aa02dc6d651d9737
