
parse numbers written in natural language


number-parser is a simple library that converts numbers written in natural language to their equivalent numeric forms. It currently supports cardinal numbers in English, Hindi, Spanish, Ukrainian and Russian, and ordinal numbers in English.

Installation

pip install number-parser

number-parser requires Python 3.7+.

Usage

The library covers the following common use cases.

Converting numbers in-place

Identifies the numbers in a text string and converts them to their corresponding numeric values, leaving non-numeric words unchanged. Ordinal numbers are converted as well (English only).

>>> from number_parser import parse
>>> parse("I have two hats and thirty seven coats")
'I have 2 hats and 37 coats'
>>> parse("One, Two, Three go")
'1, 2, 3 go'
>>> parse("First day of year two thousand")
'1 day of year 2000'

Parsing a number

Converts a single number written in words to its corresponding integer. Input that cannot be parsed as a number yields no value (the second call below prints nothing).

>>> from number_parser import parse_number
>>> parse_number("two thousand and twenty")
2020
>>> parse_number("not_a_number")
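
To make the conversion above more concrete, here is a minimal sketch of one common way to turn English number words into an integer: add up small values, and multiply when a scale word such as "thousand" appears. This is an illustration only and is not number-parser's implementation. The names UNITS, TENS, SCALES and words_to_int are hypothetical, and the sketch does no validation of word order.

```python
# Illustrative sketch only; NOT number-parser's actual implementation.
# All names here (UNITS, TENS, SCALES, words_to_int) are hypothetical.

UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def words_to_int(text):
    """Return the integer for simple English number words, or None."""
    total = 0            # sum of the scale groups completed so far
    current = 0          # value of the group currently being built
    seen_number = False  # becomes True once a number word is consumed

    for word in text.lower().replace("-", " ").split():
        if word == "and":
            # "and" is only meaningful after a number word
            if not seen_number:
                return None
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            # a bare "hundred" counts as one hundred
            current = max(current, 1) * 100
        elif word in SCALES:
            # close the current group and start a new one
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            # any unknown word means the text is not a plain number
            return None
        seen_number = True

    return total + current if seen_number else None


print(words_to_int("two thousand and twenty"))
print(words_to_int("not_a_number"))
```

Returning None for unparseable input mirrors the behaviour shown in the example above, so callers can check the result with `is None` before using it.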

Parsing an ordinal

Converts a single ordinal number written in words to its corresponding integer (English only).

>>> from number_parser import parse_ordinal
>>> parse_ordinal("twenty third")
23
>>> parse_ordinal("seventy fifth")
75

Parsing a fraction

Converts a fraction written in words to a string of the form 'numerator/denominator', where both parts are integers (English only).

>>> from number_parser import parse_fraction
>>> parse_fraction("forty two divided by five hundred and six")
'42/506'
>>> parse_fraction("one over two")
'1/2'
>>> parse_fraction("forty two / one million")
'42/1000000'
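
Since the results above are strings, you may want a numeric value for arithmetic. A minimal sketch, assuming the string always has the 'numerator/denominator' shape shown in the examples, is to hand it to the standard library's fractions.Fraction. The helper name to_fraction is hypothetical and is not part of number-parser.

```python
from fractions import Fraction


def to_fraction(text):
    """Convert a 'numerator/denominator' string into a Fraction.

    Hypothetical helper, not part of number-parser. Fraction reduces
    the value to lowest terms and raises ZeroDivisionError when the
    denominator is zero.
    """
    return Fraction(text)


# String literals copied from the parse_fraction examples above
half = to_fraction("1/2")
reduced = to_fraction("42/506")

print(float(half))
print(reduced)
```

Note that Fraction normalizes the value, so the original numerator and denominator are not preserved after conversion.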

Language Support

The default language is English. To parse another language, pass the language parameter with the corresponding locale code. Cardinal numbers are currently supported in English, Hindi, Spanish, Ukrainian and Russian, and ordinal numbers in English.

>>> from number_parser import parse, parse_number
>>> parse("Hay tres gallinas y veintitrés patos", language='es')
'Hay 3 gallinas y 23 patos'
>>> parse_number("चौदह लाख बत्तीस हज़ार पाँच सौ चौबीस", language='hi')
1432524

Supported cases

The library has extensive tests. Some of the supported cases are described below.

Handling conjunctions correctly, whether they appear inside a number or between separate numbers.

>>> parse("doscientos cincuenta y doscientos treinta y uno y doce", language='es')
'250 y 231 y 12'

Handling ambiguous cases without proper separators.

>>> parse("two thousand thousand")
'2000 1000'
>>> parse_number("two thousand two million")
2002000000

Handling nuances in the language with different forms of the same number.

>>> parse_number("пятисот девяноста шести", language='ru')
596
>>> parse_number("пятистам девяноста шести", language='ru')
596
>>> parse_number("пятьсот девяносто шесть", language='ru')
596
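
One simple way to picture why the three phrases above give the same result is a lookup table in which every inflected form of a word maps to the same value. The sketch below is an illustration only, not number-parser's data or implementation. The table FORMS and the function sum_forms are hypothetical, and the table covers only the words used in the examples above.

```python
# Illustrative sketch only; NOT number-parser's data or implementation.
# FORMS and sum_forms are hypothetical names.

# Each inflected form of a Russian numeral maps to the same value.
FORMS = {
    "пятьсот": 500, "пятисот": 500, "пятистам": 500,
    "девяносто": 90, "девяноста": 90,
    "шесть": 6, "шести": 6,
}


def sum_forms(text):
    """Add up the values of the known number words in text.

    Raises KeyError for any word missing from the small table above.
    """
    return sum(FORMS[word] for word in text.split())


for phrase in (
    "пятисот девяноста шести",
    "пятистам девяноста шести",
    "пятьсот девяносто шесть",
):
    print(phrase, sum_forms(phrase))
```

Plain addition is enough here because these phrases contain no scale words such as "thousand".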

Contributing

Changes

0.3.2 (2023-03-28)

Fix:
- Fix import bug (#91)

0.3.1 (2023-03-22)

Improvements:
- Add Python 3.10, 3.11 support (#83)
- Add __version__ (#87)
- Replace OrderedDict with dict (#88)

Fix:
- Inconsistent white space handling around sentence separators following numbers (#76, #77)

0.3.0 (2022-10-20)

Improvements:
- Added support for bigger numbers in Spanish (#43)
- Added pytest flake8 (#44)
- Refactored the code (#45)
- Improved testing (#46)
- Improved scripts (#47)
- Added tests (#50, #72)
- Added GitHub actions (#54, #55, #56, #57)
- Added support for simple fractions (#60)

New features:
- Added feature to parse numbers in Ukrainian (#79)

0.2.1 (2020-08-25)

Fix tokenization bug - Hindi

0.2.0 (2020-08-18)

Ordinal Number Support

0.1.0 (2020-07-30)

Initial release.
