
Extract price and currency from a raw string

Project description


price-parser is a small library for extracting price and currency from raw text strings.

Features:

  • robust price amount and currency symbol extraction

  • zero-effort handling of thousand and decimal separators

The main use case is parsing prices extracted from web pages. For example, you can write a CSS/XPath selector which targets an element with a price, and then use this library for cleaning it up, instead of writing custom site-specific regex or Python code.
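As a rough sketch of that workflow, the snippet below pulls the text of a price element out of a small markup fragment and hands it to Price.fromstring. The markup, the "price" class name, and the helper names extract_price_text and extract_price are made up for illustration. The standard library's ElementTree, which needs well-formed markup and supports only a limited XPath subset, stands in for whatever selector library you actually use.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a downloaded page.
HTML_SNIPPET = """
<div class="product">
  <h1>Example product</h1>
  <span class="price">Price: <b>$119.00</b></span>
</div>
"""


def extract_price_text(markup, xpath=".//span[@class='price']"):
    """Return the whitespace-normalized text of the first matching element,
    or None when nothing matches."""
    root = ET.fromstring(markup.strip())
    element = root.find(xpath)
    if element is None:
        return None
    # itertext() also collects text from nested tags such as <b>.
    return " ".join("".join(element.itertext()).split())


def extract_price(markup):
    """Select the price element, then let price-parser clean up its text."""
    # Imported here so the selector helper above works without the library.
    from price_parser import Price  # third-party: pip install price-parser

    text = extract_price_text(markup)
    return Price.fromstring(text) if text is not None else None
```

Here extract_price(HTML_SNIPPET) passes the string "Price: $119.00" to the parser, so no site-specific regex is needed to strip the "Price:" label.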

License is BSD 3-clause.

Installation

pip install price-parser

price-parser requires Python 3.6+.

Usage

Basic usage

>>> from price_parser import Price
>>> price = Price.fromstring("22,90 €")
>>> price
Price(amount=Decimal('22.90'), currency='€')
>>> price.amount       # numeric price amount
Decimal('22.90')
>>> price.currency     # currency symbol, as appears in the string
'€'
>>> price.amount_text  # price amount, as appears in the string
'22,90'
>>> price.amount_float # price amount as float, not Decimal
22.9
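Since amount is a Decimal, it can be used directly for exact arithmetic, while amount_float is subject to the usual binary floating-point rounding. The sketch below uses only the standard library to show the difference; line_total is a hypothetical helper, not part of price-parser, and the two-decimal rounding is an assumption that does not hold for every currency.

```python
from decimal import Decimal, ROUND_HALF_UP


def line_total(unit_amount, quantity):
    """Multiply a Decimal unit price by a quantity and round to two places."""
    total = unit_amount * quantity
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Same value as price.amount in the example above.
unit = Decimal("22.90")
print(line_total(unit, 3))  # 68.70

# Decimal sums are exact; the equivalent float sum is not.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True
print(0.10 + 0.20 == 0.30)  # False
```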

If you prefer, price_parser.parse_price is an alias for Price.fromstring; the two do the same thing:

>>> from price_parser import parse_price
>>> parse_price("22,90 €")
Price(amount=Decimal('22.90'), currency='€')

The library has extensive tests (900+ real-world examples of price strings). Some of the supported cases are described below.

Supported cases

Unclean price strings with various currencies are supported; thousand separators and decimal separators are handled:

>>> Price.fromstring("Price: $119.00")
Price(amount=Decimal('119.00'), currency='$')
>>> Price.fromstring("15 130 Р")
Price(amount=Decimal('15130'), currency='Р')
>>> Price.fromstring("151,200 تومان")
Price(amount=Decimal('151200'), currency='تومان')
>>> Price.fromstring("Rp 1.550.000")
Price(amount=Decimal('1550000'), currency='Rp')
>>> Price.fromstring("Běžná cena 75 990,00 Kč")
Price(amount=Decimal('75990.00'), currency='Kč')

The euro sign is sometimes used as a decimal separator in the wild:

>>> Price.fromstring("1,235€ 99")
Price(amount=Decimal('1235.99'), currency='€')
>>> Price.fromstring("99 € 95 €")
Price(amount=Decimal('99'), currency='€')
>>> Price.fromstring("35€ 999")
Price(amount=Decimal('35'), currency='€')

Some special cases are handled:

>>> Price.fromstring("Free")
Price(amount=Decimal('0'), currency=None)

When a price or currency can’t be extracted, the corresponding attribute is set to None:

>>> Price.fromstring("")
Price(amount=None, currency=None)
>>> Price.fromstring("Foo")
Price(amount=None, currency=None)
>>> Price.fromstring("50% OFF")
Price(amount=None, currency=None)
>>> Price.fromstring("50")
Price(amount=Decimal('50'), currency=None)
>>> Price.fromstring("R$")
Price(amount=None, currency='R$')
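Because either attribute may be None, and because "Free" yields an amount of Decimal('0'), which is falsy, calling code should compare against None explicitly instead of relying on truthiness. The helper below is a minimal sketch of that check; describe and parse_and_describe are hypothetical names, and describe works on any object with amount and currency attributes.

```python
def describe(price):
    """Summarize a Price-like object, distinguishing missing parts from zero."""
    if price.amount is None and price.currency is None:
        return "no price found"
    if price.amount is None:
        return f"currency only: {price.currency}"
    if price.currency is None:
        # Reached for "50" and also for "Free", whose amount is Decimal('0').
        return f"amount only: {price.amount}"
    return f"{price.amount} {price.currency}"


def parse_and_describe(raw):
    """Parse a raw string with price-parser and summarize the result."""
    from price_parser import Price  # third-party: pip install price-parser

    return describe(Price.fromstring(raw))
```

A check such as "if price.amount:" would wrongly treat a free item as having no price, which is why the sketch tests "is None".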

Currency hints

The currency_hint argument lets you pass a text string which may (or may not) contain currency information. This feature is most useful for automated price extraction.

>>> Price.fromstring("34.99", currency_hint="руб. (шт)")
Price(amount=Decimal('34.99'), currency='руб.')
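In automated extraction the amount and the currency often come from different page elements. The sketch below assumes each scraped record is a dict with a "price" key and an optional "currency" key; that record layout and the parse_records name are assumptions for illustration, not part of the library.

```python
def parse_records(records):
    """Parse scraped records, passing any separate currency text as a hint."""
    from price_parser import Price  # third-party: pip install price-parser

    return [
        # A missing "currency" key results in currency_hint=None.
        Price.fromstring(record["price"], currency_hint=record.get("currency"))
        for record in records
    ]


# Hypothetical scraped data: one record with a separate currency element,
# one where the currency is already part of the price string.
RECORDS = [
    {"price": "34.99", "currency": "руб. (шт)"},
    {"price": "22,90 €"},
]
```

Calling parse_records(RECORDS) returns one Price per record; for the first record the currency comes from the hint, as in the example above.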

Note that a currency mentioned in the main price string may be preferred over the currency specified in the currency_hint argument; it depends on the currency symbols found there. If you know the correct currency, you can set it directly:

>>> price = Price.fromstring("1 000")
>>> price.currency = 'EUR'
>>> price
Price(amount=Decimal('1000'), currency='EUR')

Contributing

Use tox to run tests with different Python versions:

tox

The command above also runs type checks; we use mypy.

Changes

0.2.3 (2019-06-18)

  • Follow-up for the 0.2.2 release: improved parsing of prices with 4+ digits after a decimal separator.

0.2.2 (2019-06-18)

  • Fixed parsing of prices with 4+ digits after a decimal separator.

0.2.1 (2019-04-19)

  • 23 additional currency symbols are added;

  • A$ alias for Australian Dollar is added.

0.2 (2019-04-12)

Added support for currencies replaced by euro.

0.1.1 (2019-04-12)

Minor packaging fixes.

0.1 (2019-04-12)

Initial release.
