============
price-parser
============

.. image:: https://img.shields.io/pypi/v/price-parser.svg
   :target: https://pypi.python.org/pypi/price-parser
   :alt: PyPI Version

.. image:: https://img.shields.io/pypi/pyversions/price-parser.svg
   :target: https://pypi.python.org/pypi/price-parser
   :alt: Supported Python Versions

.. image:: https://travis-ci.org/scrapinghub/price-parser.svg?branch=master
   :target: https://travis-ci.org/scrapinghub/price-parser
   :alt: Build Status

.. image:: https://codecov.io/github/scrapinghub/price-parser/coverage.svg?branch=master
   :target: https://codecov.io/gh/scrapinghub/price-parser
   :alt: Coverage report

price-parser is a small library for extracting price and currency from raw text strings.

Features:

  • robust price amount and currency symbol extraction
  • zero-effort handling of thousand and decimal separators

The main use case is parsing prices extracted from web pages. For example, you can write a CSS/XPath selector which targets an element with a price, and then use this library for cleaning it up, instead of writing custom site-specific regex or Python code.

License is BSD 3-clause.

Installation

::

    pip install price-parser

price-parser requires Python 3.6+.

Usage

Basic usage

>>> from price_parser import Price
>>> price = Price.fromstring("22,90 €")
>>> price
Price(amount=Decimal('22.90'), currency='€')
>>> price.amount        # numeric price amount
Decimal('22.90')
>>> price.currency      # currency symbol, as appears in the string
'€'
>>> price.amount_text   # price amount, as appears in the string
'22,90'
>>> price.amount_float  # price amount as float, not Decimal
22.9
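The difference between the ``Decimal`` amount and its float counterpart matters once prices are added or multiplied. The following standard-library sketch does not call price-parser at all; it only reuses the values from the example above to illustrate why the ``Decimal`` form is usually the safer one for arithmetic.

```python
from decimal import Decimal

# The same amount in the two forms shown above.
amount = Decimal("22.90")      # like price.amount
amount_float = float(amount)   # like price.amount_float

# Decimal arithmetic is exact and keeps the two decimal places.
total = amount * 3
print(total)  # 68.70

# Binary floats cannot represent most decimal fractions exactly,
# so equality checks on float sums are unreliable.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True
print(0.10 + 0.20 == 0.30)                                   # False
```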

If you prefer, Price.fromstring has an alias, price_parser.parse_price; the two do the same thing:

>>> from price_parser import parse_price
>>> parse_price("22,90 €")
Price(amount=Decimal('22.90'), currency='€')

The library has extensive tests (900+ real-world examples of price strings). Some of the supported cases are described below.

Supported cases

Unclean price strings with various currencies are supported; thousand separators and decimal separators are handled:

>>> Price.fromstring("Price: $119.00")
Price(amount=Decimal('119.00'), currency='$')

>>> Price.fromstring("15 130 Р")
Price(amount=Decimal('15130'), currency='Р')

>>> Price.fromstring("151,200 تومان")
Price(amount=Decimal('151200'), currency='تومان')

>>> Price.fromstring("Rp 1.550.000")
Price(amount=Decimal('1550000'), currency='Rp')

>>> Price.fromstring("Běžná cena 75 990,00 Kč")
Price(amount=Decimal('75990.00'), currency='Kč')

The euro sign is sometimes used as a decimal separator in the wild:

>>> Price.fromstring("1,235€ 99")
Price(amount=Decimal('1235.99'), currency='€')

>>> Price.fromstring("99 € 95 €")
Price(amount=Decimal('99'), currency='€')

>>> Price.fromstring("35€ 999")
Price(amount=Decimal('35'), currency='€')

Some special cases are handled:

>>> Price.fromstring("Free")
Price(amount=Decimal('0'), currency=None)

When a price or currency can't be extracted, the corresponding attribute is set to None:

>>> Price.fromstring("")
Price(amount=None, currency=None)

>>> Price.fromstring("Foo")
Price(amount=None, currency=None)

>>> Price.fromstring("50% OFF")
Price(amount=None, currency=None)

>>> Price.fromstring("50")
Price(amount=Decimal('50'), currency=None)

>>> Price.fromstring("R$")
Price(amount=None, currency='R$')
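Because either attribute can be None and a "Free" price yields an amount of zero, calling code has to tell "no price found" apart from "price is zero". The helper below is a hypothetical sketch of one way to do that; ``describe_price`` and its wording are not part of price-parser, and it takes the amount and currency as plain arguments so that it runs with the standard library alone.

```python
from decimal import Decimal
from typing import Optional


def describe_price(amount: Optional[Decimal], currency: Optional[str]) -> str:
    """Return a display string for a parsed price (hypothetical helper)."""
    # Compare against None explicitly: Decimal('0') is falsy, so a plain
    # `if not amount` would treat a "Free" item as if no price was found.
    if amount is None:
        return "price unavailable"
    if amount == 0:
        return "free"
    if currency is None:
        return f"{amount} (currency unknown)"
    return f"{amount} {currency}"


print(describe_price(Decimal("0"), None))     # free
print(describe_price(None, "R$"))             # price unavailable
print(describe_price(Decimal("50"), None))    # 50 (currency unknown)
print(describe_price(Decimal("22.90"), "€"))  # 22.90 €
```

With price-parser, it could be called as ``describe_price(price.amount, price.currency)``.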

Currency hints

The currency_hint argument lets you pass a text string which may (or may not) contain currency information. This feature is most useful for automated price extraction.

>>> Price.fromstring("34.99", currency_hint="руб. (шт)")
Price(amount=Decimal('34.99'), currency='руб.')

Note that a currency mentioned in the main price string may be preferred over the currency specified in the currency_hint argument; it depends on the currency symbols found there. If you know the correct currency, you can set it directly:

>>> price = Price.fromstring("1 000")
>>> price.currency = 'EUR'
>>> price
Price(amount=Decimal('1000'), currency='EUR')

Decimal separator

If you know which symbol is used as a decimal separator in the input string, pass that symbol in the decimal_separator argument to prevent price-parser from guessing the wrong decimal separator symbol.

>>> Price.fromstring("Price: $140.600", decimal_separator=".")
Price(amount=Decimal('140.600'), currency='$')

>>> Price.fromstring("Price: $140.600", decimal_separator=",")
Price(amount=Decimal('140600'), currency='$')
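To see why the same text yields two different amounts, it can help to spell out what fixing the decimal separator implies. The function below is a simplified standard-library illustration, not price-parser's implementation: ``interpret_amount`` is a hypothetical name, and it assumes the input contains only digits and separators, with the decimal separator appearing at most once.

```python
from decimal import Decimal


def interpret_amount(amount_text: str, decimal_separator: str) -> Decimal:
    """Read amount text, treating only `decimal_separator` as the decimal point.

    Simplified illustration; this is not price-parser's implementation.
    """
    cleaned = []
    for ch in amount_text:
        if ch.isdigit():
            cleaned.append(ch)
        elif ch == decimal_separator:
            cleaned.append(".")
        # Any other character (the other separator, a space) is treated
        # as a thousands separator and dropped.
    return Decimal("".join(cleaned))


print(interpret_amount("140.600", "."))  # 140.600
print(interpret_amount("140.600", ","))  # 140600
```

With "." declared as the decimal separator, the dot is kept; with "," declared, the dot can only be a thousands separator, so it disappears from the amount.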

Contributing

Use tox_ to run tests with different Python versions::

    tox

The command above also runs type checks; we use mypy.

.. _tox: https://tox.readthedocs.io

Changes

0.3.4 (2020-11-25)

0.3.3 (2020-02-05)

  • Fixed installation issue on some Windows machines.

0.3.2 (2020-01-28)

  • Improved Korean and Japanese currency detection.
  • Declare Python 3.8 support.

0.3.1 (2019-10-21)

  • Redundant $ signs are no longer returned as a part of currency, e.g. for SGD$ 100 currency would be SGD, not SGD$.

0.3.0 (2019-10-19)

  • The new Price.fromstring argument decimal_separator allows overriding the decimal separator in cases where it is known (i.e. it disables decimal separator detection);
  • NTD and RBM unofficial currency names are added;
  • quantifiers in regular expressions are made non-greedy, which provides a small speedup;
  • test improvements.

0.2.4 (2019-07-03)

  • Declare price-parser as providing type annotations (PEP 561). This enables better type checking for projects using price-parser.
  • improved test coverage

0.2.3 (2019-06-18)

  • Follow-up for 0.2.2 release: improved parsing of prices with 4+ digits after a decimal separator.

0.2.2 (2019-06-18)

  • Fixed parsing of prices with 4+ digits after a decimal separator.

0.2.1 (2019-04-19)

  • 23 additional currency symbols are added;
  • A$ alias for Australian Dollar is added.

0.2 (2019-04-12)

Added support for currencies replaced by euro.

0.1.1 (2019-04-12)

Minor packaging fixes.

0.1 (2019-04-12)

Initial release.
