.. image:: https://github.com/Parquery/lexery/actions/workflows/ci.yml/badge.svg
    :target: https://github.com/Parquery/lexery/actions/workflows/ci.yml
    :alt: Continuous integration

.. image:: https://coveralls.io/repos/github/Parquery/lexery/badge.svg?branch=master
    :target: https://coveralls.io/github/Parquery/lexery?branch=master
    :alt: Coverage

.. image:: https://badge.fury.io/py/lexery.svg
    :target: https://pypi.org/project/lexery/
    :alt: PyPI - version

.. image:: https://img.shields.io/pypi/pyversions/lexery.svg
    :target: https://pypi.org/project/lexery/
    :alt: PyPI - Python Version

A simple lexer based on regular expressions.
Inspired by https://eli.thegreenplace.net/2013/06/25/regex-based-lexical-analysis-in-python-and-javascript
You define the lexing rules and lexery matches them iteratively as a look-up:

.. code-block:: python

    >>> import lexery
    >>> import re
    >>> text = 'crop \t   ( 20, 30, 40, 10 ) ;'
    >>>
    >>> lexer = lexery.Lexer(
    ...     rules=[
    ...         lexery.Rule(identifier='identifier',
    ...             pattern=re.compile(r'[a-zA-Z_][a-zA-Z_]*')),
    ...         lexery.Rule(identifier='lpar', pattern=re.compile(r'\(')),
    ...         lexery.Rule(identifier='number', pattern=re.compile(r'[1-9][0-9]*')),
    ...         lexery.Rule(identifier='rpar', pattern=re.compile(r'\)')),
    ...         lexery.Rule(identifier='comma', pattern=re.compile(r',')),
    ...         lexery.Rule(identifier='semi', pattern=re.compile(r';'))
    ...     ],
    ...     skip_whitespace=True)
    >>> tokens = lexer.lex(text=text)
    >>> # One list of tokens per line; each token holds the identifier,
    >>> # the content, the position in the line and the line number.
    >>> assert tokens == [[
    ...     lexery.Token('identifier', 'crop', 0, 0),
    ...     lexery.Token('lpar', '(', 9, 0),
    ...     lexery.Token('number', '20', 11, 0),
    ...     lexery.Token('comma', ',', 13, 0),
    ...     lexery.Token('number', '30', 15, 0),
    ...     lexery.Token('comma', ',', 17, 0),
    ...     lexery.Token('number', '40', 19, 0),
    ...     lexery.Token('comma', ',', 21, 0),
    ...     lexery.Token('number', '10', 23, 0),
    ...     lexery.Token('rpar', ')', 26, 0),
    ...     lexery.Token('semi', ';', 28, 0)]]

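
The idea of trying rules one after another at the current position can be sketched with the standard library alone. This is only an illustration, not lexery's implementation: ``SketchToken`` and ``sketch_lex`` are hypothetical names, the sketch handles a single line, and it assumes that the first rule that matches wins.

.. code-block:: python

    import re
    from typing import List, NamedTuple, Pattern, Tuple


    class SketchToken(NamedTuple):
        """Hypothetical stand-in for a token; not lexery's own class."""
        identifier: str
        content: str
        position: int


    def sketch_lex(line: str,
                   rules: List[Tuple[str, Pattern[str]]]) -> List[SketchToken]:
        """Tokenize a single line by trying the rules in order at each position."""
        tokens = []  # type: List[SketchToken]
        position = 0
        while position < len(line):
            # Skip whitespace between the tokens.
            if line[position].isspace():
                position += 1
                continue

            for identifier, pattern in rules:
                # Anchor the match at the current position.
                match = pattern.match(line, position)
                if match and match.end() > position:
                    tokens.append(
                        SketchToken(identifier, match.group(), position))
                    position = match.end()
                    break
            else:
                # No rule matched at this position.
                raise ValueError(
                    'Unmatched text at position {}'.format(position))
        return tokens


    rules = [
        ('identifier', re.compile(r'[a-zA-Z_][a-zA-Z_]*')),
        ('lpar', re.compile(r'\(')),
        ('number', re.compile(r'[1-9][0-9]*')),
        ('rpar', re.compile(r'\)')),
        ('comma', re.compile(r',')),
    ]
    result = sketch_lex('crop ( 20, 30 )', rules)

Here ``result`` starts with ``SketchToken('identifier', 'crop', 0)`` and ends with ``SketchToken('rpar', ')', 14)``.
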
Note that if a part of the text cannot be matched, a ``lexery.Error`` is raised:

.. code-block:: python

    >>> import lexery
    >>> import re
    >>> text = 'some-identifier ( 23 )'
    >>>
    >>> lexer = lexery.Lexer(
    ...     rules=[
    ...         lexery.Rule(identifier='identifier', pattern=re.compile(r'[a-zA-Z_][a-zA-Z_]*')),
    ...         lexery.Rule(identifier='number', pattern=re.compile(r'[1-9][0-9]*')),
    ...     ],
    ...     skip_whitespace=True)
    >>> tokens = lexer.lex(text=text)
    Traceback (most recent call last):
    ...
    lexery.Error: Unmatched text at line 0 and position 4:
    some-identifier ( 23 )
        ^

If you specify an ``unmatched_identifier``, all the unmatched characters are accumulated in tokens with that identifier:

.. code-block:: python

    >>> import lexery
    >>> import re
    >>> text = 'some-identifier ( 23 )-'
    >>>
    >>> lexer = lexery.Lexer(
    ...     rules=[
    ...         lexery.Rule(identifier='identifier', pattern=re.compile(r'[a-zA-Z_][a-zA-Z_]*')),
    ...         lexery.Rule(identifier='number', pattern=re.compile(r'[1-9][0-9]*')),
    ...     ],
    ...     skip_whitespace=True,
    ...     unmatched_identifier='unmatched')
    >>> tokens = lexer.lex(text=text)
    >>> assert tokens == [[
    ...     lexery.Token('identifier', 'some', 0, 0),
    ...     lexery.Token('unmatched', '-', 4, 0),
    ...     lexery.Token('identifier', 'identifier', 5, 0),
    ...     lexery.Token('unmatched', '(', 16, 0),
    ...     lexery.Token('number', '23', 18, 0),
    ...     lexery.Token('unmatched', ')-', 21, 0)]]

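
How adjacent unmatched characters can end up in a single token may be easier to see in a small standard-library sketch. It is not lexery's implementation: ``sketch_lex_with_unmatched`` is a hypothetical name, tokens are plain ``(identifier, content, position)`` tuples, and the sketch assumes that whitespace or a matched rule ends the current run of unmatched characters.

.. code-block:: python

    import re
    from typing import List, Optional, Pattern, Tuple

    # (identifier, content, position); a simplified, hypothetical token shape
    SketchToken = Tuple[str, str, int]


    def sketch_lex_with_unmatched(
            line: str, rules: List[Tuple[str, Pattern[str]]],
            unmatched_identifier: str) -> List[SketchToken]:
        """Tokenize a line and collect runs of unmatched characters."""
        tokens = []  # type: List[SketchToken]
        pending_start = None  # type: Optional[int]
        position = 0

        def flush(end: int) -> None:
            """Emit the pending run of unmatched characters, if any."""
            nonlocal pending_start
            if pending_start is not None:
                tokens.append(
                    (unmatched_identifier, line[pending_start:end],
                     pending_start))
                pending_start = None

        while position < len(line):
            if line[position].isspace():
                # Assumption: whitespace ends a run of unmatched characters.
                flush(position)
                position += 1
                continue

            for identifier, pattern in rules:
                match = pattern.match(line, position)
                if match and match.end() > position:
                    flush(position)
                    tokens.append((identifier, match.group(), position))
                    position = match.end()
                    break
            else:
                # No rule matched: start or extend the unmatched run.
                if pending_start is None:
                    pending_start = position
                position += 1

        flush(len(line))
        return tokens


    sketch_rules = [
        ('identifier', re.compile(r'[a-zA-Z_][a-zA-Z_]*')),
        ('number', re.compile(r'[1-9][0-9]*')),
    ]
    sketch_tokens = sketch_lex_with_unmatched(
        'some-identifier ( 23 )-', sketch_rules, 'unmatched')

Under these assumptions the trailing ``)-`` is collected into one ``('unmatched', ')-', 21)`` tuple, in line with the positions shown in the lexery example above.
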

Install lexery with pip:

.. code-block:: bash

    pip3 install lexery

Check out the repository.

In the repository root, create the virtual environment:

.. code-block:: bash

    python3 -m venv venv3

Activate the virtual environment:

.. code-block:: bash

    source venv3/bin/activate

Install the development dependencies:

.. code-block:: bash

    pip3 install -e .[dev]

We provide a set of pre-commit checks that run the unit tests, lint the code and check its formatting.

Namely, we use:

* `yapf <https://github.com/google/yapf>`_ to check the formatting.
* `pydocstyle <https://github.com/PyCQA/pydocstyle>`_ to check the style of the docstrings.
* `mypy <http://mypy-lang.org/>`_ for static type analysis.
* `pylint <https://www.pylint.org/>`_ for various linter checks.

Run the pre-commit checks locally from an activated virtual environment with development dependencies:

.. code-block:: bash

    ./precommit.py

The pre-commit script can also automatically format the code:

.. code-block:: bash

    ./precommit.py --overwrite

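We follow `Semantic Versioning <http://semver.org/spec/v1.0.0.html>`_. The version X.Y.Z indicates:

* X is the major version (backward-incompatible),
* Y is the minor version (backward-compatible), and
* Z is the patch version (backward-compatible bug fix).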
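
As a rough illustration of what the X.Y.Z scheme means for a dependent project, the following sketch checks whether an installed version should satisfy a required one. The helper names and the version strings are hypothetical, and the check assumes plain ``X.Y.Z`` versions without pre-release suffixes.

.. code-block:: python

    from typing import Tuple


    def parse_version(version: str) -> Tuple[int, int, int]:
        """Split an 'X.Y.Z' string into (major, minor, patch) integers."""
        major, minor, patch = (int(part) for part in version.split('.'))
        return major, minor, patch


    def is_backward_compatible(installed: str, required: str) -> bool:
        """Check that the major versions agree and installed is not older."""
        installed_parts = parse_version(installed)
        required_parts = parse_version(required)
        return (installed_parts[0] == required_parts[0]
                and installed_parts >= required_parts)


    # Hypothetical version numbers, for illustration only.
    same_major = is_backward_compatible(installed='1.4.2', required='1.3.0')
    new_major = is_backward_compatible(installed='2.0.0', required='1.3.0')

Here ``same_major`` is ``True`` and ``new_major`` is ``False``, since a change of the major version signals a backward-incompatible release.
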