
PyREmatch: REmatch bindings for Python
Python bindings for REmatch, an information extraction focused regex library that uses constant delay algorithms.
Installation
You can install the latest release version from PyPI:
pip install pyrematch
Or you can build from the source code:
git clone git@github.com:REmatchChile/REmatch.git
cd REmatch
pip install .
Usage
Here is an example that prints all the matches using the finditer function.
import pyrematch as REmatch
document = "cperez@gmail.com\npvergara@ing.uc.cl\njuansoto@uc.cl"
pattern = r"@!domain{(\w+\.)+\w+}(\n|$)"
query = REmatch.reql(pattern)
for match in query.finditer(document):
    print(match)
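To make clearer what the pattern above is meant to capture, here is a rough comparison using only Python's standard re module. It is a sketch and does not use PyREmatch itself: the REQL variable !domain{...} is approximated by the named group (?P<domain>...), and the names std_pattern and domains are introduced here for illustration only. The standard library returns leftmost, non-overlapping matches, so the two approaches are only expected to agree for patterns like this one, where each match is pinned between "@" and the end of a line.

```python
import re

document = "cperez@gmail.com\npvergara@ing.uc.cl\njuansoto@uc.cl"

# Standard-library approximation of the REQL pattern (an assumption made for
# comparison): the variable !domain{...} becomes the named group (?P<domain>...).
std_pattern = re.compile(r"@(?P<domain>(?:\w+\.)+\w+)(?:\n|$)")

domains = []
for m in std_pattern.finditer(document):
    # span("domain") gives the (start, end) offsets of the captured text,
    # group("domain") gives the captured text itself.
    domains.append((m.span("domain"), m.group("domain")))

for span, text in domains:
    print(span, text)
```

Each line of output pairs the position of a captured domain with its text, which is the kind of information an extraction-focused query is after.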
The Query object also provides other useful methods. To get a single match, you can use:
query.findone(document)
To find all the matches, you can use:
query.findall(document)
To find a limited number of matches, you can use:
limit = 10
query.findmany(document, limit)
To check if a match exists, you can use:
query.check(document)
You can read more about this in the PyREmatch Tutorial.
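If you are more familiar with the standard library, the four retrieval styles above can be loosely compared to the following re and itertools idioms. This is only an analogy under the assumption that the methods behave as their descriptions suggest. It does not call PyREmatch, and std_pattern is the same hypothetical standard-library approximation of the pattern, not part of the library.

```python
import re
from itertools import islice

document = "cperez@gmail.com\npvergara@ing.uc.cl\njuansoto@uc.cl"

# Hypothetical stdlib stand-in for the REQL pattern (named group = variable).
std_pattern = re.compile(r"@(?P<domain>(?:\w+\.)+\w+)(?:\n|$)")

# A single match, or None if there is none (compare: findone).
first = next(std_pattern.finditer(document), None)

# Every match, collected into a list (compare: findall).
all_matches = list(std_pattern.finditer(document))

# At most `limit` matches, stopping early (compare: findmany).
limit = 2
some_matches = list(islice(std_pattern.finditer(document), limit))

# Whether any match exists at all (compare: check).
exists = std_pattern.search(document) is not None

print(first.group("domain"), len(all_matches), len(some_matches), exists)
```

Stopping after the first match or after a fixed limit avoids enumerating results that will not be used, which matters most on large documents.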