trrex

Transform set of words to efficient regular expression

Version: 0.0.7 (PyPI, installable with pip)
Maintainers: 1

Efficient string matching with regular expressions

This package provides a pure Python function that represents a set of strings as a single regular expression. With this regular expression you can perform operations such as replacing, extracting, and matching keywords. The package name comes from the internal trie used to build the regular expression (TRie to REgeX).
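To make the trie-to-regex idea concrete, here is a minimal sketch of how a set of words can be folded into a trie and then serialized as a pattern. It is not trrex's implementation: the helper names build_trie, trie_to_regex and make_pattern are hypothetical, and the default prefix and suffix used here are an assumption chosen to match whole words only.

```python
import re


def build_trie(words):
    """Insert every word into a nested-dict trie; the key "" marks a word end."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # end-of-word marker
    return root


def trie_to_regex(node):
    """Recursively serialize a trie node as a regex fragment."""
    is_end = "" in node
    branches = [
        re.escape(ch) + trie_to_regex(child)
        for ch, child in sorted(node.items())
        if ch != ""
    ]
    if not branches:
        return ""
    if len(branches) == 1 and not is_end:
        return branches[0]
    body = "(?:" + "|".join(branches) + ")"
    # If a word ends at this node, everything that follows is optional.
    return body + "?" if is_end else body


def make_pattern(words, prefix=r"\b(?:", suffix=r")\b"):
    """Hypothetical helper: wrap the trie regex so it matches whole words."""
    return prefix + trie_to_regex(build_trie(words)) + suffix


pattern = make_pattern(["baby", "bat", "bad"])
print(pattern)  # \b(?:ba(?:by|d|t))\b
print(re.findall(pattern, "The baby was scared by the bad bat."))
# ['baby', 'bad', 'bat']
```

The shared prefix "ba" appears once in the pattern, so the regex engine does not have to retry it for every alternative, which is the intuition behind the speed-up over a plain union.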

Install trrex

Use pip:

pip install trrex

Usage

import trrex as tx
import re

pattern = tx.make(['baby', 'bat', 'bad'])
hits = re.findall(pattern, 'The baby was scared by the bad bat.')
# hits = ['baby', 'bad', 'bat']  (matches are returned in order of appearance)
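The same kind of pattern also covers the replacing and matching operations mentioned above. The sketch below uses only the standard re module with a hand-written pattern of the sort a trie-based builder could produce for these three words; the exact string returned by tx.make may differ, so treat the pattern literal as an assumption.

```python
import re

# Hand-written stand-in for a trie-derived pattern covering
# 'baby', 'bat' and 'bad' (assumption: tx.make output may differ in detail).
pattern = re.compile(r"\b(?:ba(?:by|d|t))\b")

text = "The baby was scared by the bad bat."

# Replacing: mask every keyword.
masked = pattern.sub("***", text)
print(masked)  # The *** was scared by the *** ***.

# Extracting: collect each keyword together with its start offset.
spans = [(m.group(), m.start()) for m in pattern.finditer(text)]
print(spans)  # [('baby', 4), ('bad', 27), ('bat', 31)]

# Matching: test whether a whole string is one of the keywords.
is_keyword = pattern.fullmatch("bat") is not None    # True
not_keyword = pattern.fullmatch("bath") is not None  # False
```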

pandas

import trrex as tx
import pandas as pd

frame = pd.DataFrame({
    "txt": ["The baby", "The bat"]
})
# str.extract requires a capturing group, so pass an explicit prefix and suffix
pattern = tx.make(['baby', 'bat', 'bad'], prefix=r"\b(", suffix=r")\b")
frame["match"] = frame["txt"].str.extract(pattern, expand=False)
hits = frame["match"].tolist()
print(hits)
# hits = ['baby', 'bat']

Why use trrex?

  • trrex builds a more efficient pattern than a simple regex union, so searching (and replacing) strings is about 300 times faster than with a regex union pattern and about 2.5 times faster than the FlashText algorithm. See below for a performance comparison:

[Figure: performance comparison]

  • Plays well with others: it integrates easily with pandas, spaCy, and any regex engine. See the documentation for examples.
  • Pure Python, no other dependencies
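For readers who want to check the performance claim on their own data, the following sketch compares a plain regex union with a prefix-factored pattern using only the standard library. The factored pattern is written by hand as a stand-in for a trie-derived one (an assumption, not the literal output of trrex), and with only three keywords the timing gap will be far smaller than the figures quoted above, which concern much larger keyword sets.

```python
import re
import timeit

words = ["baby", "bat", "bad"]

# Baseline: every word is a separate alternative.
union = re.compile(r"\b(?:" + "|".join(map(re.escape, words)) + r")\b")

# Factored: the shared prefix "ba" is matched once (hand-written stand-in).
factored = re.compile(r"\b(?:ba(?:by|d|t))\b")

text = "The baby was scared by the bad bat. " * 1000

# Both patterns must find exactly the same keywords.
assert union.findall(text) == factored.findall(text)

t_union = timeit.timeit(lambda: union.findall(text), number=20)
t_factored = timeit.timeit(lambda: factored.findall(text), number=20)
print(f"union:    {t_union:.4f} s")
print(f"factored: {t_factored:.4f} s")
```

Swapping in a larger word list and a pattern built by tx.make is the natural next step when measuring a real workload.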

Issues

If you have any issues with this repository, please don't hesitate to raise them. It is actively maintained, and we will do our best to help you.

Acknowledgments

This project is based on the following resources:

Liked the work?

If you've found this repository helpful, why not give it a star? It's an easy way to show your appreciation and support for the project. Plus, it helps others discover it too!
