simplematch
Minimal, super readable string pattern matching for Python.
import simplematch
simplematch.match("He* {planet}!", "Hello World!")
>>> {"planet": "World"}
simplematch.match("It* {temp:float}°C *", "It's -10.2°C outside!")
>>> {"temp": -10.2}
Installation
pip install simplematch
(Or just drop the simplematch.py file into your project.)
Syntax
simplematch has only two syntax elements:
- wildcard *
- capture group {name}
Capture groups can be named ({name}), unnamed ({}) and typed ({name:float}).
The following types are available:
- int
- float
- email
- url
- ipv4
- ipv6
- bitcoin
- ssn (social security number)
- ccard (matches Visa, MasterCard, American Express, Diners Club, Discover, JCB)
For now, only named capture groups can be typed.
Then use one of these functions:
import simplematch
simplematch.match(pattern, string)
simplematch.test(pattern, string)
Or use a Matcher object:
import simplematch as sm
matcher = sm.Matcher(pattern)
matcher.match(string)
matcher.test(string)
matcher.regex
Basic usage
import simplematch as sm
sm.match(
pattern="Invoice_*_{year}_{month}_{day}.pdf",
string="Invoice_RE2321_2021_01_15.pdf")
>>> {"year": "2021", "month": "01", "day": "15"}
sm.test("ABC-{value:int}", "ABC-13")
>>> True
Typed matches
import simplematch as sm
matcher = sm.Matcher("{year:int}-{month:int}: {value:float}")
matcher.match("2021-01: -12.786")
>>> {"year": 2021, "month": 1, "value": -12.786}
matcher.match("2021-AB: Hello")
>>> None
matcher.test("1234-01: 123.123")
>>> True
matcher.regex
>>> '^(?P<year>[+-]?[0-9]+)\\-(?P<month>[+-]?[0-9]+):\\ (?P<value>[+-]?(?:[0-9]*[.])?[0-9]+)$'
matcher.converters
>>> {'year': <class 'int'>, 'month': <class 'int'>, 'value': <class 'float'>}
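The regex and converters printed above are enough to reproduce a typed match by hand. The following is a minimal sketch using only the standard re module; the names REGEX, CONVERTERS and typed_match are made up for illustration and are not part of simplematch, and this is not necessarily how the library combines the two internally.

```python
import re

# Regex and converters copied from the matcher.regex / matcher.converters output above.
REGEX = r"^(?P<year>[+-]?[0-9]+)\-(?P<month>[+-]?[0-9]+):\ (?P<value>[+-]?(?:[0-9]*[.])?[0-9]+)$"
CONVERTERS = {"year": int, "month": int, "value": float}


def typed_match(string):
    """Apply the regex, then convert each named group with its converter."""
    found = re.match(REGEX, string)
    if found is None:
        return None
    return {name: CONVERTERS[name](text) for name, text in found.groupdict().items()}


print(typed_match("2021-01: -12.786"))  # {'year': 2021, 'month': 1, 'value': -12.786}
print(typed_match("2021-AB: Hello"))    # None
```

The second call returns None because "AB" does not satisfy the integer fragment [+-]?[0-9]+, so the regex as a whole fails before any converter runs.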
Register your own types
You can register your own types to be available for the {name:type} matching syntax with the register_type function.
simplematch.register_type(name, regex, converter=str)
- name is the name to use in the matching syntax
- regex is a regular expression to match your type
- converter is a callable to convert a match (str by default)
Example
Register a smiley type to detect smileys (:), :(, :/) and get their moods:
import simplematch as sm
def mood_convert(smiley):
    moods = {
        ":)": "good",
        ":(": "bad",
        ":/": "sceptic",
    }
    return moods.get(smiley, "unknown")
sm.register_type("smiley", r":[\)\(\/]", mood_convert)
sm.match("I'm feeling {mood:smiley} *", "I'm feeling :) today!")
>>> {"mood": "good"}
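Before registering a type it can help to check its regex on its own. The sketch below tests the smiley regex from the example above with the standard re module; is_smiley is a hypothetical helper for illustration, not a simplematch function.

```python
import re

# Regex copied from the register_type call above.
SMILEY = re.compile(r":[\)\(\/]")


def is_smiley(text):
    """Return True if the whole text is one of the three registered smileys."""
    return SMILEY.fullmatch(text) is not None


for candidate in (":)", ":(", ":/", ":D"):
    print(candidate, is_smiley(candidate))
```

Only the first three candidates are accepted. Assuming the converter only ever receives text that matched the regex, the "unknown" fallback in mood_convert would be reached only if the regex were later widened to accept more smileys.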
CLI Command
You can also install simplematch for use as a CLI command, e.g. using pipx.
pipx install simplematch
Usage
usage: simplematch [-h] [--regex] pattern [strings ...]
positional arguments:
  pattern     A matching pattern
  strings     The string to match
options:
  -h, --help  show this help message and exit
  --regex     Show the generated regular expression
Example
Extract a date from a specific file name:
simplematch "Invoice_*_{year}_{month}_{day}.pdf" "Invoice_RE2321_2021_01_15.pdf"
>>> {"year": "2021", "month": "01", "day": "15"}
Background
simplematch aims to fill a gap between parsing with str.split() and regular expressions. It should be as simple as possible, fast and stable.
The simplematch syntax is transpiled to regular expressions under the hood, so matching performance should be comparable to using regular expressions directly.
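To make the transpilation idea concrete, here is a rough sketch of how a pattern with wildcards and capture groups could be turned into a regular expression. It is an illustration under stated assumptions, not simplematch's actual implementation: TYPES, TOKEN, to_regex and this match function are hypothetical names, only the int and float types are covered (their fragments are taken from the matcher.regex output shown earlier), and untyped groups are assumed to behave like a greedy wildcard.

```python
import re

# Hypothetical type table for this sketch: name -> (regex fragment, converter).
# The int and float fragments are copied from the matcher.regex output shown earlier.
TYPES = {
    "int": (r"[+-]?[0-9]+", int),
    "float": (r"[+-]?(?:[0-9]*[.])?[0-9]+", float),
}

# Finds wildcards (*) and capture groups ({}, {name}, {name:type}) in a pattern.
TOKEN = re.compile(r"(\*)|\{(\w*)(?::(\w+))?\}")


def to_regex(pattern):
    """Translate a simplematch-style pattern into (regex, converters)."""
    parts, converters, pos = [], {}, 0
    for token in TOKEN.finditer(pattern):
        # Literal text between tokens is escaped so it matches verbatim.
        parts.append(re.escape(pattern[pos:token.start()]))
        pos = token.end()
        wildcard, name, type_name = token.groups()
        if wildcard:
            parts.append(".*")
        elif name:
            fragment, converter = TYPES[type_name] if type_name else (".*", str)
            parts.append(f"(?P<{name}>{fragment})")
            converters[name] = converter
        else:
            parts.append("(.*)")  # unnamed group; this sketch does not return it
    parts.append(re.escape(pattern[pos:]))
    return "^" + "".join(parts) + "$", converters


def match(pattern, string):
    """Return a dict of converted captures, or None if the string does not match."""
    regex, converters = to_regex(pattern)
    found = re.match(regex, string)
    if found is None:
        return None
    return {name: converters[name](value) for name, value in found.groupdict().items()}


print(match("He* {planet}!", "Hello World!"))                         # {'planet': 'World'}
print(match("It* {temp:float}°C *", "It's -10.2°C outside!"))         # {'temp': -10.2}
```

Since all the work after translation is a single re.match call plus one converter call per capture, the matching cost in a design like this is essentially that of the underlying regular expression.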
I hope you get some good use out of this!
Contributions
Contributions are welcome! Just submit a PR, and consider getting in touch with me via email before making big changes.
License
MIT