You're Invited:Meet the Socket Team at BlackHat and DEF CON in Las Vegas, Aug 4-6.RSVP
Socket
Book a DemoInstallSign in
Socket

regenx

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

regenx

yet another regex based text generation library

0.0.2
Source
pipPyPI
Maintainers
1

regenx

regenx is yet another regex based text generation library

PyPI - Version PyPI - Python Version

Table of Contents

Installation

pip install regenx

Features

This library currently supports generating all the possible strings that would match the given regex.

Supported syntax

. character (it will include all ascii printable characters) \ escape sequences all the special ones are ascii only the negative (uppercase) special escapes will include all ascii printable characters minus the specified ones

  • \d digits, same as [0-9]
  • \D non digits, same as [^0-9]
  • \s whitespace, same as [ \t\n\r\f\v]
  • \S non whitespace, same as [^ \t\n\r\f\v]
  • \w same as [a-zA-Z0-9_]
  • \W same as [^a-zA-Z0-9_]
  • \a ascii bell (BEL)
  • \b ascii backspace (BS)
  • \f ascii formfeed (FF)
  • \n ascii linefeed (LF)
  • \r ascii carriage return (CR)
  • \t ascii horizontal tab (TAB)
  • \v ascii vertical tab (VT)

numerical escape sequences:
the positive integer number given will be converted to the associated character

  • \o octal escape sequence, note: must be 3 digits long, this is different to the standard \<octal_number> to distinguish it from back references
  • \x 8bit hexadecimal escape sequence, note: must be 2 digits long
  • \u 16bit hexadecimal escape sequence, note: must be 4 digits long
  • \U 32bit hexadecimal escape sequence, note: must be 8 digits long

all other escaped characters will be treated as themselves

[] sets

  • the ^ modifier will include all ascii printable characters minus the specified ones
  • the - range modifier will include all characters with integer values between and including the ranges specified

() groups (modifiers are not yet supported)

| or sequences, inside or outside groups

Limitations

some features of regex do not make sense and will be ignored:

  • boundary assertion character (^ and $)
  • all the beginning or end of word escape sequences
  • greedy and non greedy modifiers for counts (+ and ?),
    maybe in future they could be used to set the generation order

limits by design, they may be supported if there is a reason to

  • unlimited counts such as * +
    this is because i feel that generating an infinitely long string is mostly useless and it would complicate the parsing process more abstract and less easily serializable,
    you are welcome to convince me otherwise.
  • group modifiers
    is there a point for those for our scope?

See also

inspired by: janstarke/rexgen
friends: Buba98/regex_enumerator (yes we know each other)

License

regenx is distributed under the terms of the MIT license.

Keywords

data-generation

FAQs

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts