nz.co.gregs:regexi

A fluent API to generate and use regular expressions

Version 4.7 (Maven)

Regexi

A regular expression library for Java with Named Capture support

Why?

Regular expressions are very powerful but don't scale well.

What does that mean?

The power of the regex language trips up developers very quickly. For instance, everyone can search for a number using a regex, and usually they'll get it wrong. The simple and obvious pattern, /[0-9]*/, will incorrectly match inside ABC0345 but fail to pick up 0.89 as a single number.
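To make that failure concrete, here is a minimal sketch using only the standard java.util.regex classes (not Regexi). The class and method names are made up for illustration. It lists what the naive pattern actually picks out of each input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
final class NaiveNumberDemo {

    // The "simple and obvious" pattern from the text.
    static final Pattern NAIVE = Pattern.compile("[0-9]*");

    // Collects every non-empty match of NAIVE in the input, in order.
    // Empty matches are skipped: '*' allows zero digits, so the pattern
    // also "matches" at every position that has no digit at all.
    static List<String> nonEmptyMatches(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = NAIVE.matcher(input);
        while (matcher.find()) {
            if (!matcher.group().isEmpty()) {
                found.add(matcher.group());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // The digits glued to "ABC" are reported as if they were a number.
        System.out.println(nonEmptyMatches("ABC0345")); // prints [0345]
        // The decimal is split in two because '.' is not in the class.
        System.out.println(nonEmptyMatches("0.89"));    // prints [0, 89]
    }
}
```

Note that the pattern also succeeds on input with no digits at all, because a zero-length match satisfies `*`.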

A better solution is /(([-+]?\b([1-9]+\d*|0+(?!\d)))((\.){1}(\d+))?){1}/ but 99% of developers will see that as complete gibberish. What is it doing, and why? No one knows; I certainly didn't when I realised I needed to extend an already large expression.
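For reference, the longer expression can be tried directly with java.util.regex. This is only a sketch: the class and method names are hypothetical, and the doubled backslashes are Java string escaping, not part of the regex itself.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
final class BetterNumberDemo {

    // The longer pattern from the text, written as a Java string literal.
    static final Pattern NUMBER = Pattern.compile(
            "(([-+]?\\b([1-9]+\\d*|0+(?!\\d)))((\\.){1}(\\d+))?){1}");

    // Returns the first number found anywhere in the input, if any.
    static Optional<String> firstNumber(String input) {
        Matcher matcher = NUMBER.matcher(input);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // The word boundary and the optional fraction keep 0.89 in one piece.
        System.out.println(firstNumber("it should find 0.89 correctly"));
        // No word boundary exists between 'C' and '0', so nothing matches.
        System.out.println(firstNumber("and don't find ABC0345 because it's not a number"));
    }
}
```

It behaves as the text claims, but the string literal is no easier to read or extend than the original.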

Regexi solves this by implementing all the parts of the regular expression language as part of a chaining class. So instead we would write:

  Regex.startingAnywhere()
  .anyCharacterIn("-+").onceOrNotAtAll()
  .wordBoundary()
  .beginOrGroup()
  .anyCharacterBetween('1', '9').atLeastOnce()
  .digit().zeroOrMore()
  .or().literal('0').oneOrMore().notFollowedBy(Regex.startingAnywhere().digit())
  .endOrGroup()
  .once()
  .toRegex();
  

Or, even better:

  Regex.startingAnywhere().number().once().toRegex();

Using that in actual Java:

Regex pattern = Regex
    .startingAnywhere()
    .number().once()
    .toRegex();

final String input = "it should find 0.89 correctly";
if (pattern.matchesWithinString(input)) {
    System.out.println("Found it successfully");
    System.out.println("The number we found: " + pattern.getAllMatches(input).get(0).getEntireMatch());
}

if (!pattern.matchesWithinString("and don't find ABC0345 because it's not a number")) {
    System.out.println("Didn't find it successfully");
}

Seems nice but I've spotted 5 errors in your regex already.

That's cool, please raise an issue :)

Nothing is ever perfect, so I've kept every part of the regex language accessible. You can use PartialRegex to create your own standard patterns and combine them as required.

final PartialRegex oneNumber = Regex.startingAnywhere().number().once();

Anything else I should know about?

Regexi also provides easy access to the more complicated features of Java's pattern matching. In particular:

  • Named captures are hardly mentioned even in Java's documentation but are easily used and retrieved with Regexi:

    var regex = Regex.startingFromTheBeginning()
          .beginNamedCapture("interval").literalCaseInsensitive("interval").once().endNamedCapture()
          .space().once()
          .beginNamedCapture("value").numberIncludingScientificNotation().once().endNamedCapture()
          .space().once()
          .beginNamedCapture("unit").word().endNamedCapture()
          .endOfInput()
          .toRegex();
    var captured = regex.getAllMatches(input).get(0).getAllNamedCaptures();
    String value = captured.get("value");
    
  • RegexValueFinder streamlines named captures even further to return one value quickly

    RegexValueFinder regex = Regex.startingFromTheBeginning()
          .beginNamedCapture("interval").literalCaseInsensitive("interval").once().endNamedCapture()
          .space().once()
          .beginNamedCapture("value").numberIncludingScientificNotation().once().endNamedCapture()
          .space().once()
          .beginNamedCapture("unit").word().endNamedCapture()
          .endOfInput()
          .returnValueFor("value");
          
    String value = regex.getValueFor(input).orElse("");
    
  • RegexSplitter chops your input into chunks

    List<String> words = Regex.empty().literal(" ").toSplitter().splitToList(input);
    
  • RegexReplacement turns your regular expression into a find/replace using regex:

    String escapeBackslashes = Regex.empty().backslash().replaceWith().backslash().backslash().replaceAll(s);
    
  • There are methods for tricky characters like backslash to help you avoid using a regex instruction when you meant a letter

  • Similarly the literal(string) method automatically escapes regex instructions to avoid problems
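
For comparison, here is a rough sketch of the same two ideas, named captures and literal escaping, using only java.util.regex. The pattern is a hand-written approximation of what the fluent chain above describes. It is not the expression Regexi generates, and the class, method names and sample inputs are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
final class NamedCaptureDemo {

    // Approximation of the "interval value unit" chain with named groups.
    // Assumption: a single literal space separates the three parts.
    static final Pattern INTERVAL = Pattern.compile(
            "^(?<interval>(?i:interval))"
            + " "
            + "(?<value>[-+]?\\d+(?:\\.\\d+)?(?:[eE][-+]?\\d+)?)"
            + " "
            + "(?<unit>\\w+)$");

    // Returns the text captured by the group named "value", if the input matches.
    static Optional<String> extractValue(String input) {
        Matcher matcher = INTERVAL.matcher(input);
        return matcher.find() ? Optional.of(matcher.group("value")) : Optional.empty();
    }

    // Pattern.quote escapes regex instructions so the text is matched literally.
    static boolean containsLiteral(String input, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(extractValue("INTERVAL 1.5e3 days"));
        // Quoted, the '.' only matches a real dot, so "1x50" is rejected.
        System.out.println(containsLiteral("price: 1x50", "1.50"));
    }
}
```

The standard library can do both jobs, but the group names and escaping rules are buried inside the pattern string, which is the readability problem the fluent methods are meant to avoid.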

Pronunciation

However you pronounce it is fine; it varies from minute to minute for me.

I do recommend "Rej X I" though.

Package last updated on 20 Nov 2023
