Regular expressions meet the Turkish language. You can use character classes and metacharacters in your regexps without worrying about Turkish support.
This is implemented as an "external DSL" in Ruby; that is (like SQL for example), a "program" in a Ruby string is passed into some kind of parser/interpreter method. In this case, it is possible to use the result "as is" or to convert to an ordinary Ruby regular expression. Though this mini-language was conceived and implemented "for Ruby, using Ruby," in principle there is no reason it might not also be implemented in other languages such as Python or Perl. Development on this project began in mid-July 2013. As such, it is still an immature project. Syntax and semantics may change. Feel free to offer comments or suggestions.
A simple text conversion utility using regular expressions for searching and replacing multiple strings across multiple files, for conversion projects.
'Regular expression syntax is hard to remember. Easyregexp wants to make your life easier.'
Espressione is a community-driven Ruby gem of common regular expression patterns.
== ICU4R - ICU Unicode bindings for Ruby

ICU4R is an attempt to provide better Unicode support for Ruby, where it lacks for a long time.

Current code is mostly rewritten string.c from Ruby 1.8.3.

ICU4R is Ruby C-extension binding for ICU library[1] and provides following classes and functionality:

* UString:
  - String-like class with internal UTF16 storage;
  - UCA rules for UString comparisons (<=>, casecmp);
  - encoding(codepage) conversion;
  - Unicode normalization;
  - transliteration, also rule-based;

  Bunch of locale-sensitive functions:
  - upcase/downcase;
  - string collation;
  - string search;
  - iterators over text line/word/char/sentence breaks;
  - message formatting (number/currency/string/time);
  - date and number parsing.

* URegexp - unicode regular expressions.
* UResourceBundle - access to resource bundles, including ICU locale data.
* UCalendar - date manipulation and timezone info.
* UConverter - codepage conversions API
* UCollator - locale-sensitive string comparison

== Install and usage

   > ruby extconf.rb
   > make && make check
   > make install

Now, in your scripts just require 'icu4r'.

To create RDoc, run

   > sh tools/doc.sh

== Requirements

To build and use ICU4R you will need GCC and ICU v3.4 libraries[2].

== Differences from Ruby String and Regexp classes

=== UString vs String

1. UString substring/index methods use UTF16 codeunit indexes, not code points.

2. UString supports most methods from String class. Missing methods are:

      capitalize, capitalize!, swapcase, swapcase!
      %, center, ljust, rjust
      chomp, chomp!, chop, chop!
      count, delete, delete!, squeeze, squeeze!, tr, tr!, tr_s, tr_s!
      crypt, intern, sum, unpack
      dump, each_byte, each_line
      hex, oct, to_i, to_sym
      reverse, reverse!
      succ, succ!, next, next!, upto

3. Instead of String#% method, UString#format is provided. See FORMATTING for short reference.

4. UStrings can be created via String.to_u(encoding='utf8') or global u(str,[encoding='utf8']) calls. Note that +encoding+ parameter must be value of String class.

5. There's difference between character grapheme, codepoint and codeunit. See UNICODE reports for gory details, but in short: locale dependent notion of character can be presented using more than one codepoint - base letter and combining (accents) (also possible more than one!), and each codepoint can require more than one codeunit to store (for UTF8 codeunit size is 8bit, though some codepoints require up to 4bytes). So, UString has normalization and locale dependent break iterators.

6. Currently UString doesn't include Enumerable module.

7. UString index/[] methods which accept URegexp, throw exception if Regexp passed.

8. UString#<=>, UString#casecmp use UCA rules.

=== URegexp

UString uses ICU regexp library. Pattern syntax is described in [./docs/UNICODE_REGEXPS] and ICU docs.

There are some differences between processing in Ruby Regexp and URegexp:

1. When UString#sub, UString#gsub are called with block, special vars ($~, $&, $1, ...) aren't set, as their values are processed through deep ruby core code. Instead, block receives UMatch object, which is essentially immutable array of matching groups:

      "test".u.gsub(ure("(e)(.)")) do |match|
         puts match[0]  # => 'es' <--> $&
         puts match[1]  # => 'e'  <--> $1
         puts match[2]  # => 's'  <--> $2
      end

2. In URegexp search pattern backreferences are in form \n (\1, \2, ...), in replacement string - in form $1, $2, ...

   NOTE: URegexp considers char to be a digit NOT ONLY ASCII (0x0030-0x0039), but any Unicode char, which has property Decimal digit number (Nd), e.g.:

      a = [?$, 0x1D7D9].pack("U*").u * 2
      puts a.inspect_names
      <U000024>DOLLAR SIGN
      <U01D7D9>MATHEMATICAL DOUBLE-STRUCK DIGIT ONE
      <U000024>DOLLAR SIGN
      <U01D7D9>MATHEMATICAL DOUBLE-STRUCK DIGIT ONE
      puts "abracadabra".u.gsub(/(b)/.U, a)
      abbracadabbra

3. One can create URegexp using global Kernel#ure function, Regexp#U, Regexp#to_u, or from UString using URegexp.new, e.g:

      /pattern/.U =~ "string".u

4. There are differences about Regexp and URegexp multiline matching options:

      t = "text\ntest"
      # ^,$ handling : URegexp multiline <-> Ruby default
      t.u =~ ure('^\w+$', URegexp::MULTILINE)
      => #<UMatch:0xf6f7de04 @ranges=[0..3], @cg=[\u0074\u0065\u0078\u0074]>
      t =~ /^\w+$/
      => 0
      # . matches \n : URegexp DOTALL <-> /m
      t.u =~ ure('.+test', URegexp::DOTALL)
      => #<UMatch:0xf6fa4d88 ...
      t.u =~ /.+test/m

5. UMatch.range(idx) returns range for capturing group idx. This range is in codeunits.

=== References

1. ICU Official Homepage http://ibm.com/software/globalization/icu/
2. ICU downloads http://ibm.com/software/globalization/icu/downloads.jsp
3. ICU Home Page http://icu.sf.net
4. Unicode Home Page http://www.unicode.org

==== BUGS, DOCS, TO DO

The code is slow and inefficient yet, is still highly experimental, so can have many security and memory leaks, bugs, inconsistent documentation, incomplete test suite. Use it at your own risk.

Bug reports and feature requests are welcome :)

=== Copying

This extension module is copyrighted free software by Nikolai Lugovoi.

You can redistribute it and/or modify it under the terms of MIT License.

Nikolai Lugovoi <meadow.nnick@gmail.com>
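The grapheme / codepoint / codeunit distinction drawn in the ICU4R notes can be checked with core Ruby alone. The sketch below does not use ICU4R; `unit_counts` is a hypothetical helper introduced only for illustration, and it assumes a Ruby recent enough to provide `String#grapheme_clusters`.

```ruby
# Core Ruby only; this is not ICU4R's API.
# unit_counts is a hypothetical helper for illustration.
def unit_counts(str)
  utf8 = str.encode("UTF-8")
  {
    graphemes:   utf8.grapheme_clusters.length,        # user-perceived characters
    codepoints:  utf8.codepoints.length,               # Unicode code points
    utf16_units: utf8.encode("UTF-16LE").bytesize / 2, # 16-bit code units
    utf8_units:  utf8.bytesize                         # 8-bit code units
  }
end

accented = "e\u0301"               # base letter plus a combining acute accent
digit    = [0x1D7D9].pack("U*")    # MATHEMATICAL DOUBLE-STRUCK DIGIT ONE

p unit_counts(accented)
p unit_counts(digit)
```

The accented letter is one grapheme built from two codepoints, while the double-struck digit is one codepoint that needs two UTF-16 codeunits. That second case is why indexes expressed in UTF-16 codeunits can differ from codepoint indexes.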
Regular expressions bundled into a single place, so you don't have to think anymore about what email regexp to use.
cnuregexp allows tags to be placed inside a regex which function as labels for the matches. The matches within the MatchData object can then be accessed like a hash with the tag name as the key. cnuregexp also provides a greedy match which will return an array of all matches rather than just the first match. cnuregexp can also extract various data from an XML tag with the Regexp.xml_tag method. It uses Regexps to get the tag name, the attributes and their values, the tag content, and any other relevant data from an XML string. Lastly, cnuregexp allows commonly used regular expressions to be stored in a config file (lib/cnuregexp_config.yml) and accessed with Regexp.regular_expression_name notation, e.g. Regexp.ssn, Regexp.email_address. cnuregexp comes preloaded with a few common regular expressions which are located in lib/cnuregexp_config.yml.
Extends the Regexp class with a concat method, which lets you concatenate any number of regular expression objects together.
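One way such a concatenation could work is sketched below. This is not the gem's code: `concat_regexps` is a hypothetical standalone helper, written as a plain method rather than a `Regexp` extension so that it does not modify a core class.

```ruby
# Hypothetical helper, not the gem's API: joins patterns in the order given.
def concat_regexps(*regexps)
  # Regexp#to_s wraps each source in a (?flags:...) group, so per-pattern
  # options such as /i survive the join.
  Regexp.new(regexps.map(&:to_s).join)
end

combined = concat_regexps(/foo/, /bar/i, /\d+/)

puts combined.match?("fooBAR42")  # the /i on the middle part still applies
puts combined.match?("FOObar42")  # the first part stays case-sensitive
```

Joining with `to_s` rather than `source` is the design choice that matters here: `source` would drop each pattern's own options.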
Regextest generates sample strings that match a given regular expression. Unlike similar tools, it recognizes anchors, character classes and other advanced notation of Ruby regexes. Target users are programmers or students debugging or learning regular expressions. You can use the [sample application](http://goo.gl/5miiF4) without installation.
This is a parsing library and language specifier. It uses packrat parsing, as opposed to LL(k) or LR(k) parsing. Packrat parsing uses memoization in a recursive descent parser. By storing the production results from each significant point, it speeds up the parse. PEG is a formalized grammar specification optimized for packrat parsing. Peggy also allows users to specify their grammar in pure Ruby as methods or using a Builder. The default Peggy grammar is a variation on PEG, with support for full regular expressions and for simplified grammars which automatically ignore a set of productions.
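The memoization idea behind packrat parsing can be sketched in a few lines of plain Ruby. This is not Peggy's API: the `PackratParser` class and its tiny grammar are hypothetical and exist only to show results being cached per (rule, position) pair.

```ruby
# Minimal packrat sketch (hypothetical class, not Peggy's API).
# Grammar, in PEG notation:
#   expr <- term '+' expr / term
#   term <- [0-9]+ / '(' expr ')'
class PackratParser
  def initialize(input)
    @input = input
    @memo  = {} # [rule, position] => [value, next_position] or nil
  end

  # Returns the computed sum, or nil if the whole input does not parse.
  def parse
    result = apply(:expr, 0)
    result && result[1] == @input.length ? result[0] : nil
  end

  private

  # Evaluate a rule at a position once; later requests reuse the stored result.
  def apply(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)

    @memo[key] = send(rule, pos)
  end

  def expr(pos)
    left = apply(:term, pos)
    if left && @input[left[1]] == "+"
      right = apply(:expr, left[1] + 1)
      return [left[0] + right[0], right[1]] if right
    end
    apply(:term, pos) # second alternative: answered from the memo table
  end

  def term(pos)
    if (m = /\G\d+/.match(@input, pos))
      return [m[0].to_i, m.end(0)]
    end
    if @input[pos] == "("
      inner = apply(:expr, pos + 1)
      return [inner[0], inner[1] + 1] if inner && @input[inner[1]] == ")"
    end
    nil
  end
end

p PackratParser.new("1+(2+3)+4").parse
p PackratParser.new("1+").parse
```

When the first alternative of `expr` fails after `term` has already succeeded, the fallback re-requests `term` at the same position and gets the cached result instead of re-scanning the input.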
A regular expression matching Gitmoji (a subset of Unicode Emoji) symbols
Use regular expressions with 'regexhelper.email_regex' etc.
Migemo is a tool for Japanese incremental search. It builds regular expressions for Japanese characters from alphabetic input and optimizes them.
Ivy (http://www.tls.cena.fr/products/ivy/) is a simple protocol and a set of open-source (LGPL) libraries and programs that allow applications to broadcast information through text messages, with a subscription mechanism based on regular expressions.
A Ruby script that allows you to rename a group of files via regular expressions
Regular expressions bundled into a single place, so you don't have to think anymore about what email regexp to use.
A gem providing pre-made and tested typical regular expressions for applications.
TTYCoke enables coloring on ANSI terminals based on regular expressions.
regular expression library
Rrrex is a new syntax for regular expressions. Less compact, but human-readable. By regular humans.
A simple lookup sieve using regular expressions
Fluent output plugin for reforming a record using multiple named capture regular expressions
Generates scraped layouts, allowing content to be inserted via regular expressions.
Believe it or not, this is an ECMA-262 edition 5.1-compatible regular expression parser and evaluator, written in Ruby.
dirmangle is a tool for copying, moving, or creating symlinks to sets of files using regular expressions
Filter and rename multiple files in a directory or subdirectories with regular expressions.
Overrides the regular expression used to parse gem version strings to allow for prerelease version strings in older versions of RubyGems
PIC is a pattern matching library, like regular expressions, based on COBOL edited pictures. Its advantage over regular expressions is its more concise syntax, which is geared specifically toward validating user input.
Bindings for libfa, a library to manipulate finite automata. Automata are constructed from regular expressions, using extended POSIX syntax, and make it possible to compute interesting things like the intersection of two regular expressions (all strings matched by both), or the complement of a regular expression (all strings _not_ matched by the regular expression). It is possible to convert from regular expression (compile) to finite automaton, and from finite automaton to regular expression (as_regexp)
Abstraction of common regular expressions (regex) in Ruby
Actively monitors a log file for new entries which can trigger an event using a regular expression
A gem for parsing bbcode that doesn't rely on regular expressions
Extends the Regexp class with a concat method, which lets you concatenate any number of regular expression objects together.
Compact and flexible syntax to generate regular expressions
Simple, Regular Expression powered template engine. Intelligently handles HTML input.
kleene is a library for building regular expression recognition automata: NFAs, DFAs, and some specialty structures.
This Jekyll plugin provides 3 filters that return portions of a multiline string: from, to and until. Regular expression is used to specify matches; the simplest regular expression is a string.
Enhanced regular expressions allow for a boolean inverted/not inverted association to be attached to any regular expression, as well as easy modification of Regexp options (e, i, m).
A multi-line search and replace utility that uses Ruby regular expressions for searching and allows back references to captured groups from the pattern to appear in the replacement text.
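The core of such a replacement can be shown with Ruby's own `String#gsub`. This is a minimal sketch using only core Ruby, not the utility's command-line interface; the sample text and pattern are made up for illustration.

```ruby
text = <<~TEXT
  BEGIN alpha
  body one
  END
  BEGIN beta
  body two
  END
TEXT

# The /m option lets "." match newlines, so one match can span several lines.
pattern = /BEGIN (\w+)\n(.*?)\nEND/m

# \1 and \2 in the replacement refer back to the captured groups.
result = text.gsub(pattern, "<\\1>\n\\2\n</\\1>")

puts result
```

The non-greedy `.*?` keeps each match inside a single BEGIN/END pair; a greedy `.*` under `/m` would swallow everything from the first BEGIN to the last END.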
`acoc` is a regular expression based colour formatter for programs that display output on the command-line. It works as a wrapper around the target program, executing it and capturing the stdout stream. Optionally, stderr can be redirected to stdout, so that it, too, can be manipulated. `acoc` then applies matching rules to patterns in the output and applies colour sets to those matches.
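The rule-matching step can be approximated as follows. This sketch is not `acoc`'s configuration format or code: the `RULES` table and the `colourise` helper are hypothetical, and it covers only the colouring of already-captured lines, not the wrapping of a target program.

```ruby
# Hypothetical rule table: pattern => ANSI SGR colour code (31 red, 33 yellow).
RULES = {
  /\bERROR\b/ => 31,
  /\bWARN\b/  => 33
}.freeze

# Wrap every match of every rule in an ANSI colour sequence.
def colourise(line, rules = RULES)
  rules.reduce(line) do |text, (pattern, code)|
    text.gsub(pattern) { |match| "\e[#{code}m#{match}\e[0m" }
  end
end

puts colourise("ERROR: disk full")
puts colourise("WARN: retrying, last ERROR ignored")
```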
TOY Regular Expression Engine in Pure Ruby
This gem is for beautifying Ruby code. It uses traditional regular expression parsing.
Wonderboom is a Ruby library for extracting information from unstructured documents via regular expressions.
Module providing a method to compile regular expressions from templates.
Use regular expressions to route missing methods
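A rough idea of how regex-based routing of missing methods might look is sketched below using Ruby's standard `method_missing` hook. The `RecordFinder` class and its `ROUTES` table are hypothetical and are not taken from the gem.

```ruby
# Hypothetical example class, not the gem's API.
class RecordFinder
  # Pattern => name of the private handler that receives the captures.
  ROUTES = { /\Afind_by_(\w+)\z/ => :find_by }.freeze

  def initialize(records)
    @records = records
  end

  # Dispatch unknown method names to the first route whose pattern matches.
  def method_missing(name, *args, &block)
    ROUTES.each do |pattern, handler|
      match = pattern.match(name.to_s)
      return send(handler, *match.captures, *args) if match
    end
    super
  end

  # Keep respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(name, include_private = false)
    ROUTES.keys.any? { |pattern| pattern.match?(name.to_s) } || super
  end

  private

  def find_by(field, value)
    @records.find { |record| record[field.to_sym] == value }
  end
end

finder = RecordFinder.new([{ name: "ann", age: 30 }, { name: "bob", age: 41 }])
p finder.find_by_name("bob")
p finder.respond_to?(:find_by_age)
```

Names that match no route fall through to `super`, so they still raise `NoMethodError` as usual.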
A regular expression generator with a neat DSL
Test your Regular Expressions locally
SimpleRegex provides a flexible syntax for constructing regular expressions by chaining Ruby method calls instead of deciphering cryptic syntax.
Library for generating random strings from regular expressions.