What is regenerate-unicode-properties?
The regenerate-unicode-properties package is a tool for generating JavaScript regular expressions based on Unicode properties. It is used to create sets of Unicode symbols that match certain criteria, such as belonging to a particular category, script, or block. This can be useful for text processing tasks that require matching characters with specific Unicode attributes.
What are regenerate-unicode-properties's main functionalities?
Generating regex sets for Unicode categories
This feature allows you to generate regular expressions that match any character from a specific Unicode category, such as letters (category 'L').
const regenerate = require('regenerate-unicode-properties');
const regex = regenerate().addCategory('L').toRegExp();
// regex will match any Unicode letter
Generating regex sets for Unicode scripts
This feature enables the creation of regular expressions that match characters from a specific Unicode script, such as Hiragana.
const regenerate = require('regenerate-unicode-properties');
const regex = regenerate().addScript('Hiragana').toRegExp();
// regex will match any character in the Hiragana script
Generating regex sets for Unicode blocks
With this feature, you can generate regular expressions that match characters within a specific Unicode block, like the Basic Latin block.
const regenerate = require('regenerate-unicode-properties');
const regex = regenerate().addBlock('Basic_Latin').toRegExp();
// regex will match any character in the Basic Latin block
Other packages similar to regenerate-unicode-properties
unicode-regex
The unicode-regex package allows you to create regular expressions based on Unicode properties, similar to regenerate-unicode-properties. It provides a higher-level API for composing regular expressions using Unicode property escapes.
xregexp
XRegExp is an extended JavaScript regex library that includes support for additional syntax and features, such as named capture groups and Unicode property escapes. It offers similar functionality for matching Unicode properties but also includes a wide range of other enhancements to regular expressions.
regexpu-core
regexpu-core is a library that compiles Unicode-aware regular expressions to ES5. It transforms Unicode property escapes and other Unicode-related regex features to work in older environments. It provides similar Unicode property matching capabilities but focuses on compatibility with non-ES2015 environments.
regenerate-unicode-properties
regenerate-unicode-properties is a collection of Regenerate sets for various Unicode properties.
Installation
To use regenerate-unicode-properties programmatically, install it as a dependency via npm:
$ npm install regenerate-unicode-properties
Usage
To get a map of supported properties and their values:
const properties = require('regenerate-unicode-properties');
To get a specific Regenerate set:
const Lu = require('regenerate-unicode-properties/General_Category/Uppercase_Letter.js');
const Greek = require('regenerate-unicode-properties/Script_Extensions/Greek.js');
To get the Unicode version the data was based on:
const unicodeVersion = require('regenerate-unicode-properties/unicode-version.js');
Author
License
regenerate-unicode-properties is available under the MIT license.