characterset

Dependencies: 0
Maintainers: 1
Versions: 9
A library for working with Unicode character sets


Weekly downloads: 881 (increased by 8.9%)
Install size: 37.8 kB

CharacterSet

CharacterSet is a library for creating and manipulating Unicode character sets in JavaScript. Its main purpose is to help in building regular expressions for validation and matching. It fully supports all Unicode characters and correctly handles surrogate pairs in JavaScript strings and regular expressions.
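To make the surrogate pair issue concrete, here is a minimal sketch in plain JavaScript that does not use CharacterSet at all. It only shows why code points above U+FFFF need special care: a JavaScript string stores them as two UTF-16 code units.

```javascript
// U+1F600 lies outside the Basic Multilingual Plane, so a JavaScript string
// stores it as two UTF-16 code units (a surrogate pair).
const face = String.fromCodePoint(0x1F600);

// .length counts UTF-16 code units; string iteration walks code points.
const codeUnits = face.length;
const codePoints = Array.from(face).length;

// charCodeAt sees only the high surrogate; codePointAt decodes the whole pair.
const highSurrogate = face.charCodeAt(0);
const decoded = face.codePointAt(0);

console.log(codeUnits, codePoints);                            // 2 1
console.log(highSurrogate.toString(16), decoded.toString(16)); // d83d 1f600
```

A character set that stored code units instead of code points would treat this one character as two unrelated entries.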

Installation

If you are using Node.js, you can install it with npm:

$ npm install characterset

If you want to use CharacterSet in the browser, use the global CharacterSet constructor or include CharacterSet as an AMD module.

API

The constructor takes a single input value, which can be a number (a code point), a string, or a range. A range is an array whose entries are either single code points or [start, end] pairs; both ends of a pair are included.

// Creates a character set with a single code point for [97]
var cs = new CharacterSet(97);

// Creates a character set for the code points [97, 98, 99]
var cs = new CharacterSet('abc');

// Creates a character set for the code points [97, 98, 99]
var cs = new CharacterSet([97, 98, 99]);

// Creates a character set for the code points [97, 98, 99] using a range
var cs = new CharacterSet([[97, 99]]);

// Combines pairs and numbers in ranges for [48, 97, 98, 99]
var cs = new CharacterSet([48, [97, 99]]);
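The range notation can be read as shorthand for a flat list of code points. The sketch below illustrates that reading with a hypothetical helper, `expandRange`, which is not part of CharacterSet and is not taken from its source. It assumes pairs are inclusive, as the `[[97, 99]]` example suggests.

```javascript
// Hypothetical helper (not part of CharacterSet): expands a range, i.e. an
// array of code points and inclusive [start, end] pairs, into a sorted array
// of unique code points.
function expandRange(range) {
  const codePoints = new Set();
  for (const entry of range) {
    if (Array.isArray(entry)) {
      const [start, end] = entry;
      for (let cp = start; cp <= end; cp++) {
        codePoints.add(cp);
      }
    } else {
      codePoints.add(entry);
    }
  }
  return Array.from(codePoints).sort((a, b) => a - b);
}

// A string input corresponds to the code points of its characters.
const fromString = Array.from('abc', (ch) => ch.codePointAt(0));

console.log(expandRange([[97, 99]]));     // [ 97, 98, 99 ]
console.log(expandRange([48, [97, 99]])); // [ 48, 97, 98, 99 ]
console.log(fromString);                  // [ 97, 98, 99 ]
```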

Alternatively, use the parseUnicodeRange method to create a CharacterSet instance from a comma-delimited Unicode range string. The values in the string are hexadecimal.

// Creates a character set for the code points [34, 35]
var cs = CharacterSet.parseUnicodeRange('u+23,u+22');

// Creates a character set for the code points [34, 35, 36, 37]
var cs = CharacterSet.parseUnicodeRange('u+22-25');
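The two examples above can be checked by hand with a small sketch of the notation. The function `parseRangeString` below is hypothetical and is not the library's parser; it handles only the two forms shown here (a single `u+XX` value and a `u+XX-YY` interval) and assumes intervals are inclusive.

```javascript
// Hypothetical parser (not CharacterSet's implementation): turns a
// comma-delimited Unicode range string into a sorted array of code points.
function parseRangeString(input) {
  const codePoints = new Set();
  for (const part of input.split(',')) {
    // Matches "u+22" or "u+22-25"; the digits are hexadecimal.
    const match = /^u\+([0-9a-f]+)(?:-([0-9a-f]+))?$/i.exec(part.trim());
    if (!match) {
      throw new SyntaxError('Invalid Unicode range: ' + part);
    }
    const start = parseInt(match[1], 16);
    const end = match[2] === undefined ? start : parseInt(match[2], 16);
    for (let cp = start; cp <= end; cp++) {
      codePoints.add(cp);
    }
  }
  return Array.from(codePoints).sort((a, b) => a - b);
}

console.log(parseRangeString('u+23,u+22')); // [ 34, 35 ]
console.log(parseRangeString('u+22-25'));   // [ 34, 35, 36, 37 ]
```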

Once you have an instance of CharacterSet you can use the following methods on it:

getSize()
Returns the number of code points in this set.
toArray()
Returns all code points in this set as a sorted array.
toRange()
Returns all code points in this set as a range (i.e. compressed).
isEmpty()
Returns true if this set is empty.
add(codePoint, ...)
Adds the given code point(s) to this set.
remove(codePoint, ...)
Removes the given code point(s) from this set.
contains(codePoint)
Returns true if the given code point is in this set.
equals(other)
Returns true if the `other` set is equal to this set.
union(other)
Returns a new character set from the combined code points from `other` and this character set.
intersect(other)
Returns a new character set containing only the code points `other` and this character set have in common.
difference(other)
Returns a new character set containing only the code points from this that are not in `other`.
subset(other)
Returns true if this character set is a subset of `other`.
toRegExp()
Returns a RegExp matching the code points in this character set.
toString()
Returns a string representation of this character set.
toHexString()
Returns a hex string representation of this character set.
toHexRangeString()
Returns a hex string representation with ranges of this character set.
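The semantics of the set operations, the compressed range form, and the regular expression output can be sketched in plain JavaScript on sorted arrays of code points. All helper names below are hypothetical and none of this is CharacterSet's implementation. In particular, the regular expression is built with the ES2015 `u` flag, which is an assumption of this sketch rather than the library's documented approach, and `compress` always emits [start, end] pairs.

```javascript
// Hypothetical helpers operating on sorted arrays of unique code points.
function union(a, b) {
  return Array.from(new Set([...a, ...b])).sort((x, y) => x - y);
}

function intersect(a, b) {
  const inB = new Set(b);
  return a.filter((cp) => inB.has(cp));
}

function difference(a, b) {
  const inB = new Set(b);
  return a.filter((cp) => !inB.has(cp));
}

// Collapses runs of consecutive code points into inclusive [start, end] pairs.
function compress(sorted) {
  const ranges = [];
  for (const cp of sorted) {
    const last = ranges[ranges.length - 1];
    if (last && cp === last[1] + 1) {
      last[1] = cp; // extend the current run
    } else {
      ranges.push([cp, cp]); // start a new run
    }
  }
  return ranges;
}

// Builds a character class from the compressed ranges. The `u` flag makes the
// engine match whole code points instead of individual UTF-16 code units.
function toUnicodeRegExp(sorted) {
  const escape = (cp) => '\\u{' + cp.toString(16) + '}';
  const body = compress(sorted)
    .map(([start, end]) => (start === end ? escape(start) : escape(start) + '-' + escape(end)))
    .join('');
  return new RegExp('[' + body + ']', 'u');
}

const lower = [97, 98, 99];              // a, b, c
const mixed = [98, 99, 100, 0x1F600];    // b, c, d, U+1F600

const combined = union(lower, mixed);
console.log(combined);                   // [ 97, 98, 99, 100, 128512 ]
console.log(intersect(lower, mixed));    // [ 98, 99 ]
console.log(difference(lower, mixed));   // [ 97 ]
console.log(compress(combined));         // [ [ 97, 100 ], [ 128512, 128512 ] ]

const re = toUnicodeRegExp(combined);
console.log(re.test('d'), re.test('e')); // true false
console.log(re.test('\u{1F600}'));       // true
```

Matching by code point matters for the last case: without it, a character class could match one half of a surrogate pair on its own.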

License

CharacterSet is licensed under the three-clause BSD license (see BSD.txt).

Last updated on 15 Aug 2022
