
utfstring - npm Package Compare versions

Comparing version 2.1.0 to 3.0.0


package.json
{
"name": "utfstring",
"version": "2.1.0",
"version": "3.0.0",
"description": "UTF-safe string operations",

@@ -15,4 +15,4 @@ "repository": {

],
"main": "dist/utfstring.js",
"types": "dist/utfstring.d.ts",
"main": "dist/index.js",
"types": "dist/index.d.ts",
"files": [

@@ -28,14 +28,17 @@ "/dist"

"devDependencies": {
"jasmine-node": "3.0.0",
"@types/mocha": "9.1.1",
"expect": "29.0.3",
"mocha": "10.0.0",
"prettier": "2.7.1",
"rimraf": "3.0.2",
"typescript": "4.7.4",
"utfstring": "./"
"ts-node": "10.9.1",
"typescript": "4.8.3",
"webpack": "5.74.0",
"webpack-cli": "4.10.0"
},
"scripts": {
"clean": "rimraf dist",
"build": "npx tsc",
"clean-build": "npm run clean && npm run build",
"test": "npx jasmine-node test"
"format": "prettier --check .",
"build": "rimraf dist browser && tsc && webpack",
"test": "mocha -r ts-node/register -reporter min 'test/**/*.ts'"
}
}

@@ -45,11 +45,100 @@ [![NPM Version](https://img.shields.io/npm/v/utfstring?color=33cd56&logo=npm)](https://www.npmjs.com/package/utfstring)

```javascript
var UtfString = require('utfstring');
import { UtfString } from "utfstring";
var safeString = new UtfString("𤔣");
console.log(safeString.length); // 1
```
In the browser, `UtfString` will be available on `window` after you import the JavaScript file from the `dist` folder.
In the browser, `UtfString` will be available on `window` after you import the JavaScript file from the `browser` folder:
```html
<script src="utfstring.js"></script>
<script>
var safeString = new UtfString("𤔣");
console.log(safeString.length); // 1
</script>
```
## Usage
UtfString currently supports the following string operations:
### UtfString object methods
UtfString is a class that you can use to create UTF-safe string objects. These objects currently support the following operations:
* `constructor(String str)` - Creates a new UTF-safe string object.
* `*[Symbol.iterator]` - Allows you to iterate over the characters of the string using a `for-of` loop.
* `charAt(Integer index)` - Returns the character at the given index from the string.
* `charCodeAt(Integer index)` - Returns the Unicode codepoint at the given index.
* `codePointAt(Integer index)` - Same as `charCodeAt`.
* `concat(Array arr)` - Creates a new UTF-safe string object by concatenating the given strings.
* `endsWith(String str, [Integer endPos])` - Determines whether the string ends with the characters of a specified search string.
* `equals(String str)` - Checks if the given string equals the string.
* `findByteIndex(Integer charIndex)` - Finds the byte index for the given character index in the string. Note: a "byte index" is really a "JavaScript string index", not a true byte offset. Use this function to convert a UTF character boundary to a JavaScript string index.
* `findCharIndex(Integer byteIndex)` - Finds the character index for the given byte index. Note: a "byte index" is really a "JavaScript string index", not a true byte offset. Use this function to convert a JavaScript string index to (the closest) UTF character boundary.
* `includes(String str)` - Checks if the search value is within the string.
* `indexOf(String searchValue, [Integer start])` - Finds the first instance of the search value within the string. Starts at an optional offset.
* `lastIndexOf(String searchValue, [Integer start])` - Finds the last instance of the search value within the string. Starts searching backwards at an optional offset, which can be negative.
* `length` - Getter that returns the number of logical characters within the string object.
* `match(String matcher)` - Matches a string or an object that supports being matched against, and returns an array containing the results of that search, or null if no matches are found.
* `padEnd(Integer targetLength, [String padString])` - Creates a new string by padding the string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied at the end of the string.
* `padStart(Integer targetLength, [String padString])` - Creates a new string by padding the string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied at the start of the string.
* `repeat(Integer count)` - Returns a new string which contains the specified number of copies of the string on which it was called.
* `replace(String pattern, String replacement)` - Creates a new UTF-safe string object with one, some, or all matches of a pattern replaced by a replacement.
* `slice(Integer start, Integer finish)` - Returns the characters between the two given indices in the string.
* `split(String seperator, Integer limit)` - Splits a string into substrings using the specified separator and returns them as an array.
* `startsWith(String str, [Integer startPos])` - Determines whether the string starts with the characters of a specified search string.
* `substr(Integer start, Integer length)` - Returns the characters starting at the given start index up to the start index plus the given length.
* `substring(Integer start, [Integer end])` - Returns the characters starting at the given start index up to the end index.
* `toBytes()` - Converts the string into an array of UTF-16 bytes.
* `toCharArray()` - Converts the string into an array of individual logical characters.
* `toCodePoints()` - Converts the string into an array of codepoints.
* `toLowerCase()` - Returns a new string in which all the alphabetic characters are converted to lower case, without modifying the original string.
* `toString()` - Returns the original (unsafe) string the object is hiding.
* `toUpperCase()` - Returns a new string in which all the alphabetic characters are converted to upper case, without modifying the original string.
* `trim()` - Removes whitespace from both ends of the string and returns a new string, without modifying the original string.
* `trimEnd()` - Removes whitespace from the end of the string and returns a new string, without modifying the original string.
* `trimLeft()` - Same as `trimStart`.
* `trimRight()` - Same as `trimEnd`.
* `trimStart()` - Removes whitespace from the beginning of the string and returns a new string, without modifying the original string.
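
The distinction the list draws between a character index and a "byte index" (really a JavaScript string index) can be illustrated with plain JavaScript. The sketch below is not utfstring's implementation: the helper names `charIndexToStringIndex` and `stringIndexToCharIndex` are hypothetical, it counts code points only, and returning `-1` for out-of-range input is an assumption made for this example.

```javascript
// Hypothetical helpers, not part of utfstring: map between code point
// indices and JavaScript (UTF-16 code unit) string indices.
function charIndexToStringIndex(str, charIndex) {
  let stringIndex = 0;
  let seen = 0;
  for (const ch of str) {
    // for-of walks the string one code point at a time
    if (seen === charIndex) return stringIndex;
    stringIndex += ch.length; // 1 code unit, or 2 for a surrogate pair
    seen += 1;
  }
  return -1; // assumption: out-of-range indices yield -1
}

function stringIndexToCharIndex(str, stringIndex) {
  if (stringIndex < 0 || stringIndex >= str.length) return -1;
  let offset = 0;
  let charIndex = 0;
  for (const ch of str) {
    // an index inside a surrogate pair maps to the character containing it
    if (stringIndex < offset + ch.length) return charIndex;
    offset += ch.length;
    charIndex += 1;
  }
  return -1;
}

const sample = "a𤔣b";
console.log(sample.length); // 4 (UTF-16 code units)
console.log(Array.from(sample).length); // 3 (code points)
console.log(charIndexToStringIndex(sample, 2)); // 3
console.log(stringIndexToCharIndex(sample, 2)); // 1
```

The native `length` reports four code units because `𤔣` is stored as a surrogate pair, which is the mismatch that UTF-safe indexing is meant to hide.
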
### UtfString static methods
Additionally, the class offers static methods in case you want to keep working with strings directly:
* `bytesToString(Array arr)` - Converts an array of UTF-16 bytes into a string.
* `charAt(String str, Integer index)` - Returns the character at the given index.

@@ -59,28 +148,40 @@

* `fromCharCode(Integer codepoint)` - Returns the string for the given Unicode codepoint.
* `codePointsToString(Array arr)` - Converts an array of codepoints into a string.
* `findByteIndex(String str, Integer charIndex)` - Finds the byte index for the given character index.
* `findCharIndex(String str, Integer byteIndex)` - Finds the character index for the given byte index.
* `fromBytes(Array arr)` - Converts an array of UTF-16 bytes into a UTF-safe string object.
* `fromCharCode(Integer codepoint)` - Returns a UTF-safe string object for the given Unicode codepoint.
* `fromCodePoints(Array arr)` - Converts an array of codepoints into a UTF-safe string object.
* `indexOf(String str, String searchValue, [Integer start])` - Finds the first instance of the search value within the string. Starts at an optional offset.
* `lastIndexOf(Str string, string searchValue, [Integer start])` - Finds the last instance of the search value within the string. Starts searching backwards at an optional offset, which can be negative.
* `join(Array arr, [String seperator])` - Concatenates the strings from the given array into a new UTF-safe string object.
* `slice(String str, Integer start, Integer finish)` - Returns the characters between the two given indices.
* `lastIndexOf(Str string, String searchValue, [Integer start])` - Finds the last instance of the search value within the string. Starts searching backwards at an optional offset, which can be negative.
* `substr(String str, Integer start, Integer length)` - Returns the characters starting at the given start index up to the start index plus the given length. Also aliased as `substring`.
* `lengthOf(String str)` - Returns the number of logical characters in the given string.
* `length(String str)` - Returns the number of logical characters in the given string.
* `padEnd(String str, Integer targetLength, [String padString])` - Creates a new string by padding the string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied at the end of the string.
* `stringToCodePoints(String str)` - Converts a string into an array of codepoints.
* `padStart(String str, Integer targetLength, [String padString])` - Creates a new string by padding the string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied at the start of the string.
* `codePointsToString(Array arr)` - Converts an array of codepoints into a string.
* `slice(String str, Integer start, Integer finish)` - Returns the characters between the two given indices.
* `stringFromCharCode(Integer codepoint)` - Returns the string for the given Unicode codepoint.
* `stringToBytes(String str)` - Converts a string into an array of UTF-16 bytes.
* `bytesToString(Array arr)` - Converts an array of UTF-16 bytes into a string.
* `stringToCharArray(String str)` - Converts the given string into an array of individual logical characters. Note that each entry in the returned array may be more than one UTF-16 character.
* `findByteIndex(String str, Integer charIndex)` - Finds the byte index for the given character index. Note: a "byte index" is really a "JavaScript string index", not a true byte offset. Use this function to convert a UTF character boundary to a JavaScript string index.
* `stringToCodePoints(String str)` - Converts a string into an array of codepoints.
* `findCharIndex(String str, Integer byteIndex)` - Finds the character index for the given byte index. Note: a "byte index" is really a "JavaScript string index", not a true byte offset. Use this function to convert a JavaScript string index to (the closest) UTF character boundary.
* `substr(String str, Integer start, Integer length)` - Returns the characters starting at the given start index up to the start index plus the given length.
* `substring(String str, Integer start, [Integer end])` - Returns the characters starting at the given start index up to the end index.
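
To make the codepoint conversions and index-based operations above more concrete, here is a minimal sketch using only standard JavaScript (`Array.from`, `codePointAt`, `String.fromCodePoint`). The function names `toCodePoints`, `fromCodePoints`, and `sliceByCodePoint` are hypothetical stand-ins chosen for this example; they are not utfstring's API, and the library's own behavior may differ in edge cases.

```javascript
// Illustrative stand-ins built on standard JavaScript only; the names are
// hypothetical and this is not utfstring's implementation.
function toCodePoints(str) {
  // Array.from iterates by code point, so surrogate pairs stay together
  return Array.from(str, (ch) => ch.codePointAt(0));
}

function fromCodePoints(codePoints) {
  return String.fromCodePoint(...codePoints);
}

function sliceByCodePoint(str, start, finish) {
  // slice on an array of code points instead of on UTF-16 code units
  return Array.from(str).slice(start, finish).join("");
}

const text = "x😀y";
console.log(toCodePoints(text)); // [ 120, 128512, 121 ]
console.log(fromCodePoints([120, 128512, 121]) === text); // true
console.log(sliceByCodePoint(text, 1, 2) === "😀"); // true
console.log(text.slice(1, 2) === "😀"); // false: native slice cuts the pair
```

The last two lines show why index-based operations need to work on logical characters: the native `slice` returns half of a surrogate pair, while the code point based version returns the whole character.
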
## Regional Indicators

@@ -90,7 +191,7 @@

Since regional indicators are semantically individual Unicode code points and because utfstring is a dependency of other Unicode-aware libraries, it doesn't make sense for utfstring to treat two regional indicators as a single character by default. That said, it can be useful to treat them as such from a display or layout perspective. In order to support both scenarios, two implementations are necessary. The first and default implementation is available via the instructions above. For visual grapheme clustering such as the grouping of regional indicators, use the `visual` property on `UtfString`. Display-aware versions of all the functions described above are available. The difference can be seen by way of the `length` function:
Since regional indicators are semantically individual Unicode code points and because utfstring is a dependency of other Unicode-aware libraries, it doesn't make sense for utfstring to treat two regional indicators as a single character by default. That said, it can be useful to treat them as such from a display or layout perspective. In order to support both scenarios, two implementations are necessary. The first and default implementation is available via the instructions above. For visual grapheme clustering such as the grouping of regional indicators, use the class `UtfVisualString`. Display-aware versions of all the functions described above are available. The difference can be seen by way of the `lengthOf` method:
```javascript
UtfString.visual.length("🇫🇷"); // 1
UtfString.length("🇫🇷"); // 2
UtfVisualString.lengthOf("🇫🇷"); // 1
UtfString.lengthOf("🇫🇷"); // 2
```
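
The difference between the two counts can be reproduced with a small sketch in plain JavaScript. It is an illustration only, not the algorithm utfstring uses: `isRegionalIndicator` and `visualLengthSketch` are hypothetical names, and the sketch handles nothing but pairs of regional indicators (U+1F1E6 to U+1F1FF), whereas real grapheme clustering covers many more cases.

```javascript
// Hypothetical sketch, not the library's algorithm: count code points, but
// let two consecutive regional indicators count as one visual character.
function isRegionalIndicator(codePoint) {
  return codePoint >= 0x1f1e6 && codePoint <= 0x1f1ff;
}

function visualLengthSketch(str) {
  const codePoints = Array.from(str, (ch) => ch.codePointAt(0));
  let count = 0;
  for (let i = 0; i < codePoints.length; i++) {
    if (
      isRegionalIndicator(codePoints[i]) &&
      i + 1 < codePoints.length &&
      isRegionalIndicator(codePoints[i + 1])
    ) {
      i++; // consume the second indicator of the pair
    }
    count++;
  }
  return count;
}

const flag = "\u{1F1EB}\u{1F1F7}"; // 🇫🇷 (regional indicators F and R)
console.log(Array.from(flag).length); // 2
console.log(visualLengthSketch(flag)); // 1
```

Pairing indicators greedily from the left means an odd run of three indicators counts as one pair plus one single, which is a design choice of this sketch rather than something stated in the README.
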

@@ -100,3 +201,3 @@

Tests are written in Jasmine and can be executed via [jasmine-node](https://github.com/mhevery/jasmine-node):
Unit tests are written in Mocha and can be executed via:

@@ -108,3 +209,3 @@ 1. `npm install`

Written and maintained by Cameron C. Dutro ([@camertron](https://github.com/camertron)).
Written and maintained by Cameron C. Dutro ([@camertron](https://github.com/camertron)) and [contributors](https://github.com/camertron/utfstring/graphs/contributors).

@@ -111,0 +212,0 @@ ## License
