code-point-mapping

Map between JavaScript string indices and Unicode code point offsets effectively

0.2.0

Version published: 2 years ago

Weekly downloads: 1; decreased by 85.71%

Maintainers: 1


Created: 2 years ago


code-point-mapping provides a way to map between UTF-16 string indices and Unicode code point offsets effectively.

A Unicode code point takes either one or two UTF-16 code units to represent. Characters outside the Basic Multilingual Plane are encoded as a surrogate pair, that is, two code units. This means that as soon as your strings contain such characters (many emoji, for example), you need to do some work to map between UTF-16 indices and Unicode code point offsets.
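To make the mismatch concrete, here is a minimal sketch of the two conversions written as plain linear scans. The helper names `utf16IndexForCodePoint` and `codePointForUtf16Index` are hypothetical and are not part of this package's API; the sample string is written with escapes and assumes that ✈️ is U+2708 followed by U+FE0F.

```javascript
// Hypothetical helpers for illustration only; not code-point-mapping's API.

// Convert a code point offset into a UTF-16 index by walking the string.
function utf16IndexForCodePoint(str, codePointOffset) {
  let index = 0
  let seen = 0
  for (const ch of str) { // string iteration yields whole code points
    if (seen === codePointOffset) return index
    index += ch.length // 1 for BMP characters, 2 for a surrogate pair
    seen += 1
  }
  return index // an offset at or past the end maps to str.length
}

// Convert a UTF-16 index into a code point offset.
// Assumes utf16Index does not fall in the middle of a surrogate pair.
function codePointForUtf16Index(str, utf16Index) {
  return Array.from(str.slice(0, utf16Index)).length
}

const sample = '\u{1F600}\u{1F389}\u2708\uFE0F'
console.log(sample.length)                     // 6 UTF-16 code units
console.log(Array.from(sample).length)         // 4 code points
console.log(utf16IndexForCodePoint(sample, 1)) // 2
console.log(codePointForUtf16Index(sample, 4)) // 2
```

Each call here rescans the string from the start, so this is only meant to show what the mapping computes, not how the package implements it.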

This package was designed for use with automerge, which requires offsets to be specified in Unicode code points, so it includes only the APIs I needed to make that work.

For example:

import CodePointMapping from 'code-point-mapping'
import * as automerge from '@automerge/automerge'

let doc1 = automerge.from({ str: new automerge.Text('😀🎉✈️') })
let cpm = new CodePointMapping(doc1.str)

cpm.indexForCodepoint(1) // => 2

doc1 = automerge.change(doc1, d => {
  d.str.deleteAt(...cpm.deleteAt(0, 2)) // d.str.deleteAt(0, 1)
  d.str.insertAt(...cpm.insertAt(2, '🧟‍♀️')) // d.str.insertAt(1, ..."🧟‍♀️")
})

NOTE: This library assumes that your strings are valid Unicode and do not contain unpaired surrogates.
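If you are unsure whether your input meets that assumption, one way to check is sketched below. The `hasUnpairedSurrogate` helper is hypothetical and not part of this package. It relies on the fact that a regular expression with the `u` flag matches whole code points, so a correctly paired surrogate never matches the surrogate range on its own.

```javascript
// Hypothetical helper for illustration only; not code-point-mapping's API.
// With the `u` flag a valid surrogate pair is matched as one code point
// above U+FFFF, so only lone surrogates fall inside this range.
function hasUnpairedSurrogate(str) {
  return /[\uD800-\uDFFF]/u.test(str)
}

console.log(hasUnpairedSurrogate('\u{1F600}')) // false
console.log(hasUnpairedSurrogate('a\uD83D'))   // true
```

Newer JavaScript runtimes also provide `String.prototype.isWellFormed()`, which reports the same property where it is available.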


Package last updated on 02 Jun 2023
