decancer
A library that removes common Unicode confusables/homoglyphs from strings.
- Its core is written in Rust and uses a form of binary search to stay fast!
- By default, it can filter 221,529 (19.88%) different Unicode codepoints.
- Unlike other packages, this package is Unicode bidi-aware: it interprets right-to-left characters in the same order an application would render them!
- Its behavior is also highly customizable to your liking!
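To give a rough idea of how a binary search can map a confusable codepoint to its plain equivalent, here is a minimal sketch in JavaScript. It is not decancer's actual implementation (the real core is in Rust), and the `TABLE` entries, `lookup`, and `cureSketch` are hypothetical names and data chosen only for illustration.

```javascript
// Hypothetical lookup table, sorted by codepoint so it can be binary searched.
// These few entries are illustrative only and are not decancer's real data.
const TABLE = [
  [0x0147, 'n'],  // Ň
  [0x0163, 't'],  // ţ
  [0x24E1, 'r'],  // ⓡ
  [0x1D4E3, 't'], // 𝓣
  [0x1D53D, 'f'], // 𝔽
]

// Classic binary search over the sorted table.
// Returns the replacement string, or null when the codepoint is not listed.
function lookup(codePoint) {
  let lo = 0
  let hi = TABLE.length - 1
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1
    const [key, replacement] = TABLE[mid]
    if (key === codePoint) return replacement
    if (key < codePoint) lo = mid + 1
    else hi = mid - 1
  }
  return null
}

// Replaces every listed codepoint and lowercases everything else.
// for...of iterates by codepoint, so characters outside the BMP stay intact.
function cureSketch(input) {
  let out = ''
  for (const ch of input) {
    const replacement = lookup(ch.codePointAt(0))
    out += replacement ?? ch.toLowerCase()
  }
  return out
}
```

Each lookup takes a number of steps logarithmic in the table size, which is why a sorted table stays fast even with hundreds of thousands of entries.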
Installation
In your shell:
npm install decancer
In your code (CommonJS):
const decancer = require('decancer')
In your code (ESM):
import decancer from 'decancer'
Examples
const assert = require('assert')
const decancer = require('decancer')
const cured = decancer('vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣 wWiIiIIttHh l133t5p3/-\\|<')
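// equals() compares using decancer's own matching rules; here it matches
// despite the repeated letters and the leetspeak in the input
assert(cured.equals('very funny text with leetspeak'))
// the plain string form is NOT identical to that text, so prefer
// equals() and contains() over comparing toString() output yourself
assert(cured.toString() !== 'very funny text with leetspeak')
console.log(cured.toString())
// contains() checks for a word inside the cured text
assert(cured.contains('funny'))
// censor() masks every match of one word with the given character
cured.censor('funny', '*')
console.log(cured.toString())
// censorMultiple() does the same for a list of words
cured.censorMultiple(['very', 'text'], '-')
console.log(cured.toString())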
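The calls above can be wrapped into a small moderation check. The following is only a sketch: `containsBanned` and its parameters are hypothetical names, not part of decancer's API, and the `cure` argument stands in for the `decancer` function so the helper can be run and tested without the package installed. It relies only on `contains()`, as used in the example above.

```javascript
// Hypothetical helper, not part of decancer's API.
// `cure` is expected to behave like the decancer function shown above:
// it takes a string and returns an object with a contains(word) method.
function containsBanned(cure, text, bannedWords) {
  const cured = cure(text)
  return bannedWords.some((word) => cured.contains(word))
}

// Usage with the real library (assumes decancer is installed):
//   const decancer = require('decancer')
//   containsBanned(decancer, userInput, ['funny'])
```

Passing the curing function in as an argument keeps the helper independent of how decancer is loaded (CommonJS or ESM).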
Donations
If you want to support my eyes for manually looking at thousands of Unicode characters, consider donating! ❤