@shelf/text-normalizer

Text normalizer originally written for openai/whisper, ported to TypeScript with love by shelf.io!

Version: 1.0.3 (npm)
Weekly downloads: 18 (-92.14%)

text-normalizer

Originally taken from openai/whisper and rewritten in TypeScript.

A TypeScript library for normalizing English text. It provides a utility class, EnglishTextNormalizer, that normalizes contractions, abbreviations, spacing, and similar variations. EnglishTextNormalizer is built from other classes that you can reuse independently:

  • EnglishSpellingNormalizer - uses a dictionary that maps English words to their American spelling. The dictionary is stored in a JSON file named english.json.
  • EnglishNumberNormalizer - normalizes numbers written as English words into actual numbers.
  • BasicTextNormalizer - provides methods for removing special characters and diacritics from text, as well as splitting words into separate letters.
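
To make the idea behind BasicTextNormalizer more concrete, here is a minimal standalone sketch of diacritic and special-character removal. It is not the library's implementation: the function names stripDiacritics and basicNormalize are hypothetical, and the sketch relies only on the standard String.prototype.normalize method. It also assumes English-only input, so anything outside a-z and 0-9 is treated as a special character.

```typescript
// Hypothetical helpers for illustration only; not part of @shelf/text-normalizer.

// Decompose accented characters (NFD) and drop the combining marks
// in the U+0300..U+036F "Combining Diacritical Marks" block.
function stripDiacritics(text: string): string {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Lowercase, strip diacritics, replace every character that is not an
// ASCII letter, digit, or whitespace with a space, then collapse whitespace.
function basicNormalize(text: string): string {
  return stripDiacritics(text.toLowerCase())
    .replace(/[^a-z0-9\s]/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(basicNormalize('Café  déjà-vu!')); // cafe deja vu
```

The real class may handle more cases (for example, symbols that have no NFD decomposition), so treat this only as a picture of the general approach.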

Install

$ yarn add @shelf/text-normalizer

Usage

import {EnglishTextNormalizer} from '@shelf/text-normalizer';

const normalizer = new EnglishTextNormalizer();

console.log(normalizer.normalize("Let's")); // Output: let us
console.log(normalizer.normalize("he's like")); // Output: he is like
console.log(normalizer.normalize("she's been like")); // Output: she has been like
console.log(normalizer.normalize('10km')); // Output: 10 km
console.log(normalizer.normalize('10mm')); // Output: 10 mm
console.log(normalizer.normalize('RC232')); // Output: rc 232
console.log(
  normalizer.normalize('Mr. Park visited Assoc. Prof. Kim Jr.')
); // Output: mister park visited associate professor kim junior
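
The outputs above are consistent with an ordered set of replacement rules: expand contractions and abbreviations, then separate digits from letters. As a rough standalone sketch of that idea, and not the library's actual rule set, the hypothetical applyRules function below runs a small table of regular-expression rules in order. The Rule type, the rules table, and the function name are all assumptions made for illustration.

```typescript
// Illustrative only: a tiny rule table, not the rules used by @shelf/text-normalizer.
type Rule = [RegExp, string];

const rules: Rule[] = [
  [/\blet's\b/g, 'let us'],   // contraction
  [/\bmr\./g, 'mister'],      // abbreviated title
  [/(\d)([a-z])/g, '$1 $2'],  // digit followed by a letter: "10km" -> "10 km"
  [/([a-z])(\d)/g, '$1 $2'],  // letter followed by a digit: "rc232" -> "rc 232"
];

// Lowercase the input, apply each rule in order, then collapse whitespace.
function applyRules(text: string): string {
  let result = text.toLowerCase();
  for (const [pattern, replacement] of rules) {
    result = result.replace(pattern, replacement);
  }
  return result.replace(/\s+/g, ' ').trim();
}

console.log(applyRules("Let's"));    // let us
console.log(applyRules('10km'));     // 10 km
console.log(applyRules('RC232'));    // rc 232
console.log(applyRules('Mr. Park')); // mister park
```

A flat table like this cannot reproduce every example above. Expanding "he's" to "he is" but "she's been" to "she has been" depends on the following word, so context-sensitive rules are needed for that case.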

Publish

$ git checkout master
$ yarn version
$ yarn publish
$ git push origin master --tags

License

MIT © Shelf


Package last updated on 10 Apr 2023
