Determine the East Asian Width of a Unicode character
Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins, high speed - all in one.
Elegant console output, borrowed from Yarn
JavaScript Markdown parsing, made simple
A module for node.js and the browser that takes in text and returns text that is stripped of stopwords. Has pre-defined stopword lists for 62 languages and also takes lists with custom stopwords as input.
Webpack plugin to use in addition to [extract-text-webpack-plugin](https://github.com/webpack/extract-text-webpack-plugin) to create a second CSS bundle, processed for right-to-left (RTL) layouts.
Teams Toolkit CLI is a text-based command line interface that can help scaffold, validate, and deploy applications for Microsoft Teams from the terminal or a CI/CD process.
DevExpress Rich Text Editor is an advanced word-processing tool designed for working with rich text documents.
Helps to prevent widow words in a text
TeamsFx CLI is a text-based command line interface that can help scaffold, validate, and deploy applications for Microsoft Teams from the terminal or a CI/CD process.
Count the number of OpenAI tokens in a string. Supports all OpenAI Text models (text-davinci-003, gpt-3.5-turbo, gpt-4)
Anonymize-NLP is a lightweight and robust package for text anonymization. It uses Natural Language Processing (NLP) and Regular Expressions (Regex) to identify and mask sensitive information in a string.
Plugin for Remarkable to process embedded math expressions in Markdown text.
Basic library to roughly determine the language of input text
Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
Chinese word segmentation module for Simplified and Traditional Chinese, using web novels as sample text
🔪 chunk/split a string by length without cutting/truncating words.
Format of the original node-segment
Configurable BM25 Text Search Engine with simple semantic search support
Node PDF is a set of tools that takes in PDF files and converts them to usable formats for data processing. The library supports both extracting text from searchable PDF files and performing OCR on PDFs that are just scanned images of text.
JavaScript SDK for Sensible, the developer-first platform for extracting structured data from documents so that you can build document-automation features into your SaaS products
Fast, easy-to-use AI text embeddings, optimized for serverless functions.
Multi-language detection for text mining and natural language processing - True ITK - Open Source
Naive Bayes Text Classifier
Semantically create chunks from large texts. Useful for workflows involving large language models (LLMs).
WebAssembly version of fastText for Node and browser environments: a library for efficient text classification and representation learning.
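
One entry in this list describes splitting a string by length without cutting words. As a rough illustration of that idea, here is a minimal sketch of a greedy word-preserving chunker. The function name `chunkByLength` and its behavior are assumptions made for illustration; this is not the API of any package listed above. It also assumes that a single word longer than the limit is kept whole in its own chunk rather than truncated.

```typescript
// Hypothetical helper for illustration only; not taken from any listed package.
// Greedily packs whitespace-separated words into chunks of at most `maxLength`
// characters. A word longer than `maxLength` is emitted alone, uncut (assumption).
function chunkByLength(text: string, maxLength: number): string[] {
  if (!Number.isInteger(maxLength) || maxLength < 1) {
    throw new RangeError("maxLength must be a positive integer");
  }

  // Split on runs of whitespace and drop empty fragments.
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  let current = "";

  for (const word of words) {
    if (current === "") {
      // Start a new chunk with this word, even if it exceeds the limit.
      current = word;
    } else if (current.length + 1 + word.length <= maxLength) {
      // The word fits after a single separating space.
      current += " " + word;
    } else {
      // The word does not fit: close the current chunk and start another.
      chunks.push(current);
      current = word;
    }
  }

  if (current !== "") {
    chunks.push(current);
  }
  return chunks;
}
```

Called as `chunkByLength("the quick brown fox jumps", 10)`, this sketch returns `["the quick", "brown fox", "jumps"]`. Runs of whitespace collapse to single spaces, which a real library may or may not do.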