Determine the East Asian Width of a Unicode character
Bindings for RE2: fast, safe alternative to backtracking regular expression engines.
Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins, high speed - all in one.
Promptbook: Run AI apps in plain human language across multiple models and platforms
Elegant console output, borrowed from Yarn
JavaScript Markdown parsing, made simple
Teams Toolkit CLI is a text-based command line interface that can help scaffold, validate, and deploy applications for Microsoft Teams from the terminal or a CI/CD process.
Webpack plugin to use in addition to [extract-text-webpack-plugin](https://github.com/webpack/extract-text-webpack-plugin) to create a second CSS bundle, processed to be RTL.
MCP server for terminal operations and file editing
DevExpress Rich Text Editor is an advanced word-processing tool designed for working with rich text documents.
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilities.
Helps to prevent widow words in a text
Core engine to convert extended MDAST to DOCX. Supports plugins for footnotes, images, lists, tables, and more. Designed for seamless Markdown-to-DOCX conversion.
Extended MDAST types and custom node data for mdast2docx with support for DOCX formatting.
Count the number of OpenAI tokens in a string. Supports all OpenAI Text models (text-davinci-003, gpt-3.5-turbo, gpt-4)
Plugin to convert Markdown tables (MDAST) to DOCX with support for rich formatting and seamless integration into mdast2docx.
Plugin to convert mathematical expressions in Markdown (MDAST) to DOCX using LaTeX-style syntax. Integrates seamlessly with mdast2docx.
Plugin to convert ordered and unordered lists from Markdown (MDAST) to DOCX. Supports nesting, custom bullets, and numbering styles.
MDAST to DOCX plugin for resolving and embedding images. Supports base64, URLs, and custom resolvers for seamless DOCX image integration.
Extend MDAST by parsing embedded HTML in Markdown. Converts HTML into structured MDAST nodes compatible with @m2d/core for DOCX generation.
TeamsFx CLI is a text-based command line interface that can help scaffold, validate, and deploy applications for Microsoft Teams from the terminal or a CI/CD process.
Convert Markdown Abstract Syntax Tree (MDAST) to DOCX seamlessly. Supports footnotes, images, links, and customizable document properties.
A plugin for @m2d/core that parses emoji shortcodes like :smile: and replaces them with their corresponding Unicode emoji characters for DOCX output.
A web SDK for word processing and rich text capabilities.
N8N Tools - Document Processor: Process and analyze documents with OCR, text extraction, and format conversion
Plugin for Remarkable to process embedded math expressions in Markdown text.
Configurable BM25 Text Search Engine with simple semantic search support
A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.
A converter between XML text and JavaScript object / JSON text. Forked to add Graphite specific features.
Parsing Library for TypeScript and JavaScript.
Segment data for novel-segment
Utility collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
🔪 chunk/split a string by length without cutting/truncating words.
A unified plugin to prepare MDAST trees for DOCX conversion using mdast2docx.
MCP Document Converter Server — A Model Context Protocol server for seamless document format conversion and processing
Chinese word segmentation module for Simplified and Traditional Chinese, using web novels as sample data
Node PDF is a set of tools that takes in PDF files and converts them to usable formats for data processing. The library supports both extracting text from searchable PDF files and performing OCR on PDFs that are just scanned images of text.
Semantically create chunks from large texts. Useful for workflows involving large language models (LLMs).