Yet another library to extract text from MS Office and PDF files
docx parser
JavaScript SDK for Sensible, the developer-first platform for extracting structured data from documents, so that you can build document-automation features into your SaaS products
Extend MDAST by parsing embedded HTML in Markdown. Converts HTML into structured MDAST nodes compatible with @m2d/core for DOCX generation.
A simple library that converts .docx files to plain text in the browser
TypeScript definitions and functions for using Docling output.
Extracts comments and other data from docx files
A Node.js library to parse PDF, TXT, DOC, and DOCX files to JSON and CSV
Web components for displaying Docling output.
This npm package offers a straightforward way to extract text content from various binary and text file formats. The package comes with a pre-built configuration that works out of the box, requiring no additional setup. It is designed for use in Browse
Extend MDAST by parsing embedded HTML in Markdown. Converts HTML into structured MDAST nodes compatible with @m2d/core for DOCX generation.
Convert documents to Markdown text content. Originally inspired by Microsoft's MarkItDown Python library.
Docx parser for JavaScript/TypeScript
Extend MDAST by parsing embedded HTML in Markdown. Converts HTML into structured MDAST nodes compatible with @m2d/core for DOCX generation.
A Node.js script that fills DOCX placeholders and converts the result to PDF
Fork of office-text-extractor with unreleased changes that include browser support
A lightweight library to parse .docx files in Cloudflare Workers
A Node.js library to parse PDF, TXT, DOC, and DOCX files to JSON and CSV
A text-extraction package for DOCX, PDF, and PPTX files
JavaScript port of python-docx.
Yet another library to extract text from MS Office and PDF files
> **Note:** This repository is automatically generated from the [main parser monorepo](https://github.com/TrialAndErrorOrg/parsers). Please submit any issues or pull requests there.
A Node.js library to parse PDF, TXT, DOC, and DOCX files to JSON and CSV
A dead simple docx parser.
Primitives for building and extending readers, including definitions for context, parsers, modes, commands, plugins, and more.
The Structured Parser JS/TS SDK lets developers integrate Structured Parser's structured data extraction for unstructured documents such as PDF, DOCX, and XLSX files.
Plain text parser that allows readers to extract and process words from plain text or `.txt` files.