word2md

Convert .docx files to Markdown with image extraction.

Latest version: 1.0.3 · Source: npm · Maintainers: 1 · Weekly downloads: 9
📖 Chinese Documentation

A simple and easy-to-use command-line tool for converting Microsoft Word documents (.docx) to Markdown format with automatic image extraction and saving.

Features

  • 🚀 Support for single file and batch conversion
  • 📸 Automatic extraction and saving of images from documents
  • 📝 Preserves the original document's formatting structure
  • 🎯 Simple command-line interface
  • 📦 Support for npx one-click execution without installation

Installation

No installation is required; run it directly with npx:

npx word2md <input> [options]

Global Installation

npm install -g word2md

Local Installation

npm install word2md

Usage

Basic Usage

Convert Single File

# Generate markdown file in the same directory as the source file
npx word2md document.docx

# Specify output directory
npx word2md document.docx -o ./output

Batch Convert Directory

# Convert all .docx files in directory to output subdirectory
npx word2md ./docs

# Specify output directory
npx word2md ./docs -o ./converted

Command Line Options

word2md - Convert Word documents (.docx) to Markdown

Usage:
  npx word2md <input> [options]

Arguments:
  <input>    Path to a .docx file or directory containing .docx files

Options:
  -o, --output <dir>    Output directory (default: same as input for files, ./output for directories)
  -h, --help           Show this help message
  -v, --version        Show version

Examples:
  npx word2md document.docx                    # Convert single file
  npx word2md ./docs                          # Convert all .docx files in directory
  npx word2md document.docx -o ./markdown     # Convert to specific output directory
  npx word2md ./docs -o ./converted           # Batch convert to specific directory
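
The default output locations described in the options above can be sketched as a small function. This is only an illustration of the documented rule, not the tool's actual source: `resolveOutputDir` and its parameters are hypothetical names, and the paths are returned unresolved because the README does not say which base directory `./output` is resolved against.

```typescript
import * as path from "node:path";

// Hypothetical helper, not part of word2md's published API.
// Mirrors the documented defaults for -o/--output:
//   - explicit option: use it as given
//   - single file:     same directory as the input file
//   - directory:       ./output
function resolveOutputDir(
  input: string,
  inputIsDirectory: boolean,
  outputOption?: string,
): string {
  if (outputOption !== undefined) {
    return outputOption;
  }
  return inputIsDirectory ? "./output" : path.dirname(input);
}
```

Under these assumptions, `resolveOutputDir("docs/report.docx", false)` yields `docs`, while `resolveOutputDir("./docs", true)` yields `./output`.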

Output Structure

The converted file structure is as follows:

output/
├── document.md          # Converted Markdown file
└── images/             # Extracted images directory
    ├── image-uuid1.png
    ├── image-uuid2.jpg
    └── ...
  • Image references in Markdown files are automatically updated to relative paths: images/image-uuid.ext
  • Image filenames use UUIDs to ensure uniqueness
  • Supports common image formats: PNG, JPG, JPEG, GIF, etc.
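
As a rough sketch of the naming scheme above, the following shows how a relative image reference of the form `images/image-uuid.ext` could be produced. It is not the tool's implementation: it swaps in Node's built-in `crypto.randomUUID` in place of the `uuid` package, and the `EXTENSIONS` table, the `bin` fallback, and the `imageReference` name are all assumptions made for illustration.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical MIME-type-to-extension table; the tool's real mapping is not documented.
const EXTENSIONS: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/gif": "gif",
};

// Builds a relative Markdown image path such as images/image-<uuid>.png.
// Unknown content types fall back to a generic "bin" extension (an assumption).
function imageReference(contentType: string): string {
  const ext = EXTENSIONS[contentType] ?? "bin";
  return `images/image-${randomUUID()}.${ext}`;
}
```

Because each call generates a fresh UUID, two images with identical original names would still be written to distinct files.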

Examples

Convert Single File

$ npx word2md report.docx
✅ Converted: D:\docs\report.docx → D:\docs\report.md

Batch Convert

$ npx word2md ./documents
Found 3 .docx file(s) to convert...
✅ Converted: report1.docx → report1.md
✅ Converted: report2.docx → report2.md
✅ Converted: manual.docx → manual.md
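
The batch step shown above amounts to finding the .docx files in a directory and pairing each with a .md target. A minimal sketch follows, assuming a non-recursive scan and a case-insensitive extension match; neither behavior is confirmed by the README, and `planBatch` is a hypothetical name rather than part of the tool.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: lists .docx files directly inside `dir` (no recursion)
// and pairs each with the Markdown filename it would be converted to.
function planBatch(dir: string): Array<{ source: string; target: string }> {
  return fs
    .readdirSync(dir)
    .filter((name) => path.extname(name).toLowerCase() === ".docx")
    .sort()
    .map((name) => ({
      source: name,
      target: `${path.basename(name, path.extname(name))}.md`,
    }));
}
```

Files with other extensions are ignored, so a directory that mixes .docx files with notes or images only yields entries for the Word documents.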

Specify Output Directory

$ npx word2md report.docx -o ./markdown
✅ Converted: D:\docs\report.docx → D:\docs\markdown\report.md

Technical Implementation

This tool is built with the following tech stack:

  • mammoth: For parsing .docx files and extracting content and images
  • turndown: For converting HTML to Markdown
  • uuid: For generating unique image filenames
  • TypeScript: Type-safe development experience
  • Node.js: Cross-platform runtime environment

System Requirements

  • Node.js >= 16.0.0
  • Supports Windows, macOS, Linux

Development

Clone the Project

git clone https://github.com/okfred/word2md.git
cd word2md

Install Dependencies

npm install

Development Mode

npm run dev

Build

npm run build

Test

npm test

Contributing

Issues and Pull Requests are welcome!

  • Fork the project
  • Create a feature branch (git checkout -b feature/AmazingFeature)
  • Commit your changes (git commit -m 'Add some AmazingFeature')
  • Push to the branch (git push origin feature/AmazingFeature)
  • Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Changelog

v1.0.0

  • 🎉 Initial release
  • ✨ Support for single file and batch conversion
  • 📸 Automatic image extraction and saving
  • 🚀 Support for npx one-click execution

FAQ

Q: What file formats are supported?

A: Only the .docx format is currently supported. The legacy .doc format is not supported.

Q: Will image quality be lost?

A: No. Images are saved at original quality without any compression or processing.

Q: Can password-protected documents be converted?

A: Currently, password-protected .docx files are not supported.

Q: What should I do if I run into memory issues with large files?

A: For particularly large files, consider increasing the Node.js memory limit (the command below assumes a POSIX shell that provides which):

node --max-old-space-size=4096 $(which npx) word2md large-file.docx

If this tool helps you, please give it a ⭐️ for support!


Package last updated on 12 Aug 2025
