
html-docxjs-compiler
A powerful and flexible TypeScript library that converts HTML strings into DOCXjs XmlComponent format. Built on top of the excellent docx library, this package parses HTML using cheerio and transforms it into XmlComponent objects that can be seamlessly integrated with the docx API.
npm install html-docxjs-compiler docx
This package requires docx as a peer dependency:
npm install docx@^9.5.0
import { transformHtmlToDocx } from 'html-docxjs-compiler';
import { Document, Packer } from 'docx';
import * as fs from 'fs';
async function createDocument() {
  const html = `
    <h1>My Document</h1>
    <p>This is a paragraph with <strong>bold</strong> and <em>italic</em> text.</p>
    <ul>
      <li>First item</li>
      <li>Second item</li>
    </ul>
  `;
  // Transform HTML to DOCX elements
  const elements = await transformHtmlToDocx(html);
  // Create a document with the elements
  const doc = new Document({
    sections: [{
      children: elements
    }]
  });
  // Generate and save the document
  const buffer = await Packer.toBuffer(doc);
  fs.writeFileSync('output.docx', buffer);
}
createDocument();
The library uses a three-stage process:
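1. The HTML string is parsed with cheerio.
2. Element handlers convert each HTML element into XmlComponent objects (Paragraph, TextRun, Table, ImageRun, etc.).
3. The result is an XmlComponent[] that can be used directly in the docx Document API.

HTML String → cheerio Parser → Element Handlers → XmlComponent[] → docx Document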
All configuration options are completely optional:
- Image URL support requires strategyManager configuration

import {
  transformHtmlToDocx,
  ImageDownloadStrategyManager,
  HttpImageDownloadStrategy,
} from 'html-docxjs-compiler';
// Simple usage - works immediately
const elements = await transformHtmlToDocx('<p>Hello World</p>');
// With image URL support - requires configuration
const htmlWithImages = '<p><img src="imageurl..."></p>';
const strategyManager = new ImageDownloadStrategyManager([
  new HttpImageDownloadStrategy()
]);
const elementsWithImages = await transformHtmlToDocx(htmlWithImages, { strategyManager });
import { transformHtmlToDocx } from 'html-docxjs-compiler';
import { Document, Packer } from 'docx';
const html = `
<h1>Project Report</h1>
<h2>Executive Summary</h2>
<p style="text-align: center; color: #333333;">
This report provides an overview of our <strong>Q4 2024</strong> performance.
</p>
<h3>Key Highlights</h3>
<ul>
<li>Revenue increased by <strong>25%</strong></li>
<li>Customer satisfaction: <em>95%</em></li>
<li>New product launch was <u>successful</u></li>
</ul>
`;
async function generateReport() {
  const elements = await transformHtmlToDocx(html);
  const doc = new Document({
    sections: [{
      children: elements
    }]
  });
  const buffer = await Packer.toBuffer(doc);
  // Save or send buffer...
}
const html = `
<h2>Sales Data</h2>
<table>
  <thead>
    <tr>
      <th style="background-color: #4472C4; color: white;">Product</th>
      <th style="background-color: #4472C4; color: white;">Q3</th>
      <th style="background-color: #4472C4; color: white;">Q4</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Widget A</td>
      <td style="text-align: right;">$50,000</td>
      <td style="text-align: right;">$65,000</td>
    </tr>
    <tr>
      <td>Widget B</td>
      <td style="text-align: right;">$30,000</td>
      <td style="text-align: right;">$42,000</td>
    </tr>
  </tbody>
</table>
`;
const elements = await transformHtmlToDocx(html);
const html = `
<h1>Product Catalog</h1>
<p>Check out our latest products:</p>
<img src="https://example.com/product-image.jpg" alt="Product" />
<p>Available in multiple colors.</p>
`;
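// Image URLs need a strategyManager (see the configuration example earlier)
const strategyManager = new ImageDownloadStrategyManager([
  new HttpImageDownloadStrategy()
]);
const elements = await transformHtmlToDocx(html, { strategyManager });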
const html = `
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA..." />
<p>Company Logo</p>
`;
const elements = await transformHtmlToDocx(html);
The library uses a Strategy Pattern for image downloads, allowing you to customize how images are fetched from different sources.
import {
  ImageDownloadStrategy,
  ImageDownloadStrategyManager,
  transformHtmlToDocx,
} from 'html-docxjs-compiler';
import axios from 'axios';
// Custom strategy for S3 signed URLs
class S3ImageDownloadStrategy implements ImageDownloadStrategy {
  canHandle(url: string): boolean {
    return url.includes('s3.amazonaws.com') || url.includes('s3-');
  }
  async download(url: string): Promise<string> {
    const response = await axios.get(url, {
      responseType: 'arraybuffer',
      headers: {
        'Authorization': 'Bearer YOUR_TOKEN' // Add custom headers
      }
    });
    // Encode the downloaded bytes as a base64 data URI (the MIME type is hard-coded to PNG here)
    const base64 = Buffer.from(response.data).toString('base64');
    return `data:image/png;base64,${base64}`;
  }
}
// Use your custom strategy (html is an HTML string containing <img> tags)
const s3Strategy = new S3ImageDownloadStrategy();
const strategyManager = new ImageDownloadStrategyManager([s3Strategy]);
const elements = await transformHtmlToDocx(html, { strategyManager });
import {
  transformHtmlToDocx,
  ImageDownloadStrategyManager,
  HttpImageDownloadStrategy,
} from 'html-docxjs-compiler';
// FirebaseImageDownloadStrategy and S3ImageDownloadStrategy are custom strategies,
// assumed to be defined or imported elsewhere in your code
// Strategies are tried in order until one can handle the URL
const strategyManager = new ImageDownloadStrategyManager([
  new FirebaseImageDownloadStrategy('firebase-bucket.appspot.com'),
  new S3ImageDownloadStrategy(),
  new HttpImageDownloadStrategy() // Fallback for any HTTP/HTTPS URL
]);
const elements = await transformHtmlToDocx(html, { strategyManager });
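The "tried in order until one can handle the URL" behaviour can be illustrated with a small stand-in. This is only a sketch, not the package's implementation: FirstMatchManager and the two toy strategies are hypothetical names, the interface is redeclared locally so the snippet runs on its own, and throwing when nothing matches is an assumption.

```typescript
// Strategy shape as documented in the API reference (redeclared locally for this sketch).
interface ImageDownloadStrategy {
  canHandle(url: string): boolean;
  download(url: string): Promise<string>;
}

// Hypothetical stand-in for the manager: delegates to the first strategy that accepts the URL.
class FirstMatchManager {
  private readonly strategies: ImageDownloadStrategy[];

  constructor(strategies: ImageDownloadStrategy[] = []) {
    this.strategies = [...strategies];
  }

  addStrategy(strategy: ImageDownloadStrategy): void {
    this.strategies.push(strategy);
  }

  async download(url: string): Promise<string> {
    const strategy = this.strategies.find((s) => s.canHandle(url));
    if (!strategy) {
      // Assumption: the real manager may handle this case differently.
      throw new Error(`No strategy can handle URL: ${url}`);
    }
    return strategy.download(url);
  }
}

// Toy strategies that return fixed data URIs instead of doing network I/O.
const s3Like: ImageDownloadStrategy = {
  canHandle: (url) => url.includes('s3.amazonaws.com'),
  download: async () => 'data:image/png;base64,UzM=',
};
const httpFallback: ImageDownloadStrategy = {
  canHandle: (url) => /^https?:\/\//.test(url),
  download: async () => 'data:image/png;base64,SFRUUA==',
};

// Order matters: the specific strategy comes before the generic fallback.
const manager = new FirstMatchManager([s3Like, httpFallback]);
```

Because the HTTP fallback accepts any http:// or https:// URL, placing it first would shadow every more specific strategy, which is why it goes last in the list.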
| Element | Description | Styling Support |
|---|---|---|
| h1 - h6 | Headings (converted to DOCX heading styles) | ✅ |
| p | Paragraphs | ✅ text-align, color, etc. |
| div | Division container | ✅ |
| ul, ol | Unordered/Ordered lists | ✅ Nested lists supported |
| li | List items | ✅ |
| table | Tables | ✅ |
| tr | Table rows | ✅ |
| td, th | Table cells/headers | ✅ colspan, rowspan, background-color, vertical-align |
| thead, tbody | Table sections | ✅ |
| Element | Description | Styling Support |
|---|---|---|
| strong, b | Bold text | ✅ |
| em, i | Italic text | ✅ |
| u | Underlined text | ✅ |
| s | Strikethrough text | ✅ |
| sub | Subscript | ✅ |
| sup | Superscript | ✅ |
| span | Inline container | ✅ color, background-color, etc. |
| a | Hyperlinks | ✅ Creates clickable links |
| br | Line break | ✅ |
| img | Images | ✅ Auto-resize, multiple sources |
- Colors: hex values and named colors (#FF0000, red, darkblue)
- Text alignment: left, center, right, justify
- Vertical alignment: top, middle, bottom (table cells)

Images are automatically resized to fit within the library's size constraints while maintaining aspect ratio.
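- Base64 data URIs (data:image/png;base64,...)
- Image URLs: provide a strategyManager with appropriate strategies
- Custom sources: implement the ImageDownloadStrategy interface

Note: If no strategyManager is provided, only base64 images can be used; URL-based images require a strategyManager.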
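Resizing while keeping the aspect ratio comes down to applying one scale factor to both dimensions. The sketch below is illustrative only: fitWithin is a hypothetical helper, the 600 by 800 limits are made-up values (the package's actual limits are not given here), and never upscaling small images is an assumption.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale a size down so it fits inside the given bounds while keeping its aspect ratio.
// Images that already fit are returned unchanged (assumption: no upscaling).
function fitWithin(image: Size, max: Size): Size {
  const scale = Math.min(max.width / image.width, max.height / image.height, 1);
  return {
    width: Math.round(image.width * scale),
    height: Math.round(image.height * scale),
  };
}

// Hypothetical limits, chosen only for illustration.
const MAX_SIZE: Size = { width: 600, height: 800 };

console.log(fitWithin({ width: 1200, height: 900 }, MAX_SIZE)); // { width: 600, height: 450 }
console.log(fitWithin({ width: 300, height: 200 }, MAX_SIZE)); // { width: 300, height: 200 }
```

Taking the minimum of the two ratios ensures that whichever dimension overflows the most determines the scale, so neither bound is exceeded.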
transformHtmlToDocx(html: string, options?: HtmlToDocxOptions): Promise<XmlComponent[]>
Primary function to convert HTML to DOCX elements.
Parameters:
- html (string): HTML string to convert
- options (optional): Configuration options
  - strategyManager (ImageDownloadStrategyManager, optional): Custom image download strategy manager
Returns:
- Promise<XmlComponent[]>: Array of docx components ready to use in Document
Example:
// Without images or with base64 images only
const elements = await transformHtmlToDocx('<p>Hello</p>');
// With URL-based image support
const strategyManager = new ImageDownloadStrategyManager([
  new HttpImageDownloadStrategy()
]);
const elements = await transformHtmlToDocx('<p>Hello</p>', {
  strategyManager
});

transformHtmlToDocxSimple(html: string, options?: HtmlToDocxOptions): Promise<XmlComponent[]>
Simplified transformation for basic text rendering (wraps all content in paragraphs).
Parameters:
- Same as transformHtmlToDocx
Returns:
- Promise<XmlComponent[]>
Use Case: Simple text content without complex structure

textToDocx(text: string): Promise<XmlComponent[]>
Converts plain text to DOCX, preserving line breaks as <br /> tags.
Parameters:
- text (string): Plain text string
Returns:
- Promise<XmlComponent[]>

ImageDownloadStrategyManager
Manages multiple image download strategies using the Chain of Responsibility pattern.
Constructor:
new ImageDownloadStrategyManager(strategies?: ImageDownloadStrategy[])
Methods:
- addStrategy(strategy: ImageDownloadStrategy): void - Add a new strategy
- download(url: string): Promise<string> - Download image using first matching strategy

HttpImageDownloadStrategy
Default strategy for HTTP/HTTPS URLs.
Methods:
- canHandle(url: string): boolean - Returns true for http:// or https:// URLs
- download(url: string): Promise<string> - Downloads and returns base64 data URI

ImageDownloadStrategy Interface
Implement this interface to create custom image download strategies.
interface ImageDownloadStrategy {
  canHandle(url: string): boolean;
  download(url: string): Promise<string>;
}
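As one possible implementation of this interface, the sketch below reads images from local file:// URLs and returns a base64 data URI. LocalFileImageStrategy and the extension-to-MIME table are hypothetical and not part of the package. The interface is redeclared so the snippet is self-contained; in a real project it would be imported from html-docxjs-compiler.

```typescript
import { promises as fs } from 'node:fs';
import * as path from 'node:path';
import { fileURLToPath } from 'node:url';

// Redeclared locally for this sketch; normally imported from 'html-docxjs-compiler'.
interface ImageDownloadStrategy {
  canHandle(url: string): boolean;
  download(url: string): Promise<string>;
}

// Small, illustrative extension-to-MIME table (not exhaustive).
const MIME_BY_EXTENSION: Record<string, string> = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.gif': 'image/gif',
};

// Hypothetical strategy: resolves file:// URLs from the local disk.
class LocalFileImageStrategy implements ImageDownloadStrategy {
  canHandle(url: string): boolean {
    return url.startsWith('file://');
  }

  async download(url: string): Promise<string> {
    const filePath = fileURLToPath(url);
    const data = await fs.readFile(filePath);
    const extension = path.extname(filePath).toLowerCase();
    // Fall back to a generic type when the extension is not in the table.
    const mime = MIME_BY_EXTENSION[extension] ?? 'application/octet-stream';
    return `data:${mime};base64,${data.toString('base64')}`;
  }
}
```

An instance of such a class would be passed to the ImageDownloadStrategyManager constructor, ahead of any more general fallback strategy.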
The docx library uses XmlComponent objects to build Word documents. This package transforms HTML elements into these components:
// HTML
<p>Hello <strong>world</strong>!</p>
// Becomes
new Paragraph({
  children: [
    new TextRun({ text: "Hello " }),
    new TextRun({ text: "world", bold: true }),
    new TextRun({ text: "!" })
  ]
})
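To show how nested inline tags can collapse into a flat list of styled runs like the one above, here is a minimal sketch. It uses plain objects instead of the docx classes so that it runs without dependencies; InlineNode, RunSpec, TAG_FORMAT, and flattenToRuns are hypothetical names, and this is not the package's actual handler code.

```typescript
// Minimal node model: either text, or an element with a tag name and children.
type InlineNode =
  | { kind: 'text'; value: string }
  | { kind: 'element'; tag: string; children: InlineNode[] };

// Plain description of a run; bold and italics follow the docx TextRun option names.
interface RunSpec {
  text: string;
  bold?: boolean;
  italics?: boolean;
}

type Format = Omit<RunSpec, 'text'>;

// Formatting contributed by each tag (illustrative subset).
const TAG_FORMAT: Record<string, Format> = {
  strong: { bold: true },
  b: { bold: true },
  em: { italics: true },
  i: { italics: true },
};

// Walk the tree depth-first, carrying the formatting inherited from ancestor tags.
function flattenToRuns(nodes: InlineNode[], inherited: Format = {}): RunSpec[] {
  const runs: RunSpec[] = [];
  for (const node of nodes) {
    if (node.kind === 'text') {
      runs.push({ text: node.value, ...inherited });
    } else {
      const format: Format = { ...inherited, ...(TAG_FORMAT[node.tag] ?? {}) };
      runs.push(...flattenToRuns(node.children, format));
    }
  }
  return runs;
}

// Children of <p>Hello <strong>world</strong>!</p>
const paragraphChildren: InlineNode[] = [
  { kind: 'text', value: 'Hello ' },
  { kind: 'element', tag: 'strong', children: [{ kind: 'text', value: 'world' }] },
  { kind: 'text', value: '!' },
];

console.log(flattenToRuns(paragraphChildren));
```

Each resulting object could then be passed to a TextRun constructor, and the runs collected as the children of a Paragraph.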
- Paragraph - Block of text (from <p>, <div>, <h1>, etc.)
- TextRun - Styled text segment (from <span>, <strong>, etc.)
- Table - Table structure (from <table>)
- TableRow - Table row (from <tr>)
- TableCell - Table cell (from <td>, <th>)
- ImageRun - Embedded image (from <img>)
- ExternalHyperlink - Clickable link (from <a>)

Contributions are welcome! Please feel free to submit issues or pull requests.
This project is dual-licensed:
Personal / Non-Commercial Use
Free under an MIT-style non-commercial license.
You can use it for personal, educational, and other non-commercial projects.
Commercial Use
Commercial use requires a paid license (per legal entity / organization).
See LICENSE and LICENSE-COMMERCIAL.md for full terms.
After purchase you receive a license key and your payment receipt, which together serve as proof of license.
Is your use commercial?
If you're using this in a business, company, SaaS, client work, or any for-profit context, you should obtain a commercial license.
Questions or edge cases? Email jankostevanovic@gmail.com.
FAQs
Convert HTML to DOCXjs elements.
The npm package html-docxjs-compiler receives a total of 12 weekly downloads. As such, html-docxjs-compiler popularity was classified as not popular.
We found that html-docxjs-compiler demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.