htmless

CLI tool to clean and minify HTML by removing scripts, styles, and attributes — optimized for LLM input.

Version: 1.0.0

Version published: 12 months ago

Weekly downloads: 2

Maintainers: 1

Created: 12 months ago

htmless

Lighten your HTML input. Keep the meaning, ditch the weight.

🧠 What is it?

htmless is a minimalist CLI tool that strips HTML down to the bone — removing unnecessary scripts, styles, attributes, and utility classes. The result is a clean, minified HTML output, ideal for feeding into LLMs where every token counts.

🤔 Why was it created?

I needed to extract semantically valuable content from HTML pages and send it to AI models. But raw HTML is full of bloat — especially utility classes from frameworks like Tailwind, inline styles, scripts, and other things that eat tokens without adding real value.

The goals were simple:

Preserve document structure – headings, paragraphs, text emphasis
Keep href attributes on <a> tags – they carry semantic meaning and useful context
Eliminate noise
Make it fast, simple, and automatable
Follow the Unix philosophy — do one thing and do it well

🔧 Installation

pnpm add -g htmless
# or
npm install -g htmless

🚀 Usage

cat input.html | htmless

Use it in a bash pipeline, before LLM processing, or to clean up WYSIWYG HTML exports.
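For pipelines that are driven from a script rather than a shell, the same stdin-to-stdout pattern can be wrapped in a small helper. This is a minimal sketch in Python: `run_filter` is a hypothetical name, and the only thing it assumes about htmless is what the usage line above shows, namely that it reads HTML on stdin and writes the result to stdout.

```python
import subprocess


def run_filter(markup: str, command: tuple[str, ...] = ("htmless",)) -> str:
    """Pipe markup through a stdin-to-stdout filter and return its output.

    `run_filter` is an illustrative helper, not part of htmless. The default
    command assumes the `htmless` binary is installed and on PATH.
    """
    result = subprocess.run(
        command,
        input=markup,         # sent to the child process on stdin
        capture_output=True,  # collect stdout and stderr
        text=True,            # work with str instead of bytes
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# Example call (requires htmless to be installed):
#   with open("input.html", encoding="utf-8") as fh:
#       cleaned = run_filter(fh.read())
```

Passing the command as a parameter keeps the helper usable with any other filter that follows the same stdin/stdout convention.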

💡 Example

Input:

<div class="bg-white p-4 text-sm text-gray-700">
  <h1 class="text-3xl font-bold">Welcome</h1>
  <p>This is a <strong>test</strong>.</p>
  <script>alert('Hi')</script>
  <style>body { background: red; }</style>
</div>

Output:

<div><h1>Welcome</h1><p>This is a <strong>test</strong>.</p></div>

🛠️ What gets removed?

all HTML attributes (class, id, style, data-*, etc.), except href on <a>, which is preserved
<script> and <style> blocks, including their contents
comments and whitespace between tags
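The removal rules listed above can be illustrated with a short Python sketch. It is not the htmless implementation (which is TypeScript on htmlparser2); it uses Python's standard `html.parser` to show the same idea. The names `AttributeStripper` and `strip_html` are hypothetical, and the whitespace handling (dropping whitespace-only text and collapsing runs to one space) is an assumption based on the example output.

```python
import html
import re
from html.parser import HTMLParser

# Elements dropped together with their contents.
DROPPED_ELEMENTS = {"script", "style"}
# Void elements never get a closing tag in the output.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class AttributeStripper(HTMLParser):
    """Illustrative re-implementation of the rules, not the tool's code."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_ELEMENTS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = ""
        if tag == "a":
            # The one exception: keep href on <a>.
            href = dict(attrs).get("href")
            if href is not None:
                kept = f' href="{html.escape(href, quote=True)}"'
        self.parts.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROPPED_ELEMENTS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in VOID_ELEMENTS:
            return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Assumption: whitespace-only text is noise; other runs collapse.
        if self.skip_depth or not data.strip():
            return
        self.parts.append(html.escape(re.sub(r"\s+", " ", data), quote=False))

    # Comments and doctype declarations are ignored by the default
    # HTMLParser handlers, so they never reach the output.


def strip_html(markup: str) -> str:
    parser = AttributeStripper()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)
```

Collapsing whitespace everywhere is a simplification: it would also alter text inside elements such as <pre>, where a real tool might need to preserve it.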

🔎 Who is this for?

developers working with LLMs and prompt engineering
anyone who needs to get meaningful content from HTML without the fluff
scripting, scraping, automation pipelines

🧪 Tech info

built on top of htmlparser2 — fast and robust
outputs valid HTML (not plaintext)
written in TypeScript, clean CLI with commander

🧘 Philosophy

Less is more. Tokens are expensive. htmless helps LLMs process content, not the wrapper.

👤 Author

Made with ❤️ by BroJor

Package last updated on 15 Apr 2025
