pypi

HTML Parser

markdown

Python implementation of John Gruber's Markdown.

markdown-parser

python-markdown

markdown-to-html

html5lib

HTML parser based on the WHATWG HTML specification

tinyhtml5

HTML parser based on the WHATWG HTML specification

htmldate

Fast and robust extraction of original and updated publication dates from URLs and web pages.

entity-extraction

html-extraction

metadata-extraction

selectolax

A fast HTML5 parser with CSS selectors, written in Cython, using Modest and Lexbor engines.

readability-lxml

Fast HTML-to-text parser (article readability tool) with Python 3 support

pyromark

Blazingly fast Markdown parser

html5rdf

HTML parser based on the WHATWG HTML specification

breadability

Port of Readability HTML parser in Python

htmd-py

Python bindings for the htmd Rust library, a fast HTML to Markdown converter

sec-parser

Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual structure of the document.

htmllistparse

Python parser for Apache/nginx-style HTML directory listings.

apache nginx listing fuse

djc-core-html-parser

HTML parser used by django-components written in Rust.

reliq

Python ctypes bindings for reliq

text-processing

markupever

The fast, most optimal, and correct HTML & XML parsing library.

html-table-parser-python3

A small and simple HTML table parser not requiring any external dependency.

comrak-ext

Extended Python bindings for the Comrak Rust library, a fast CommonMark/GFM parser

ssc-codegen

Python DSL code converter that generates HTML parsers for web scraping

fast-scrape

High-performance HTML parsing library for Python

lukeparser

The Style of Markdown with the Power of LaTeX.

html5-parser

Fast C-based HTML 5 parsing for Python

comrak

Python bindings for the Comrak Rust library, a fast CommonMark/GFM parser

html-parser

No description provided.

domonic

A Python DOM far beyond minidom, with HTML, SVG, events, web APIs, and a JavaScript-like runtime.

metadata-parser

A module to parse metadata out of urls and html documents

opengraph protocol facebook

advancedhtmlparser

A Powerful HTML Parser/Scraper/Validator/Formatter that constructs a modifiable, searchable DOM tree, and includes many standard JS DOM functions (getElementsBy*, appendChild, etc) and additional methods

getElementsByName

llama-index-packs-code-hierarchy

A node parser which can create a hierarchy of all code scopes in a directory.

whiskeysour

High-performance BeautifulSoup replacement written in Rust

html5lib-modern

HTML parser based on the WHATWG HTML specification

pykami

A Python module that parses KAMI into HTML

rm-html5-parser

Fast C-based HTML 5 parsing for Python; a fork of Kovid Goyal's html5-parser

njsparser

A Python parser for Next.js data embedded in HTML

haruka-parser

A simple HTML Parser

bs2json

Convert bs4 Tags into JSON

docx-parser-converter

A library for converting DOCX files to HTML and plain text

google-parser

Convert HTML to snippets

api-telegraph

Python SDK for Telegraph (telegra.ph) API with sync and async support

yandex-parser

Parse HTML content of Yandex

whatsapp-chat-exporter

A WhatsApp database parser that provides a history of your WhatsApp conversations in HTML and JSON. Android, iOS, iPadOS, Crypt12, Crypt14, and Crypt15 are supported.

htmldom

HTML parser which can be used for web-scraping applications

blowdrycss

The atomic CSS compiler

blowdry blowdrycss css compiler pre-compiler pre-processor generator dry cascading style sheets html encoded class selector parser optimizer internet

nextjs-hydration-parser

A Python library for extracting and parsing Next.js hydration data from HTML content

html5

HTML parser based on the WHATWG HTML specification

html-parser-ai-mcp

HTML Parser AI automation via MCP. Includes link extraction, text extraction, and HTML validation. By MEOK AI Labs.

html-template-parser

A parser for HTML templates.

groupdocs-parser-net

GroupDocs.Parser for Python via .NET is a powerful API designed for advanced document parsing, offering extensive features like text extraction, metadata retrieval, and image extraction across various document formats, including PDFs, Word, Excel, and PowerPoint.

extract-content

prop-request

HTTP request tool with a little functionality

crawler parser html

scrapery

Scrapery: A fast, lightweight library to scrape HTML, XML, and JSON using XPath, CSS selectors, and intuitive DOM navigation.

quick-crawler

A toolkit for quickly performing crawler functions

edwh-editorjs

EditorJS.py

Made with ⚡️ by Socket Inc
