HTML parser based on the WHATWG HTML specification
Fast and robust extraction of original and updated publication dates from URLs and web pages.
A fast HTML5 parser with CSS selectors, written in Cython, using Modest and Lexbor engines.
Fast HTML-to-text parser (article readability tool) with Python 3 support
High-performance HTML to Markdown converter powered by Rust with a clean Python API
HTML parser based on the WHATWG HTML specification
HTML parser written in Rust, used by django-components.
Port of Readability HTML parser in Python
Python parser for Apache/nginx-style HTML directory listings.
A small and simple HTML table parser with no external dependencies.
Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual structure of the document.
Extended Python bindings for the Comrak Rust library, a fast CommonMark/GFM parser
The fast, most optimal, and correct HTML & XML parsing library.
Python ctypes bindings for reliq
HTML to Markdown converter
UNKNOWN
A module to parse metadata out of URLs and HTML documents
Fast C-based HTML5 parsing for Python
HTML parser based on the WHATWG HTML specification
A library for converting DOCX documents to HTML and plain text
Lightning-fast HTML parser and data extractor with WebPage API - BeautifulSoup alternative built in Rust
Universal metadata extraction library supporting 13 formats (HTML Meta, Open Graph, Twitter Cards, JSON-LD, Microdata, Microformats, RDFa, Dublin Core, Web App Manifest, oEmbed, rel-links, Images, SEO) with 7 language bindings
HTML parser based on the WHATWG HTML specification
A Python library for extracting and parsing Next.js hydration data from HTML content
A simple HTML parser
Convert HTML to snippets
The Style of Markdown with the Power of LaTeX.
Finite-State Markdown Engine - O(N) single-pass Markdown to HTML converter using pure FST
Python bindings for Gumbo HTML parser
A parser that extracts articles from any URL or HTML
Web Crawler, HTML Parser, and Data Visualization
A Python parser for Next.js data embedded in HTML
EditorJS.py
Pure-Python HTML parser with ElementTree support.
A node parser which can create a hierarchy of all code scopes in a directory.
A toolkit for quickly performing crawler functions