Python implementation of John Gruber's Markdown.
HTML parser based on the WHATWG HTML specification
Fast and robust extraction of original and updated publication dates from URLs and web pages.
A fast HTML5 parser with CSS selectors, written in Cython, using Modest and Lexbor engines.
Fast HTML-to-text parser (article readability tool) with Python 3 support
HTML parser based on the WHATWG HTML specification
Port of Readability HTML parser in Python
Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual structure of the document.
Extended Python bindings for the Comrak Rust library, a fast CommonMark/GFM parser
Python parser for Apache/nginx-style HTML directory listings.
HTML parser used by django-components written in Rust.
Python ctypes bindings for reliq
TestQL with endpoint detection, OpenAPI, SUMD generation, SUMD parser and HTML report generation
The fast, most optimal, and correct HTML & XML parsing library.
A small and simple HTML table parser not requiring any external dependency.
Converter from Python DSL code to HTML parsers for web scraping
The Style of Markdown with the Power of LaTeX.
High-performance HTML parsing library for Python
Fast C-based HTML5 parsing for Python
A module to parse metadata out of URLs and HTML documents
A powerful HTML parser/scraper/validator/formatter that constructs a modifiable, searchable DOM tree and includes many standard JS DOM functions (getElementsBy*, appendChild, etc.) as well as additional methods
High-performance BeautifulSoup replacement written in Rust
HTML parser based on the WHATWG HTML specification
Fast C-based HTML5 parsing for Python; a fork of Kovid Goyal's html5-parser
A Python parser for Next.js data embedded in HTML
A Python library for extracting and parsing Next.js hydration data from HTML content
Parse HTML content from Yandex
HTML parser based on the WHATWG HTML specification
Convert HTML to snippets
A node parser which can create a hierarchy of all code scopes in a directory.
EditorJS.py
GroupDocs.Parser for Python via .NET is a powerful API designed for advanced document parsing, offering extensive features like text extraction, metadata retrieval, and image extraction across various document formats, including PDFs, Word, Excel, and PowerPoint.
HTTP request tool with a little functionality
A simple HTML Parser
Scrapery: A fast, lightweight library to scrape HTML, XML, and JSON using XPath, CSS selectors, and intuitive DOM navigation.
A toolkit for quickly performing crawler functions