pytidylib

Python wrapper for HTML Tidy (tidylib) on Python 2 and 3

0.3.2
PyPI

Maintainers: 1

PyTidyLib_ is a Python package that wraps the HTML Tidy_ library. This allows you, from Python code, to "fix" invalid (X)HTML markup. Some of the library's many capabilities include:

Clean up unclosed tags and unescaped characters such as ampersands
Output HTML 4 or XHTML, strict or transitional, and add missing doctypes
Convert named entities to numeric entities, which can then be used in XML documents without an HTML doctype.
Clean up HTML from programs such as Word (to an extent)
Indent the output, including proper (i.e. no) indenting for pre elements, which some (X)HTML indenting code overlooks.

Changes

0.3.2: Initialization bug fix
0.3.1: find_library support while still allowing a list of library names
0.3.0: Refactored to use Tidy and PersistentTidy classes while keeping the functional interface (which will lazily create a global Tidy() object) for backward compatibility. You can now pass a list of library names and base options when instantiating Tidy. The keep_doc argument is now deprecated and does nothing; use PersistentTidy.
0.2.4: Bugfix for a strange memory allocation corner case in Tidy.
0.2.3: Python 3 support (2 + 3 cross compatible) with passing Tox tests.

Small example of use

The following code cleans up an invalid HTML document and sets an option::

from tidylib import tidy_document
document, errors = tidy_document('''<p>f&otilde;o <img src="bar.jpg">''',
  options={'numeric-entities':1})
print document
print errors

Docs

Documentation is shipped with the source distribution and is available at the PyTidyLib_ web page.

.. _HTML Tidy: http://tidy.sourceforge.net/ .. _PyTidyLib: http://countergram.com/open-source/pytidylib/

FAQs

What is pytidylib?

Is pytidylib well maintained?

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

pytidylib

Changes

Small example of use

Docs

Related posts

Typosquatted Go Packages Deliver Malware Loader Targeting Linux and macOS Systems

Bybit Hack Puts Crypto Losses at $1.6B, Surpassing All of Last Year in Just Two Months