html-to-markdown

Convert HTML to markdown

  • 1.2.0
  • PyPI

html_to_markdown

This library is a refactored and modernized fork of markdownify, supporting Python 3.9 and above.

Differences from markdownify

  • The refactored codebase uses a strict functional approach - no classes are involved.
  • The library is fully typed, passes MyPy in strict mode, and includes a py.typed file.
  • The convert_to_markdown function allows passing a pre-configured instance of BeautifulSoup instead of html.
  • Releases follow standard semver. Version v1.0.0 was branched from markdownify's v0.13.1; from that point on, the version numbers of the two projects are no longer aligned.

Installation

pip install html_to_markdown

Usage

Convert an HTML string to Markdown:

from html_to_markdown import convert_to_markdown

convert_to_markdown('<b>Yay</b> <a href="http://github.com">GitHub</a>')  # > '**Yay** [GitHub](http://github.com)'

Or pass a pre-configured instance of BeautifulSoup:

from bs4 import BeautifulSoup
from html_to_markdown import convert_to_markdown

soup = BeautifulSoup('<b>Yay</b> <a href="http://github.com">GitHub</a>', 'lxml')  # lxml requires an extra dependency.

convert_to_markdown(soup)  # > '**Yay** [GitHub](http://github.com)'
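
One reason to pass a pre-configured BeautifulSoup instance is to clean the tree before conversion. The following is a minimal sketch, not part of the library: the helper name convert_without_scripts is hypothetical, and it uses the 'html.parser' backend from the standard library so that lxml is not needed.

```python
def convert_without_scripts(html: str) -> str:
    """Hypothetical helper: drop <script> and <style> elements, then convert."""
    # Third-party imports: beautifulsoup4 and html_to_markdown must be installed.
    from bs4 import BeautifulSoup
    from html_to_markdown import convert_to_markdown

    # 'html.parser' ships with Python, so no extra parser dependency is required.
    soup = BeautifulSoup(html, "html.parser")

    # Remove elements that should never reach the Markdown output.
    for tag in soup.find_all(["script", "style"]):
        tag.decompose()

    # Hand the already-parsed tree to the converter instead of a string.
    return convert_to_markdown(soup)
```

Calling convert_without_scripts('<p>Hi</p><script>alert(1)</script>') should then convert only the paragraph, since the script element is removed from the tree first.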

Options

The convert_to_markdown function accepts the following kwargs:

  • autolinks (bool): Automatically convert valid URLs into Markdown links. Defaults to True.
  • bullets (str): A string of characters to use for bullet points in lists. Defaults to '*+-'.
  • code_language (str): Default language identifier for fenced code blocks. Defaults to an empty string.
  • code_language_callback (Callable[[Any], str] | None): Function to dynamically determine the language for code blocks.
  • convert (Iterable[str] | None): A list of tag names to convert to Markdown. If None, all supported tags are converted.
  • default_title (bool): Use the default title when converting certain elements (e.g., links). Defaults to False.
  • escape_asterisks (bool): Escape asterisks (*) to prevent unintended Markdown formatting. Defaults to True.
  • escape_misc (bool): Escape miscellaneous characters to prevent conflicts in Markdown. Defaults to True.
  • escape_underscores (bool): Escape underscores (_) to prevent unintended italic formatting. Defaults to True.
  • heading_style (Literal["underlined", "atx", "atx_closed"]): The style to use for Markdown headings. Defaults to "underlined".
  • keep_inline_images_in (Iterable[str] | None): Tags in which inline images should be preserved. Defaults to None.
  • newline_style (Literal["spaces", "backslash"]): Style for handling newlines in text content. Defaults to "spaces".
  • strip (Iterable[str] | None): Tags to strip from the output. Defaults to None.
  • strong_em_symbol (Literal["*", "_"]): Symbol to use for strong/emphasized text. Defaults to "*".
  • sub_symbol (str): Custom symbol for subscript text. Defaults to an empty string.
  • sup_symbol (str): Custom symbol for superscript text. Defaults to an empty string.
  • wrap (bool): Wrap text to the specified width. Defaults to False.
  • wrap_width (int): The number of characters at which to wrap text. Defaults to 80.
  • convert_as_inline (bool): Treat the content as inline elements (no block elements like paragraphs). Defaults to False.
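
As a rough illustration of how several of these kwargs might be combined, the sketch below bundles a few options into a dictionary and adds a code_language_callback. Everything named here (language_from_class, ATX_OPTIONS, convert_with_atx_headings) is hypothetical. It also assumes that the callback receives the code block's element and that this element offers a dict-like .get(), as BeautifulSoup tags do; the option list above only specifies Callable[[Any], str].

```python
from typing import Any


def language_from_class(element: Any) -> str:
    """Return "python" for class="language-python", or "" if no such class exists.

    Assumption: the element passed to the callback exposes a dict-like .get().
    """
    classes = element.get("class") or []
    if isinstance(classes, str):  # tolerate a plain space-separated string
        classes = classes.split()
    for name in classes:
        if name.startswith("language-"):
            return name[len("language-"):]
    return ""


# Illustrative values only; every key comes from the options list above.
ATX_OPTIONS: dict[str, Any] = {
    "heading_style": "atx",
    "bullets": "-",
    "strip": ["a"],
    "code_language_callback": language_from_class,
}


def convert_with_atx_headings(html: str) -> str:
    """Hypothetical wrapper that applies ATX_OPTIONS to every conversion."""
    # Requires the html_to_markdown package to be installed.
    from html_to_markdown import convert_to_markdown

    return convert_to_markdown(html, **ATX_OPTIONS)
```

Keeping the options in one dictionary makes it easy to reuse the same settings across calls; the values shown are examples, not recommendations.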

CLI

For compatibility with the original markdownify, a CLI is provided. Use html_to_markdown example.html > example.md or pipe input from stdin:

cat example.html | html_to_markdown > example.md

Use html_to_markdown -h to see all available options. They are the same as listed above and take the same arguments.
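
The CLI can also be driven from a script. The sketch below mirrors the stdin pipe shown above using Python's subprocess module. The function name convert_via_cli is hypothetical, and the code assumes the html_to_markdown executable is on PATH and writes the result to stdout, as the shell redirection above suggests.

```python
import subprocess


def convert_via_cli(html: str) -> str:
    """Hypothetical helper: pipe HTML to the html_to_markdown CLI via stdin."""
    result = subprocess.run(
        ["html_to_markdown"],  # assumed to be installed and on PATH
        input=html,            # sent to the process on stdin
        capture_output=True,   # collect stdout and stderr
        text=True,             # work with str instead of bytes
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```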
