djc-core-html-parser

HTML parser used by django-components written in Rust.

1.0.1
PyPI

djc-core-html-parser

HTML parser used by django-components. Written in Rust, exposed as a Python package with maturin.

This implementation was found to be 40-50x faster than our Python implementation, taking ~90ms to parse 5 MB of HTML.

Installation

pip install djc-core-html-parser

Usage

from djc_core_html_parser import set_html_attributes

html = '<div><p>Hello</p></div>'
result, _ = set_html_attributes(
  html,
  # Add attributes to the root elements
  root_attributes=['data-root-id'],
  # Add attributes to all elements
  all_attributes=['data-v-123'],
)

To avoid re-parsing the HTML, set_html_attributes returns not only the transformed HTML but also a dictionary as the second item.

This dictionary records which HTML attributes were written to which elements.

To populate this dictionary, you need to set watch_on_attribute to an attribute name.

Then, during the HTML transformation, we check each element for this attribute. If the element has this attribute, we:

  • Get the value of said attribute
  • Record the attributes that were added to the element, using the value of the watched attribute as the key.

from djc_core_html_parser import set_html_attributes

html = """
  <div data-watch-id="123">
    <p data-watch-id="456">
      Hello
    </p>
  </div>
"""

result, captured = set_html_attributes(
  html,
  # Add attributes to the root elements
  root_attributes=['data-root-id'],
  # Add attributes to all elements
  all_attributes=['data-djc-tag'],
  # Watch for this attribute on elements
  watch_on_attribute='data-watch-id',
)

print(captured)
# {
#   '123': ['data-root-id', 'data-djc-tag'],
#   '456': ['data-djc-tag'],
# }
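
To make the root/all distinction and the watch mechanism concrete, here is a minimal pure-Python sketch of the same idea built on the standard library's html.parser. It is not the Rust implementation and makes no claim about its exact output format. The names AttributeSetter and set_html_attributes_sketch are hypothetical, and rendering the added attributes as bare names (without ="") is an assumption of this sketch.

```python
from html import escape
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AttributeSetter(HTMLParser):
    """Hypothetical stand-in that mimics the documented behaviour."""

    def __init__(self, root_attributes, all_attributes, watch_on_attribute=None):
        super().__init__(convert_charrefs=False)
        self.root_attributes = list(root_attributes)
        self.all_attributes = list(all_attributes)
        self.watch_on_attribute = watch_on_attribute
        self.depth = 0       # depth 0 means the next start tag is a root element
        self.out = []        # pieces of the transformed HTML
        self.captured = {}   # watched value -> attributes added to that element

    def _emit_start(self, tag, attrs, closing=">"):
        # Root elements get root_attributes first, then every element gets all_attributes.
        added = (self.root_attributes if self.depth == 0 else []) + self.all_attributes
        parts = [tag]
        for name, value in attrs:
            parts.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
            if name == self.watch_on_attribute and value is not None:
                self.captured[value] = list(added)
        parts.extend(added)  # assumption: added attributes are rendered as bare names
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")
        if tag not in VOID_TAGS:
            self.depth = max(0, self.depth - 1)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def set_html_attributes_sketch(html, root_attributes, all_attributes, watch_on_attribute=None):
    """Return (transformed_html, captured), mirroring the shape of set_html_attributes."""
    parser = AttributeSetter(root_attributes, all_attributes, watch_on_attribute)
    parser.feed(html)
    parser.close()
    return "".join(parser.out), parser.captured


if __name__ == "__main__":
    html = '<div data-watch-id="123"><p data-watch-id="456">Hello</p></div>'
    result, captured = set_html_attributes_sketch(
        html, ["data-root-id"], ["data-djc-tag"], "data-watch-id"
    )
    print(result)
    print(captured)
```

Tracking the nesting depth is what separates root elements from nested ones in this sketch: only start tags seen at depth 0 receive root_attributes. The real package does this work in Rust, which is where the speedup mentioned above comes from.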

Development

  • Set up and activate a Python environment

    python -m venv .venv
    source .venv/bin/activate  # on Windows: .venv\Scripts\activate
    
  • Install dependencies

    pip install -r requirements-dev.txt
    

    The dev requirements also include maturin, which is used to package a Rust project as a Python package.

  • Install Rust

    See https://www.rust-lang.org/tools/install

  • Run Rust tests

    cargo test
    
  • Build the Python package

    maturin develop
    

    To build the production-optimized package, use maturin develop --release.

  • Run Python tests

    pytest
    

    NOTE: When running Python tests, you need to run maturin develop first.

Deployment

Deployment is done automatically via GitHub Actions.

To publish a new version of the package, you need to:

  • Bump the version in pyproject.toml and Cargo.toml
  • Open a PR and merge it to main.
  • Create a new tag on the main branch with the new version number (e.g. v1.0.0), or create a new release in the GitHub UI.

Keywords

django
