2024.11.08: pyright users will see a warning if a single string is supplied where a collection of strings is expected (tuple, set, list etc). In terms of typing, a single str is itself valid as a Sequence, so type checkers normally would not raise an alarm when str is used in such function parameters, even though it can induce unexpected runtime behavior. See #64 for more info. mypy does not support this feature (which (ab)uses the @deprecated warning).
2024.08.07: mypy 1.11 is required if one chooses to use mypy for type checking. The pyright version requirement (1.1.351) is not changed.
This repository contains external type annotations for lxml. It can be used by type-checking tools (currently supporting pyright and mypy) to check code that uses lxml, or used within IDEs like VSCode to facilitate development.
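The 2024.11.08 note about single strings can be illustrated with a minimal sketch. The function wrap_tags below is a hypothetical helper written for this illustration and is not part of lxml or of these annotations; it only shows why a bare str passes type checking as a Sequence of str yet behaves differently at runtime.

```python
from collections.abc import Sequence


def wrap_tags(tags: Sequence[str]) -> list[str]:
    """Hypothetical helper (not part of lxml): wrap each tag name in angle brackets."""
    return [f"<{tag}>" for tag in tags]


# Intended usage: a real collection of strings.
from_list = wrap_tags(["div", "span"])

# A bare str is also a Sequence[str], so type checkers accept this call,
# but at runtime the string is iterated character by character.
from_str = wrap_tags("div")

print(from_list)  # ['<div>', '<span>']
print(from_str)   # ['<d>', '<i>', '<v>']
```

The second call is the kind of mistake the pyright warning is meant to surface.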
The coverage of lxml submodules is now complete (unless intentionally rejected, see further below), so the annotations are no longer considered partial:
lxml.etree
lxml.html
lxml.html.builder
lxml.html.clean (already removed in lxml 5.2.0; this project will follow suit in the future)
lxml.html.diff
lxml.html.html5parser
lxml.html.soupparser
lxml.isoschematron
lxml.objectify
lxml.builder
lxml.cssselect
lxml.sax
lxml.ElementInclude
The following submodules will not be implemented due to irrelevance to type checking or other reasons:
lxml.etree.Schematron (obsolete and superseded by lxml.isoschematron)
lxml.usedoctest
lxml.html.usedoctest
lxml.html.formfill (shouldn't have existed; this would belong to HTTP libraries like requests or httpx)
Check out the project page for future plans and progress.
Currently the annotations are validated for both pyright and mypy, with pyright recommended because of its greater flexibility and early adoption of newer type checking features.
In the future, there are plans to support even more type checkers.
lxml-stubs contributions are reviewed thoroughly, bringing coherency of annotation across the whole package.
Despite having no official PEP, some IDEs support showing docstrings from external annotations. This package tries to bring type-annotation-specific docstrings for some lxml classes and functions, explaining how they can be used. The following screenshots show what this looks like in Visual Studio Code, behaving as if the docstrings came from real Python code:
Besides docstrings, the current annotations are geared towards convenience for code writers instead of absolute logical 'correctness'. The deviation of class inheritance for HtmlComment and friends is one prominent example.
The normal choice for most people is to fetch the package from PyPI, like:
uv pip install -U types-lxml # using uv
pip install -U types-lxml # using pip
In the unlikely case that PyPI is down, one can directly download the wheel from the latest release on GitHub, and then install it as a local file.
As a convenience, it is possible to pull in a type checker directly with extras (the quotes keep shells such as zsh from treating the brackets as a glob pattern):
uv pip install -U "types-lxml[pyright]"
pip install -U "types-lxml[mypy]"
Since the 2024.08.07 release, there are two versions of types-lxml. The first one is the default; if there's no problem using it, there's no need to switch.
The second version, types-lxml-multi-subclass, is intended for a specific need, namely the creation of multiple lxml element subclasses. For example:
graph TD;
etree.ElementBase-->MyBaseElement;
MyBaseElement-->MySubElement1;
MyBaseElement-->MySubElement2;
If a parsed or constructed element tree consists of a single type of element node, it is safe to assume that the children or parent of a node are of the same type too. But this assumption does not hold for multiple subclasses. Using the diagram above as an example, calling the .iter() method on a MyBaseElement node may produce elements of any subclass, or even MyBaseElement itself. Therefore the output type should simply be MyBaseElement.
Such a scenario is already in effect for lxml.html. A <form> element (FormElement) is supposed to contain other form-related tags like <input>, <select> etc. But we can't possibly pinpoint a single subelement type, so <form> children can only be of type HtmlElement. The multiple-subclass scenario is already hardcoded for HtmlElement and ObjectifiedElement within this annotation package, but users may choose to have their own overridden element subclasses (inheriting from ElementBase) too.
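The typing problem described above can be sketched without lxml at all. The classes below are plain-Python stand-ins that reuse the names from the diagram; they are assumptions made for illustration and do not use the real etree.ElementBase API. The point is only that a tree mixing several subclasses yields mixed node types, so the common base class is the only safe annotation for what iter() returns.

```python
from collections.abc import Iterator


class MyBaseElement:
    """Stand-in for an etree.ElementBase subclass; NOT the real lxml API."""

    def __init__(self, tag: str, *children: "MyBaseElement") -> None:
        self.tag = tag
        self.children = list(children)

    def iter(self) -> Iterator["MyBaseElement"]:
        # Yield this node first, then every descendant in document order.
        yield self
        for child in self.children:
            yield from child.iter()


class MySubElement1(MyBaseElement):
    pass


class MySubElement2(MyBaseElement):
    pass


# A tree whose nodes are a mix of the base class and both subclasses.
root = MySubElement1(
    "root",
    MySubElement2("a"),
    MyBaseElement("b", MySubElement1("c")),
)

# Even though root is a MySubElement1, iteration produces all three types.
seen = [type(node).__name__ for node in root.iter()]
print(seen)
```

Had iter() been annotated as returning the type of the node it was called on, a type checker would wrongly treat every item here as MySubElement1.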
The two paradigms can't coexist within a single type annotation package. See bug #51, which illustrates why multiple builds are necessary.
Remember that one can only choose one of the two builds. It is impossible to install both, as pip simply overwrites conflicting files from one package with those of the other. If in doubt, remove the existing package first, then install the one you need.
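When it is unclear which build is present in an environment, the installed distribution metadata can be inspected. The following is a minimal sketch using only the standard library; the helper name installed_builds is hypothetical, and the two distribution names are taken from the text above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as given in this README.
BUILDS = ("types-lxml", "types-lxml-multi-subclass")


def installed_builds(names: tuple[str, ...] = BUILDS) -> dict[str, str]:
    """Hypothetical helper: map each installed distribution name to its version."""
    found: dict[str, str] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # This build is not installed in the current environment.
            continue
    return found


if __name__ == "__main__":
    builds = installed_builds()
    print(builds)
    if len(builds) > 1:
        print("Both builds are registered; remove both and reinstall only one.")
```

If both names are reported, the files on disk are likely a mixture of the two builds, which is the situation the paragraph above warns about.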
Since 2024.11.08, users can download types-lxml release files and verify that they do indeed originate from GitHub. For those who haven't heard of attestation, it is sort of like gnupg or minisign signatures, but backed by GitHub infrastructure.
After downloading the release wheel file (say with pip download types-lxml, or by browser access to PyPI directly), one can use the GitHub CLI to verify that it comes from this GitHub repository without being altered:
gh at verify types_lxml-2024.11.8-py3-none-any.whl --repo abelcheung/types-lxml
This should generate the following result:
Loaded digest sha256:4b4fa7f9e2f1d5f58b98ac9852a75927e4e0f69363249f9cebc78db095c046e0 for file://types_lxml-2024.11.8-py3-none-any.whl
Loaded 1 attestation from GitHub API
✓ Verification succeeded!
sha256:4b4fa7f9e2f1d5f58b98ac9852a75927e4e0f69363249f9cebc78db095c046e0 was attested by:
REPO PREDICATE_TYPE WORKFLOW
abelcheung/types-lxml https://slsa.dev/provenance/v1 .github/workflows/release.yml@refs/tags/2024.11.08
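The digest printed in the output above is a SHA-256 hash of the wheel file, so it can also be cross-checked locally. The sketch below assumes the "sha256:<hex>" notation seen in that output; the helper names are hypothetical. It only compares hashes and is not a substitute for the attestation check itself, which also validates where the file was built.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_attested_digest(path: Path, attested: str) -> bool:
    """Compare a file against a digest written as 'sha256:<hex>'."""
    algorithm, _, hexdigest = attested.partition(":")
    return algorithm == "sha256" and sha256_of_file(path) == hexdigest.lower()
```

For the release shown above, one would call matches_attested_digest(Path("types_lxml-2024.11.8-py3-none-any.whl"), "sha256:4b4fa7f9e2f1d5f58b98ac9852a75927e4e0f69363249f9cebc78db095c046e0") in the directory containing the downloaded wheel.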
Type annotations for lxml were initially included in typeshed, but as they were still incomplete at that time, the stubs were split out into a separate project. The code was under the governance of lxml from then on, until 2022, when this fork set out to revamp lxml-stubs completely and became a separate project.
types-lxml is a fork of lxml-stubs that strives for the goals described above, so that most people would find it more useful.