urlstd

urlstd is a Python implementation of the WHATWG URL Living Standard.
This library provides the URL class, the URLSearchParams class, and low-level APIs that comply with the URL specification.
Supported APIs
Basic Usage
To parse a string into a URL:
from urlstd.parse import URL
URL('http://user:pass@foo:21/bar;par?b#c')
To parse a string into a URL using a base URL:
url = URL('?ffi&🌈', base='http://example.org')
url
url.search
params = url.search_params
params
params.sort()
params
url.search
str(url)
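The URL Standard specifies that sorting search parameters is a stable sort on the UTF-16 code units of each name, which is why `🌈` ends up before `ﬃ` in the example above. The following is a minimal standard-library sketch of that ordering rule, not urlstd's own implementation; the helper name `sort_by_utf16_code_units` is hypothetical.

```python
from urllib.parse import parse_qsl


def sort_by_utf16_code_units(pairs):
    """Stable sort of (name, value) pairs by the UTF-16 code units of each name.

    Hypothetical helper for illustration only; it is not part of urlstd.
    """
    # Big-endian UTF-16 bytes compare in the same order as the code units.
    return sorted(pairs, key=lambda pair: pair[0].encode("utf-16-be"))


# keep_blank_values=True keeps names that have no "=value" part.
pairs = parse_qsl("ﬃ&🌈", keep_blank_values=True)

by_code_unit = sort_by_utf16_code_units(pairs)
by_code_point = sorted(pairs, key=lambda pair: pair[0])

print(by_code_unit)   # 🌈 first: its leading surrogate 0xD83C < 0xFB03
print(by_code_point)  # ﬃ first: U+FB03 < U+1F308
```

Python compares strings by code point, so a plain `sorted()` gives a different order for names outside the Basic Multilingual Plane.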
To validate a URL string:
from urlstd.parse import URL, URLValidator, ValidityState
URL.can_parse('https://user:password@example.org/')
URLValidator.is_valid('https://user:password@example.org/')
validity = ValidityState()
URLValidator.is_valid('https://user:password@example.org/', validity=validity)
validity.valid
validity.validation_errors
validity.descriptions[0]
URL.can_parse('file:///C|/demo')
URLValidator.is_valid('file:///C|/demo')
validity = ValidityState()
URLValidator.is_valid('file:///C|/demo', validity=validity)
validity.valid
validity.validation_errors
validity.descriptions[0]
To parse a string into a urllib.parse.ParseResult using a base URL:
import html
from urllib.parse import unquote
from urlstd.parse import urlparse
pr = urlparse('?aÿb', base='http://example.org/foo/', encoding='utf-8')
pr
unquote(pr.query)
pr = urlparse('?aÿb', base='http://example.org/foo/', encoding='windows-1251')
pr
unquote(pr.query, encoding='windows-1251')
html.unescape('a&#255;b')
pr = urlparse('?aÿb', base='http://example.org/foo/', encoding='windows-1252')
pr
unquote(pr.query, encoding='windows-1252')
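The three encodings differ in how `ÿ` (U+00FF) is percent-encoded: UTF-8 uses two bytes, windows-1252 uses one, and windows-1251 cannot represent it, in which case the URL Standard falls back to a percent-encoded numeric character reference. The sketch below approximates this with the standard library only. It assumes `errors="xmlcharrefreplace"` as a stand-in for that fallback and uses a hypothetical helper `encode_query`; it does not reproduce the exact percent-encode sets used by urlstd.

```python
import html
from urllib.parse import quote, unquote


def encode_query(text, encoding):
    """Percent-encode text after converting it to the given encoding.

    Hypothetical helper for illustration only; it is not part of urlstd.
    """
    # xmlcharrefreplace turns unencodable characters into &#NNN; references,
    # which are then percent-encoded like any other bytes.
    return quote(text, safe="", encoding=encoding, errors="xmlcharrefreplace")


utf8 = encode_query("aÿb", "utf-8")
cp1251 = encode_query("aÿb", "windows-1251")
cp1252 = encode_query("aÿb", "windows-1252")

print(utf8)    # a%C3%BFb
print(cp1251)  # a%26%23255%3Bb
print(cp1252)  # a%FFb

# Decoding the windows-1251 form yields the character reference,
# which html.unescape() turns back into the original character.
reference = unquote(cp1251, encoding="windows-1251")
print(reference)                 # a&#255;b
print(html.unescape(reference))  # aÿb
```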
Logging
urlstd uses the standard library logging module to report validation errors.
Change the log level of the urlstd logger if needed:
import logging
logging.getLogger('urlstd').setLevel(logging.ERROR)
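As a slightly fuller sketch, the snippet below sets the level on the `urlstd` logger and checks that a child logger inherits it. The child name `urlstd.parse` is an assumption (that the library names loggers after its modules); the inheritance itself is standard `logging` behavior.

```python
import logging

# Send log records to stderr with a simple format.
logging.basicConfig(format="%(name)s:%(levelname)s:%(message)s")

# Raise the threshold for everything logged under the 'urlstd' name.
logger = logging.getLogger("urlstd")
logger.setLevel(logging.ERROR)

# Assumed child logger name; loggers without their own level
# inherit the effective level from the nearest configured ancestor.
child = logging.getLogger("urlstd.parse")

print(logging.getLevelName(child.getEffectiveLevel()))
print(child.isEnabledFor(logging.WARNING))
```

With this configuration, messages below ERROR from `urlstd` and its child loggers are dropped, while other libraries keep their own levels.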
Dependencies
Installation
Running Tests
Install dependencies:
pipx install tox
# or
pip install --user tox
To run tests and generate a report:
git clone https://github.com/miute/urlstd.git
cd urlstd
tox -e wpt
See the results in tests/wpt/report.html
License
MIT License.