``urlparse4`` is a performance-focused replacement for Python's ``urlparse`` module, using C++ code from Chromium's own URL parser.

It is not production-ready yet.

Many credits go to `gurl-cython <https://github.com/Preetwinder/gurl-cython>`__ for inspiration.

``urlparse4`` should be a transparent, drop-in replacement in almost all cases. Still, there are a few differences to be aware of:

- ``urlparse4`` is 2-7x faster for most operations (see benchmarks below).
- ``urlparse4`` currently doesn't pass CPython's ``test_urlparse.py`` suite, due to edge cases that Chromium's parser handles differently (usually in accordance with the RFCs, which ``urlparse`` doesn't follow entirely).
- ``urlparse4`` only supports Python 2.7 for now.

You must have Docker installed and running. You can run CPython's test suite for ``urlparse`` like this:

::

    make docker_build
    make docker_test

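To illustrate the drop-in claim made earlier, here is a minimal sketch of swapping the import. It assumes that ``urlparse4`` exposes ``urlsplit`` and ``urljoin`` under the same names as the standard library, which this README implies but does not spell out. If ``urlparse4`` is not installed, the snippet falls back to the standard library so it still runs.

```python
# Assumption: urlparse4 mirrors the standard library's urlsplit/urljoin names.
# If it is not installed, fall back to the standard library module.
try:
    import urlparse4 as urlparse
except ImportError:
    try:
        import urlparse  # Python 2 standard library
    except ImportError:
        import urllib.parse as urlparse  # Python 3 standard library

url = "http://www.example.com/path/page.html?q=1#top"

# Basic split into components.
parts = urlparse.urlsplit(url)

# Relative link resolution against a sibling document.
sibling = urlparse.urljoin(url, "other.html")

print("scheme=%s netloc=%s path=%s" % (parts.scheme, parts.netloc, parts.path))
print("hostname=%s" % parts.hostname)
print("sibling=%s" % sibling)
```

With the standard library fallback, the hostname is ``www.example.com`` and the sibling resolves to ``http://www.example.com/path/other.html``. Results under ``urlparse4`` may differ on the edge cases mentioned above.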
We are testing the following libraries on a sample of 100k URLs from Blink and DMOZ:

- `CPython's urlparse <https://github.com/python/cpython/blob/2.7/Lib/urlparse.py>`__
- `urlparse2 <https://github.com/mwhooker/urlparse2>`__
- `YURL <http://github.com/homm/yurl/>`__
- `uritools <https://github.com/tkem/uritools>`__
- `pygurl / gurl-cython <https://github.com/Preetwinder/gurl-cython>`__
- `cyuri <https://github.com/mitghi/cyuri>`__

Each of them is tested on a few different types of operations (basic ``urlsplit``, relative link resolution, hostname extraction).

Here is how to launch the tests:
::

    make docker_build
    make docker_benchmark

Current results on a 2.2GHz Intel Core i7 MBP (in seconds):
::

    Benchmark results on 104300 URLs x 10 times, in seconds:

    Name              Sum            Mean               Median             90%
    ----------------  -------------  -----------------  -----------------  -----------------
    urlsplit:
    ----              ----           ----               ----               ----
    urlparse4         1.681858       1.61251965484e-06  1.99999999984e-06  2.00000000006e-06
    pygurl            2.031712       1.94795014382e-06  1.99999999984e-06  2.00000000028e-06
    uritools          2.638991       2.53019271333e-06  2.00000000028e-06  3.00000000042e-06
    yurl              3.910247       3.74903835091e-06  3.00000000131e-06  4.99999999981e-06
    urlparse2         3.756782       3.60190028763e-06  2.99999999953e-06  4.00000000056e-06
    urlparse          3.862006       3.70278619367e-06  3.00000000308e-06  4.99999999803e-06
    cyuri             9.912275       9.50361936721e-06  8.00000000112e-06  1.30000000027e-05
    urljoin_sibling:
    ----              ----           ----               ----               ----
    urlparse4         2.008453       1.92565004794e-06  2.00000000206e-06  2.00000000206e-06
    pygurl            2.193427       2.10299808245e-06  2.00000000206e-06  2.99999999953e-06
    uritools          10.575344      1.01393518696e-05  9.99999999607e-06  1.20000000052e-05
    yurl              13.213052      1.26683144775e-05  1.19999999981e-05  1.60000000022e-05
    urlparse2         14.239327      1.36522790029e-05  1.19999999981e-05  1.69999999997e-05
    urlparse          9.25991500001  8.87815436242e-06  8.00000000822e-06  1.10000000006e-05
    cyuri             5.742724       5.50596740172e-06  5.00000000159e-06  7.00000001075e-06
    hostname:
    ----              ----           ----               ----               ----
    urlparse4         1.883982       1.80631064237e-06  1.99999999495e-06  2.00000000916e-06
    pygurl            1.67332099999  1.60433461169e-06  1.99999999495e-06  2.00000000916e-06
    uritools          3.31632199999  3.17959923297e-06  3.00000000664e-06  4.00000000411e-06
    yurl              3.853319       3.69445733461e-06  3.00000000664e-06  4.00000000411e-06
    urlparse2         4.641513       4.45015627996e-06  4.00000000411e-06  5.99999999906e-06
    urlparse          5.122682       4.91148801534e-06  4.00000000411e-06  5.99999999906e-06
    cyuri             11.108649      1.06506701822e-05  9.0000000057e-06   1.5999999988e-05

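For readers who want to reproduce figures of this shape outside Docker, the following is a rough sketch of how per-call timings could be collected and summarized as Sum, Mean, Median and 90%. It is not the project's benchmark code: the helper names ``percentile``, ``summarize`` and ``benchmark`` are hypothetical, the percentile uses a simple index-based rule, and the standard library ``urlsplit`` stands in for the libraries under test.

```python
import time

try:
    from urlparse import urlsplit  # Python 2 standard library
except ImportError:
    from urllib.parse import urlsplit  # Python 3 standard library


def percentile(sorted_values, fraction):
    """Pick the value at index floor(fraction * (n - 1)) of a sorted list."""
    index = int(fraction * (len(sorted_values) - 1))
    return sorted_values[index]


def summarize(timings):
    """Return sum, mean, median and 90th percentile of a list of timings."""
    ordered = sorted(timings)
    total = float(sum(ordered))
    return {
        "sum": total,
        "mean": total / len(ordered),
        "median": percentile(ordered, 0.5),
        "90%": percentile(ordered, 0.9),
    }


def benchmark(func, urls, repeats=10):
    """Time func(url) for every URL, repeated `repeats` times."""
    timer = getattr(time, "perf_counter", time.time)  # time.time on Python 2
    timings = []
    for _ in range(repeats):
        for url in urls:
            start = timer()
            func(url)
            timings.append(timer() - start)
    return summarize(timings)


# Hypothetical sample; the real benchmark uses about 100k URLs.
sample_urls = [
    "http://www.example.com/",
    "https://example.org/a/b/c?x=1&y=2",
    "http://example.net:8080/index.html#section",
]

stats = benchmark(urlsplit, sample_urls, repeats=10)
print(stats)
```

The Mean column in the table above equals Sum divided by the total number of calls (104300 URLs x 10), which is the relationship ``summarize`` uses. Timing each call individually adds timer overhead, so absolute numbers from this sketch should only be compared with each other.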
Some libraries are included in the benchmark code but disabled for various reasons:

- `urlparse3 <https://pypi.python.org/pypi/urlparse3/>`__ (Raises on valid URLs)
- `slimurl <https://github.com/mosquito/slimurl>`__ (Too slow)

Feel free to submit pull requests to add new ones!

We'd love to hear your feedback! Feel free to look at the issues on GitHub and open new ones if needed :)