
.. image:: https://img.shields.io/pypi/v/python_hll.svg
   :target: https://pypi.python.org/pypi/python_hll

.. image:: https://readthedocs.org/projects/python-hll/badge/?version=latest
   :target: https://python-hll.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://img.shields.io/badge/github-python--hll-yellow
   :target: https://github.com/AdRoll/python-hll

A Python implementation of `HyperLogLog <http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf>`_
whose goal is to be `storage compatible <https://github.com/aggregateknowledge/hll-storage-spec>`_
with `java-hll <https://github.com/aggregateknowledge/java-hll>`_,
`js-hll <https://github.com/aggregateknowledge/js-hll>`_
and `postgresql-hll <https://github.com/citusdata/postgresql-hll>`_.

**NOTE:** This is a fairly literal translation/port of `java-hll <https://github.com/aggregateknowledge/java-hll>`_
to Python. Internally, bytes are represented as Java-style signed bytes (-128 to 127) rather than Python-style unsigned bytes (0 to 255).
This implementation is also quite slow: for example, the Java ``HLLSerializationTest``
takes 12 seconds to run,
while the Python ``test_hll_serialization``
takes 1.5 hours to run (about 400x slower).

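The difference between the two byte conventions can be illustrated with a small standalone sketch. The helper names ``to_signed_byte`` and ``to_unsigned_byte`` are hypothetical and are not part of ``python_hll``; they only show how the same 8 bits map to the two ranges.

```python
def to_signed_byte(value: int) -> int:
    """Map a Python-style byte (0..255) to a Java-style byte (-128..127)."""
    if not 0 <= value <= 255:
        raise ValueError("expected a value in 0..255")
    return value - 256 if value > 127 else value


def to_unsigned_byte(value: int) -> int:
    """Map a Java-style byte (-128..127) back to a Python-style byte (0..255)."""
    if not -128 <= value <= 127:
        raise ValueError("expected a value in -128..127")
    return value & 0xFF


# Three sample bytes, viewed first as Python bytes and then as Java-style bytes.
raw = bytes.fromhex("128D7F")
signed = [to_signed_byte(b) for b in raw]

# Converting back recovers the original bytes unchanged.
round_trip = bytes(to_unsigned_byte(b) for b in signed)
```

Only values above 127 change under the conversion, so a byte such as ``0x8D`` (141) appears as -115 in the Java-style view.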
See `java-hll <https://github.com/aggregateknowledge/java-hll>`_ for an overview of what HLLs are and how they work.

Hashing and adding a value to a new HLL::

    from python_hll.hll import HLL
    import mmh3

    value_to_hash = 'foo'
    hashed_value = mmh3.hash(value_to_hash)

    hll = HLL(13, 5)  # log2m=13, regwidth=5
    hll.add_raw(hashed_value)

Retrieving the cardinality of an HLL::

    cardinality = hll.cardinality()

Unioning two HLLs together (and retrieving the resulting cardinality)::

    hll1 = HLL(13, 5)  # log2m=13, regwidth=5
    hll2 = HLL(13, 5)  # log2m=13, regwidth=5

    # ... (add values to both sets) ...

    hll1.union(hll2)  # modifies hll1 to contain the union
    cardinality_union = hll1.cardinality()

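Conceptually, the union of two HyperLogLog sketches with the same parameters is the register-wise maximum of their registers. The sketch below illustrates that idea on plain lists. It is a simplified model, not how ``python_hll`` stores its data (the storage specification also defines non-register representations), and ``union_registers`` is a hypothetical helper name.

```python
from typing import List


def union_registers(a: List[int], b: List[int]) -> List[int]:
    """Return the register-wise maximum of two equally sized register arrays."""
    if len(a) != len(b):
        raise ValueError("both sketches must use the same number of registers")
    return [max(x, y) for x, y in zip(a, b)]


# Toy example with 4 registers; an HLL with log2m=13 would have 2**13 of them.
regs1 = [0, 3, 1, 0]
regs2 = [2, 1, 1, 4]
merged = union_registers(regs1, regs2)
```

Because taking the maximum is commutative and idempotent, unioning sketches in any order, or unioning a sketch with itself, yields the same registers.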
Reading an HLL from its hex representation in the format of
`storage specification, v1.0.0 <https://github.com/aggregateknowledge/hll-storage-spec/blob/v1.0.0/STORAGE.md>`_
(for example, retrieved from a `PostgreSQL database <https://github.com/aggregateknowledge/postgresql-hll>`_)::

    from python_hll.hll import HLL
    from python_hll.util import NumberUtil

    hex_input = '\\x128D7FFFFFFFFFF6A5C420'
    hex_string = hex_input[2:]  # strip the leading "\x" prefix
    hll = HLL.from_bytes(NumberUtil.from_hex(hex_string, 0, len(hex_string)))

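To see what such a hex string carries, the header can be inspected with the standard library alone. The sketch below assumes the header layout as read from the v1.0.0 storage specification: the first byte holds the schema version (high nibble) and type ordinal (low nibble), and the second byte holds ``regwidth - 1`` (high 3 bits) and ``log2m`` (low 5 bits). Treat that layout as an assumption to verify against ``STORAGE.md``; ``decode_header`` is a hypothetical helper, not part of ``python_hll``.

```python
from typing import Dict


def decode_header(pg_hex: str) -> Dict[str, int]:
    """Decode the version and parameter bytes of a hex-encoded HLL.

    Assumed layout (verify against the storage spec):
      byte 0: schema version (high nibble), type ordinal (low nibble)
      byte 1: regwidth - 1 (high 3 bits), log2m (low 5 bits)
    """
    # Strip the leading backslash-x prefix if it is present.
    hex_string = pg_hex[2:] if pg_hex.startswith("\\x") else pg_hex
    if len(hex_string) < 6:
        raise ValueError("expected at least 3 header bytes")
    # Only the first three bytes (six hex digits) form the header.
    header = bytes.fromhex(hex_string[:6])
    version_byte, parameter_byte = header[0], header[1]
    return {
        "schema_version": version_byte >> 4,
        "type_ordinal": version_byte & 0x0F,
        "regwidth": (parameter_byte >> 5) + 1,
        "log2m": parameter_byte & 0x1F,
    }


header = decode_header("\\x128D7FFFFFFFFFF6A5C420")
```

Under that assumed layout, the example string decodes to ``log2m=13`` and ``regwidth=5``, which matches the ``HLL(13, 5)`` parameters used in the earlier examples.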
Writing an HLL to its hex representation in the format of
`storage specification, v1.0.0 <https://github.com/aggregateknowledge/hll-storage-spec/blob/v1.0.0/STORAGE.md>`_
(for example, to be inserted into a `PostgreSQL database <https://github.com/aggregateknowledge/postgresql-hll>`_)::

    from python_hll.util import NumberUtil

    hll_bytes = hll.to_bytes()
    output = "\\x" + NumberUtil.to_hex(hll_bytes, 0, len(hll_bytes))

Also see the `API documentation <https://python-hll.readthedocs.io/en/latest/docs/python_hll.html>`_.

See `Contributing <https://python-hll.readthedocs.io/en/latest/contributing.html>`_ for how to get started building, testing, and deploying the code.