
Python integer subclass to handle arithmetic and formatting of integers with data size units
Provides parsing, arithmetic and comparison operations, and formatting of human-readable data size strings, for logic that depends on comparing values given in common units of data allocation. Other solutions exist, but they are either incomplete or too heavy or awkward for casual use. A string like "14GiB" is really an integer representing a data allocation.
The basic use case is to be able to parse a string containing a common expression of data size with a numeric value and a unit of data. The resulting object is actually an integer count of bytes, so that it can be used in any arithmetic expression. That integer can be expressed, using Python 3 string formatting, as any other unit of data. This allows, for example, configuration files that support a natural way of expressing and operating on quantities of data.
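To make the idea of "a string that is really an integer" concrete, here is a minimal, self-contained sketch of an int subclass that parses a size string into a byte count. It is not the DataSize implementation: the class name `ByteCount`, the unit table, and the regular expression are all assumptions made for illustration, and only a handful of units are covered.

```python
import re
from fractions import Fraction

# Hypothetical unit table for this sketch: multipliers expressed in bytes.
_UNITS = {
    "B": 1,
    "kB": 1000, "MB": 1000**2, "GB": 1000**3,
    "kiB": 1024, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3,
}

# A number (optionally with a decimal part) followed by an optional unit.
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*$")


class ByteCount(int):
    """Illustrative int subclass; not the datasize implementation."""

    def __new__(cls, value):
        if isinstance(value, str):
            match = _PATTERN.match(value)
            if match is None:
                raise ValueError(f"cannot parse data size: {value!r}")
            number, unit = match.groups()
            unit = unit or "B"  # a bare number is taken as bytes
            if unit not in _UNITS:
                raise ValueError(f"unknown unit: {unit!r}")
            # Fraction keeps the multiplication exact before truncating.
            value = int(Fraction(number) * _UNITS[unit])
        return super().__new__(cls, value)


limit = ByteCount("14GiB")
used = ByteCount("750MB")

# Because both values are integers, ordinary comparison and arithmetic work.
has_room = used < limit
remaining = limit - used  # a plain int count of bytes in this sketch
```

Subclassing `int` is what lets the parsed value drop into any arithmetic expression without conversion. In this sketch the result of arithmetic is a plain `int`; a fuller class would have to decide whether to re-wrap results.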
DataSize supports metric and IEC units in both bits and bytes, plus nonstandard abbreviated IEC units (for legacy Java -Xmx). Variable word lengths are supported, but converting between two different word lengths is not, because I thought it would get confusing. The word length constructor keyword argument allows converting counts of unusual (non-byte) word or symbol bit lengths to bit rates, which can then be explicitly converted to standard 8-bit bytes.
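The arithmetic behind the word-length conversion can be sketched in plain Python without the library. The function names below are hypothetical and do not mirror the DataSize constructor; rounding a partial byte up is also an assumption of this sketch, not documented behavior.

```python
def words_to_bits(word_count, word_length):
    """Convert a count of fixed-width words or symbols to a bit count."""
    return word_count * word_length


def bits_to_bytes(bit_count):
    """Convert bits to standard 8-bit bytes, rounding a partial byte up."""
    return -(-bit_count // 8)  # ceiling division on integers


# 1000 six-bit symbols occupy 6000 bits, which fit in 750 eight-bit bytes.
bits = words_to_bits(1000, 6)
size_in_bytes = bits_to_bytes(bits)
```

Going through bits as the common denominator is the explicit two-step conversion described above: word counts never convert directly to another word length.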
The really sweet feature that everyone should love is the Python str.format() support!
Help on method __format__ in module datasize.DataSize:

__format__(self, code) unbound datasize.__datasize__.DataSize method
    formats as a decimal number, but recognizes data units as type format codes.
    Precision is ignored for integer multiples of the unit specified in the format code.

    format codes:
    a      autoformat will choose a unit defaulting to the largest
           size with a quantity >= 1 (default)
    A      abbreviated number of bytes (implied IEC units, and implied 'B' bytes suffix omitted)
    B      bytes (1)
    kiB    kibibytes (1024)
    kB     kilobytes (1000)
    ...
    GiB    Gibibytes (1024**3)
    GB     Gigabytes (10**9)
    ...
    YiB    Yobibytes (1024**8)
    YB     Yottabytes (10**24)

>>> from datasize import DataSize
>>> 'My new {:GB} SSD really only stores {:.2GiB} of data.'.format(DataSize('750GB'),DataSize(DataSize('750GB') * 0.8))
'My new 750GB SSD really only stores 558.79GiB of data.'
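For readers curious how unit names can act as format codes, the following is a rough sketch of the general technique: override `__format__` on an int subclass, peel a unit suffix off the format spec, and hand the remainder to the normal number formatter. The class `FormattedBytes` and its small unit table are hypothetical and are not taken from the DataSize source; autoformat (`a`) and abbreviated (`A`) codes are left out.

```python
class FormattedBytes(int):
    """Illustrative only: an int that accepts unit names as format codes."""

    # Hypothetical subset of the unit codes listed in the help text.
    _UNITS = {
        "B": 1,
        "kB": 1000, "MB": 1000**2, "GB": 10**9,
        "kiB": 1024, "MiB": 1024**2, "GiB": 1024**3,
    }

    def __format__(self, code):
        # Try longer unit names first so "GiB" or "kB" win over a bare "B".
        for unit in sorted(self._UNITS, key=len, reverse=True):
            if code.endswith(unit):
                spec = code[: -len(unit)]  # what is left, e.g. ".2"
                break
        else:
            # No unit suffix: fall back to ordinary integer formatting.
            return format(int(self), code)

        size = self._UNITS[unit]
        if int(self) % size == 0:
            # Integer multiple of the unit: precision is ignored.
            return f"{int(self) // size}{unit}"
        # Otherwise format the quotient as a fixed-point number.
        return format(int(self) / size, spec + "f") + unit


advertised = FormattedBytes(750 * 10**9)
usable = FormattedBytes(600 * 10**9)  # 80% of 750 GB
message = "{:GB} advertised, {:.2GiB} usable".format(advertised, usable)
```

Under these assumptions, `message` comes out as `'750GB advertised, 558.79GiB usable'`, matching the numbers in the example above. Specs without a unit suffix, such as `x`, still format the value as a plain integer.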
FAQs
We found that datasize demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.