
Facilities associated with binary data parsing and transcription.
The classes in this module support easy parsing of binary data structures, returning instances with the binary data decoded into attributes and capable of transcribing themselves in binary form (trivially via `bytes(instance)` and also otherwise).
Latest release 20250501:
See `cs.iso14496` for an ISO 14496 (e.g. MPEG4) parser built using this module.

Note: this module requires Python 3.6+ because various default behaviours rely on `dict`s preserving their insertion order.
Terminology used below:
- buffer: a `cs.buffer.CornuCopyBuffer`, which manages an iterable of bytes-like values and has various useful methods for parsing.
- `collections.abc.Buffer`: almost always a `bytes` instance or a `memoryview`, but in principle also things like `bytearray`.

The `CornuCopyBuffer` is the basis for all parsing, as it manages a variety of input sources such as files, memory, sockets etc. It also has factory methods to make one from a variety of sources such as `bytes`, iterables, binary files, `mmap`ped files, TCP data streams, etc.
All the binary classes subclass `AbstractBinary`. Amongst other things, this means that the binary transcription can be had simply from `bytes(instance)`, although there are more transcription methods provided for when greater flexibility is desired. It also means that all classes have `parse`* and `scan`* methods for parsing binary data streams.

The `.parse(cls,bfr)` class method reads binary data from a buffer and returns an instance.

The `.transcribe(self)` method may be a regular function or a generator which returns or yields things which can be transcribed as bytes via the `flatten` function. See the `AbstractBinary.transcribe` docstring for specifics; this might return or yield:
- `bytes`
- `AbstractBinary` instances such as each field (which get transcribed in turn), or an iterable of these things

There are 6 main ways an implementor might base their data structures:
- `BinaryStruct`: a factory for classes based on a `struct.Struct` format string with multiple values; this also builds a `namedtuple` subclass
- `@binclass`: a dataclass-like specification of a binary structure
- `BinarySingleValue`: a base class for subclasses parsing and transcribing a single value, such as `UInt8` or `BinaryUTF8NUL`
- `BinaryMultiValue`: a factory for subclasses parsing and transcribing multiple values with no variation
- `SimpleBinary`: a base class for subclasses with custom `.parse` and `.transcribe` methods, for structures with variable fields; this makes a `SimpleNamespace` subclass

These can all be mixed as appropriate to your needs.

You can also instantiate objects directly; there's no requirement for the source information to be binary.

There are several presupplied subclasses for common basic types such as `UInt32BE` (an unsigned 32 bit big endian integer).
`BinaryStruct`, from `cs.iso14496`

A simple `struct` style definition for 9 longs:

    Matrix9Long = BinaryStruct(
        'Matrix9Long', '>lllllllll', 'v0 v1 v2 v3 v4 v5 v6 v7 v8'
    )

Per the `struct.Struct` format string, this parses 9 big endian longs and returns a `namedtuple` with 9 fields. Like all the `AbstractBinary` subclasses, parsing an instance from a stream can be done like this:

    m9 = Matrix9Long.parse(bfr)
    print("m9.v3", m9.v3)

and writing its binary form to a file like this:

    f.write(bytes(m9))
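For readers unfamiliar with the format string, here is a minimal standard library sketch of what `'>lllllllll'` describes. It does not use this module at all; `Matrix9`, `parse_matrix9` and `transcribe_matrix9` are hypothetical names chosen for illustration, not part of the package.

```python
import struct
from collections import namedtuple

# '>' selects big endian, each 'l' is a signed 32 bit integer: 9 * 4 = 36 bytes.
MATRIX9_STRUCT = struct.Struct('>lllllllll')

# Hypothetical stand-in for the namedtuple which the factory would generate.
Matrix9 = namedtuple('Matrix9', 'v0 v1 v2 v3 v4 v5 v6 v7 v8')

def parse_matrix9(data: bytes) -> Matrix9:
    """Decode exactly 36 bytes into a 9 field namedtuple."""
    return Matrix9(*MATRIX9_STRUCT.unpack(data))

def transcribe_matrix9(m: Matrix9) -> bytes:
    """Encode the 9 fields back into their binary form."""
    return MATRIX9_STRUCT.pack(*m)

# Round trip an identity-like matrix.
raw = MATRIX9_STRUCT.pack(1, 0, 0, 0, 1, 0, 0, 0, 1)
m9 = parse_matrix9(raw)
print("m9.v4", m9.v4)
```

The generated classes add buffer based parsing and the other `AbstractBinary` conveniences on top of this basic pack/unpack idea.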
`@binclass`, also from `cs.iso14496`

For reasons to do with the larger MP4 parser this uses an extra decorator `@boxbodyclass`, which is just a shim for the `@binclass` decorator with an additional step.

    @boxbodyclass
    class FullBoxBody2(BoxBody):
        """ A common extension of a basic `BoxBody`, with a version and flags field.
            ISO14496 section 4.2.
        """
        version: UInt8
        flags0: UInt8
        flags1: UInt8
        flags2: UInt8

        @property
        def flags(self):
            """ The flags value, computed from the 3 flag bytes.
            """
            return (self.flags0 << 16) | (self.flags1 << 8) | self.flags2

This has 4 fields, each an unsigned 8 bit value (one byte), and a property `.flags` which is the overall flags value for the box header.

You should look at the source code for the `TKHDBoxBody` from that module for an example of a `@binclass` with a variable collection of fields based on an earlier `version` field value.
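The `.flags` computation is plain big endian assembly of a 24 bit value from 3 bytes. A small standalone sketch, independent of this module, with the hypothetical helper names `flags_from_bytes` and `flags_to_bytes`:

```python
def flags_from_bytes(flags0: int, flags1: int, flags2: int) -> int:
    """Combine 3 unsigned bytes into one 24 bit big endian value."""
    return (flags0 << 16) | (flags1 << 8) | flags2

def flags_to_bytes(flags: int) -> tuple:
    """Split a 24 bit value back into its 3 bytes, most significant first."""
    return (flags >> 16) & 0xff, (flags >> 8) & 0xff, flags & 0xff

flags = flags_from_bytes(0x01, 0x02, 0x03)
print(hex(flags))  # 0x10203
```

This agrees with `int.from_bytes(bytes((flags0, flags1, flags2)), 'big')` from the standard library.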
`BinarySingleValue`, the `BSUInt` from this module

The `BSUInt` transcribes an unsigned integer of arbitrary size as a big endian variable sized sequence of bytes. I understand this is the same scheme MIDI uses.

You can define a `BinarySingleValue` with conventional `.parse()` and `.transcribe()` methods but it is usually expedient to instead provide `.parse_value()` and `.transcribe_value()` methods, which return or transcribe the core value (the unsigned integer in this case).

    class BSUInt(BinarySingleValue, value_type=int):
        """ A binary serialised unsigned `int`.

            This uses a big endian byte encoding where continuation octets
            have their high bit set. The bits contributing to the value
            are in the low order 7 bits.
        """

        @staticmethod
        def parse_value(bfr: CornuCopyBuffer) -> int:
            """ Parse an extensible byte serialised unsigned `int` from a buffer.

                Continuation octets have their high bit set.
                The value is big-endian.

                This is the go for reading from a stream. If you already have
                a bare bytes instance then the `.decode_bytes` static method
                is probably most efficient;
                there is of course the usual `AbstractBinary.parse_bytes`
                but that constructs a buffer to obtain the individual bytes.
            """
            n = 0
            b = 0x80
            while b & 0x80:
                b = bfr.byte0()
                n = (n << 7) | (b & 0x7f)
            return n

        # pylint: disable=arguments-renamed
        @staticmethod
        def transcribe_value(n):
            """ Encode an unsigned int as an extensible byte serialised octet
                sequence for decode. Return the bytes object.
            """
            bs = [n & 0x7f]
            n >>= 7
            while n > 0:
                bs.append(0x80 | (n & 0x7f))
                n >>= 7
            return bytes(reversed(bs))
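To see the encoding at work without a buffer, here is a standalone sketch of the same scheme operating on plain `bytes`. The function names `encode_bsuint` and `decode_bsuint` are hypothetical and exist only for this illustration; the logic mirrors the two methods shown above.

```python
def encode_bsuint(n: int) -> bytes:
    """Encode a nonnegative int, 7 bits per octet, most significant first.
    Every octet except the last has its high bit set."""
    if n < 0:
        raise ValueError("n must be nonnegative")
    octets = [n & 0x7f]          # final octet: high bit clear
    n >>= 7
    while n > 0:
        octets.append(0x80 | (n & 0x7f))  # continuation octet
        n >>= 7
    return bytes(reversed(octets))

def decode_bsuint(data: bytes, offset: int = 0):
    """Decode one value from data at offset; return (value, new_offset)."""
    n = 0
    while True:
        b = data[offset]
        offset += 1
        n = (n << 7) | (b & 0x7f)
        if not b & 0x80:
            return n, offset

for value in (0, 127, 128, 300):
    encoded = encode_bsuint(value)
    print(value, encoded.hex(), decode_bsuint(encoded))
```

Values below 128 occupy a single octet; 128 becomes the two octets `81 00` and 300 becomes `82 2c`.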
`BinaryMultiValue`

A `BinaryMultiValue` is a class factory for making a multi field `AbstractBinary` from variable field descriptions. You're probably better off using `@binclass` these days. See the `BinaryMultiValue` docstring for details and an example.
An MP4 ELST box:

    class ELSTBoxBody(FullBoxBody):
        """ An 'elst' Edit List FullBoxBody - section 8.6.6.
        """

        V0EditEntry = BinaryStruct(
            'ELSTBoxBody_V0EditEntry', '>Llhh',
            'segment_duration media_time media_rate_integer media_rate_fraction'
        )
        V1EditEntry = BinaryStruct(
            'ELSTBoxBody_V1EditEntry', '>Qqhh',
            'segment_duration media_time media_rate_integer media_rate_fraction'
        )

        @property
        def entry_class(self):
            """ The class representing each entry.
            """
            return self.V1EditEntry if self.version == 1 else self.V0EditEntry

        @property
        def entry_count(self):
            """ The number of entries.
            """
            return len(self.entries)

        def parse_fields(self, bfr: CornuCopyBuffer):
            """ Parse the fields of an `ELSTBoxBody`.
            """
            super().parse_fields(bfr)
            assert self.version in (0, 1)
            entry_count = UInt32BE.parse_value(bfr)
            self.entries = list(self.entry_class.scan(bfr, count=entry_count))

        def transcribe(self):
            """ Transcribe an `ELSTBoxBody`.
            """
            yield super().transcribe()
            yield UInt32BE.transcribe_value(self.entry_count)
            yield map(self.entry_class.transcribe, self.entries)

An Edit List box comes in a version 0 and version 1 form, differing in the field sizes in the edit entries. This defines two flavours of edit entry structure and a property to return the suitable class based on the version field. The `parse_fields()` method is called from the base `BoxBody` class' `parse()` method to collect additional fields for any box. For this box it collects a 32 bit `entry_count` and then a list of that many edit entries. The transcription yields corresponding values.
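The version dependent layout can be sketched with only the standard library. This is an illustration of the idea, not the `cs.iso14496` implementation: `EditEntry`, `parse_edit_entries` and `transcribe_edit_entries` are hypothetical names, and only the two format strings are taken from the class above.

```python
import struct
from collections import namedtuple

EditEntry = namedtuple(
    'EditEntry',
    'segment_duration media_time media_rate_integer media_rate_fraction',
)

# Version 0 uses 32 bit duration/time fields, version 1 uses 64 bit fields.
ENTRY_STRUCTS = {0: struct.Struct('>Llhh'), 1: struct.Struct('>Qqhh')}
COUNT_STRUCT = struct.Struct('>L')  # the 32 bit entry_count

def parse_edit_entries(version: int, data: bytes, offset: int = 0):
    """Read the entry count then that many entries; return (entries, offset)."""
    entry_struct = ENTRY_STRUCTS[version]
    (entry_count,) = COUNT_STRUCT.unpack_from(data, offset)
    offset += COUNT_STRUCT.size
    entries = []
    for _ in range(entry_count):
        entries.append(EditEntry(*entry_struct.unpack_from(data, offset)))
        offset += entry_struct.size
    return entries, offset

def transcribe_edit_entries(version: int, entries) -> bytes:
    """Write the entry count followed by each entry."""
    entry_struct = ENTRY_STRUCTS[version]
    return COUNT_STRUCT.pack(len(entries)) + b''.join(
        entry_struct.pack(*entry) for entry in entries
    )

entries = [EditEntry(1000, -1, 1, 0), EditEntry(2000, 0, 1, 0)]
for version in (0, 1):
    data = transcribe_edit_entries(version, entries)
    print(version, len(data), parse_edit_entries(version, data)[0] == entries)
```

The same entries occupy 28 bytes in version 0 form and 44 bytes in version 1 form, which is the whole difference between the two flavours.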
Short summary:
- `AbstractBinary`: Abstract class for all `Binary`* implementations, specifying the abstract `parse` and `transcribe` methods and providing various helper methods.
- `BinaryBytes`: A list of `bytes` parsed directly from the native iteration of the buffer. Subclasses are initialised with a `consume=` class parameter indicating how many bytes to consume on parse; the default is `...` meaning to consume the entire remaining buffer, but a positive integer can also be supplied to consume exactly that many bytes.
- `BinaryFixedBytes`: Factory for an `AbstractBinary` subclass matching `length` bytes of data. The bytes are saved as the attribute `.data`.
- `BinaryListValues`: A list of values with a common parse specification, such as sample or Boxes in an ISO14496 Box structure.
- `BinaryMultiStruct`: A class factory for `AbstractBinary` `namedtuple` subclasses built around potentially complex `struct` formats.
- `BinaryMultiValue`: Construct a `SimpleBinary` subclass named `class_name` whose fields are specified by the mapping `field_map`.
- `BinarySingleStruct`: OBSOLETE BinaryStruct.
- `BinarySingleValue`: A representation of a single value as the attribute `.value`.
- `BinaryStruct`: OBSOLETE BinaryStruct.
- `BinaryUTF16NUL`: A NUL terminated UTF-16 string.
- `BinaryUTF8NUL`: A NUL terminated UTF-8 string.
- `binclass`: A decorator for `dataclass`-like binary classes.
- `bs`: A `bytes` subclass with a compact `repr()`.
- `BSData`: A run length encoded data chunk, with the length encoded as a `BSUInt`.
- `BSSFloat`: A float transcribed as a `BSString` of `str(float)`.
- `BSString`: A run length encoded string, with the length encoded as a `BSUInt`.
- `BSUInt`: A binary serialised unsigned `int`.
- `flatten`: Flatten `transcription` into an iterable of `Buffer`s. None of the `Buffer`s will be empty.
- `Float64BE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'>d'` and presents the attributes `['value']`.
- `Float64LE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'<d'` and presents the attributes `['value']`.
- `Int16BE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'>h'` and presents the attributes `['value']`.
- `Int16LE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'<h'` and presents the attributes `['value']`.
- `Int32BE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'>l'` and presents the attributes `['value']`.
- `Int32LE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'<l'` and presents the attributes `['value']`.
- `is_single_value`: Test whether `obj` is a single value binary object.
- `parse_offsets`: Decorate `parse` (usually an `AbstractBinary` class method) to record the buffer starting offset as `self.offset` and the buffer post parse offset as `self.end_offset`. If the decorator parameter `report` is true, call `bfr.report_offset()` with the starting offset at the end of the parse.
- `pt_spec`: Convert a parse/transcribe specification `pt` into an `AbstractBinary` subclass.
- `SimpleBinary`: Abstract binary class based on a `SimpleNamespace`, thus providing a nice `__str__` and a keyword based `__init__`. Implementors must still define `.parse` and `.transcribe`.
- `struct_field_types`: Construct a `dict` mapping field names to struct return types.
- `UInt16BE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'>H'` and presents the attributes `['value']`.
- `UInt16LE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'<H'` and presents the attributes `['value']`.
- `UInt32BE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'>L'` and presents the attributes `['value']`.
- `UInt32LE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'<L'` and presents the attributes `['value']`.
- `UInt64BE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'>Q'` and presents the attributes `['value']`.
- `UInt64LE`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'<Q'` and presents the attributes `['value']`.
- `UInt8`: An `AbstractBinary` `namedtuple` which parses and transcribes the struct format `'B'` and presents the attributes `['value']`.

Module contents:
Class `AbstractBinary(cs.deco.Promotable)`: Abstract class for all `Binary`* implementations, specifying the abstract `parse` and `transcribe` methods and providing various helper methods.

Naming conventions:
- `parse`* methods parse a single instance from a buffer
- `scan`* methods are generators yielding successive instances from a buffer

`AbstractBinary.__bytes__(self)`: The binary transcription as a single `bytes` object.

`AbstractBinary.__len__(self)`: Compute the length by running a transcription and measuring it.

`AbstractBinary.__str__(self, attr_names=None, attr_choose=None, str_func=None)`: The string summary of this object. If called explicitly rather than via `str()` the following optional parameters may be supplied:
- `attr_names`: an iterable of `str` naming the attributes to include; the default is the keys of `self.__dict__`
- `attr_choose`: a callable to select amongst the attribute names; the default is to choose names which do not start with an underscore
- `str_func`: a callable returning the string form of an attribute value; the default returns `cropped_repr(v)` where `v` is the value's `.value` attribute for single value objects, otherwise the object itself

`AbstractBinary.from_bytes(bs, **parse_bytes_kw)`: Factory to parse an instance from the bytes `bs` starting at `offset`. Returns the new instance. Raises `ValueError` if `bs` is not entirely consumed. Raises `EOFError` if `bs` has insufficient data. Keyword parameters are passed to the `.parse_bytes` method. This relies on the `cls.parse` method for the parse.
`AbstractBinary.load(f)`: Load an instance from the file `f` which may be a filename or an open file as for `AbstractBinary.scan`. Return the instance or `None` if the file is empty.

`AbstractBinary.parse(bfr: cs.buffer.CornuCopyBuffer)`: Parse an instance of `cls` from the buffer `bfr`.

`AbstractBinary.parse_bytes(bs, offset=0, length=None, **parse_kw)`: Factory to parse an instance from the bytes `bs` starting at `offset`. Returns `(instance,offset)` being the new instance and the post offset. Raises `EOFError` if `bs` has insufficient data. The parameters `offset` and `length` are passed to the `CornuCopyBuffer.from_bytes` factory. Other keyword parameters are passed to the `.parse` method. This relies on the `cls.parse` method for the parse.

`AbstractBinary.save(self, f)`: Save this instance to the file `f` which may be a filename or an open file. Return the length of the transcription.
`AbstractBinary.scan(bfr: cs.buffer.CornuCopyBuffer, count=None, *, min_count=None, max_count=None, with_offsets=False, **parse_kw)`: A generator to scan the buffer `bfr` for repeated instances of `cls` until end of input, and yield them.

Note that if `bfr` is not already a `CornuCopyBuffer` it is promoted to `CornuCopyBuffer` from several types such as filenames etc; see `CornuCopyBuffer.promote`.

Parameters:
- `bfr`: the buffer to scan, or any object suitable for `CornuCopyBuffer.promote`
- `count`: the required number of instances to scan, equivalent to setting `min_count=count` and `max_count=count`
- `min_count`: the minimum number of instances to scan
- `max_count`: the maximum number of instances to scan
- `with_offsets`: optional flag, default `False`; if true yield `(pre_offset,obj,post_offset)`, otherwise just `obj`

It is an error to specify both `count` and one of `min_count` or `max_count`.

Other keyword arguments are passed to `self.parse()`.

Scanning stops after `max_count` instances (if specified). If fewer than `min_count` instances (if specified) are scanned a warning is issued. This is to accommodate nonconformant streams without raising exceptions. Callers wanting to validate `max_count` may want to probe `bfr.at_eof()` after return. Callers not wanting a warning over `min_count` should not specify it, and instead check the number of instances returned themselves.
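The count semantics described above can be sketched as a small generator over a binary file. This is an assumption-laden illustration rather than the module's code: `scan_uint16be` is a hypothetical function reading fixed size records from a file object instead of a `CornuCopyBuffer`, and it silently ignores a short trailing record.

```python
import io
import struct
import warnings

def scan_uint16be(f, count=None, *, min_count=None, max_count=None):
    """Yield big endian unsigned 16 bit values from the binary file f."""
    if count is not None:
        if min_count is not None or max_count is not None:
            raise ValueError(
                "count may not be combined with min_count or max_count"
            )
        min_count = max_count = count
    scanned = 0
    # Stop after max_count instances, or at end of input.
    while max_count is None or scanned < max_count:
        chunk = f.read(2)
        if len(chunk) < 2:
            break
        yield struct.unpack('>H', chunk)[0]
        scanned += 1
    # Too few instances is reported with a warning, not an exception.
    if min_count is not None and scanned < min_count:
        warnings.warn(
            f"scanned {scanned} instances, fewer than min_count={min_count}"
        )

data = b'\x00\x01\x00\x02\x00\x03'
print(list(scan_uint16be(io.BytesIO(data))))
print(list(scan_uint16be(io.BytesIO(data), max_count=2)))
```

Because the shortfall is only a warning, a caller who needs strictness counts the yielded instances, as the text above suggests.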
`AbstractBinary.scan_fspath(fspath: str, *, with_offsets=False, **kw)`: Open the file with filesystem path `fspath` for read and yield from `self.scan(..,**kw)` or `self.scan_with_offsets(..,**kw)` according to the `with_offsets` parameter.

Deprecated; please just call `scan` with a filesystem pathname.

Parameters:
- `fspath`: the filesystem path of the file to scan
- `with_offsets`: optional flag, default `False`; if true then scan with `scan_with_offsets` instead of with `scan`

Other keyword parameters are passed to `scan` or `scan_with_offsets`.

`AbstractBinary.scan_with_offsets(bfr: cs.buffer.CornuCopyBuffer, count=None, min_count=None, max_count=None)`: Wrapper for `scan()` which yields `(pre_offset,instance,post_offset)` indicating the start and end offsets of the yielded instances. All parameters are as for `scan()`.

Deprecated; please just call `scan` with the `with_offsets=True` parameter.
`AbstractBinary.self_check(self, *, field_types=None)`: Internal self check. Returns `True` if passed.

If the structure has a `FIELD_TYPES` attribute, normally a class attribute, then check the fields against it. The `FIELD_TYPES` attribute is a mapping of `field_name` to a specification of `required` and `types`. The specification may take one of 2 forms:
- a tuple of `(required,types)`
- a single `type`; this is equivalent to `(True,(type,))`

Their meanings are as follows:
- `required`: a Boolean. If true, the field must be present in the packet `field_map`, otherwise it need not be present.
- `types`: a tuple of acceptable field types

There are some special semantics involved here.

An implementation of a structure may choose to make some fields plain instance attributes instead of binary objects in the `field_map` mapping, particularly variable structures such as a `cs.iso14496.BoxHeader`, whose `.length` may be parsed directly from its binary form or computed from other fields depending on the `box_size` value. Therefore, checking for a field is first done via the `field_map` mapping, then by `getattr`, and as such the acceptable `types` may include nonstructure types such as `int`.

Here is the `cs.iso14496` `Box.FIELD_TYPES` definition as an example:

    FIELD_TYPES = {
        'header': BoxHeader,
        'body': BoxBody,
        'unparsed': list,
        'offset': int,
        'unparsed_offset': int,
        'end_offset': int,
    }

Note that `length` includes some nonstructure types, and that it is written as a tuple of `(True,types)` because it has more than one acceptable type.
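The lookup order and the two specification forms can be sketched as follows. This is not the module's `self_check` implementation; `check_field_types` is a hypothetical function and the `FIELD_TYPES` mapping below is invented for the illustration.

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel distinguishing "absent" from a stored None

def check_field_types(obj, field_types) -> bool:
    """Check obj's fields against a FIELD_TYPES style mapping."""
    field_map = getattr(obj, 'field_map', {})
    for field_name, spec in field_types.items():
        if isinstance(spec, tuple):
            required, types = spec          # the (required, types) form
        else:
            required, types = True, (spec,)  # a bare type means required
        # Look in field_map first, then fall back to a plain attribute.
        value = field_map.get(field_name, _MISSING)
        if value is _MISSING:
            value = getattr(obj, field_name, _MISSING)
        if value is _MISSING:
            if required:
                return False
            continue
        if not isinstance(value, types):
            return False
    return True

# Hypothetical specification using both forms.
FIELD_TYPES = {
    'offset': int,
    'length': (True, (type(None), int)),
    'comment': (False, (str,)),
}
box = SimpleNamespace(offset=0, length=None)
print(check_field_types(box, FIELD_TYPES))
```

Here `length` needs the tuple form because it accepts more than one type, and `comment` may be absent because it is not required.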
`AbstractBinary.transcribe(self)`: Return or yield `bytes`, ASCII string, `None` or iterables comprising the binary form of this instance.

This aims for maximum convenience when transcribing a data structure.

This may be implemented as a generator, yielding parts of the structure.

This may be implemented as a normal function, returning:
- `None`: no bytes of data, for example for an omitted or empty structure
- a `bytes`-like object: the full data bytes for the structure
- an ASCII compatible string: this is encoded with the `'ascii'` encoding to make `bytes`
- an iterable of `None`, `bytes`-like objects, ASCII compatible strings or iterables

This supports directly returning or yielding the result of a field's `.transcribe` method.

`AbstractBinary.transcribe_flat(self)`: Return a flat iterable of chunks transcribing this field.

`AbstractBinary.transcribed_length(self)`: Compute the length by running a transcription and measuring it.

`AbstractBinary.write(self, file, *, flush=False)`: Write this instance to `file`, a file-like object supporting `.write(bytes)` and `.flush()`. Return the number of bytes written.
Class `BinaryBytes(BinarySingleValue)`: A list of `bytes` parsed directly from the native iteration of the buffer. Subclasses are initialised with a `consume=` class parameter indicating how many bytes to consume on parse; the default is `...` meaning to consume the entire remaining buffer, but a positive integer can also be supplied to consume exactly that many bytes.

`BinaryBytes.parse(bfr: cs.buffer.CornuCopyBuffer)`: Consume `cls.PARSE_SIZE` bytes from the buffer and instantiate a new instance.

`BinaryBytes.promote(obj)`: Promote `obj` to a `BinaryBytes` instance. Other instances of `AbstractBinary` will be transcribed into the buffers. Otherwise use `BinarySingleValue.promote(obj)`.

`BinaryBytes.transcribe(self)`: Transcribe each value.

`BinaryBytes.value`: The internal list of `bytes` instances joined together. This is a property and may be expensive to compute for a large list.

`BinaryFixedBytes(class_name: str, length: int)`: Factory for an `AbstractBinary` subclass matching `length` bytes of data. The bytes are saved as the attribute `.data`.

Class `BinaryListValues(AbstractBinary)`: A list of values with a common parse specification, such as sample or Boxes in an ISO14496 Box structure.

`BinaryListValues.parse(bfr: cs.buffer.CornuCopyBuffer, count=None, *, end_offset=None, min_count=None, max_count=None, pt)`: Read values from `bfr`. Return a `BinaryListValues` containing the values.

Parameters:
- `count`: optional count of values to read; if specified, exactly this many values are expected.
- `end_offset`: an optional bounding end offset of the buffer.
- `min_count`: the least acceptable number of values.
- `max_count`: the most acceptable number of values.
- `pt`: a parse/transcribe specification as accepted by the `pt_spec()` factory. The values will be returned by its parse function.

`BinaryListValues.transcribe(self)`: Transcribe all the values.
`BinaryMultiStruct(class_name: str, struct_format: str, field_names: Union[str, List[str]] = 'value')`: A class factory for `AbstractBinary` `namedtuple` subclasses built around potentially complex `struct` formats.

Parameters:
- `class_name`: name for the generated class
- `struct_format`: the `struct` format string
- `field_names`: optional field name list, a space separated string or an iterable of strings; the default is `'value'`, intended for single field structs

Example:

    # an "access point" record from the .ap file
    Enigma2APInfo = BinaryStruct('Enigma2APInfo', '>QQ', 'pts offset')

    # a "cut" record from the .cuts file
    Enigma2Cut = BinaryStruct('Enigma2Cut', '>QL', 'pts type')

    >>> UInt16BE = BinaryStruct('UInt16BE', '>H')
    >>> UInt16BE.__name__
    'UInt16BE'
    >>> UInt16BE.format
    '>H'
    >>> UInt16BE.struct   #doctest: +ELLIPSIS
    <_struct.Struct object at ...>
    >>> field = UInt16BE.from_bytes(bytes((2,3)))
    >>> field
    UInt16BE('>H',value=515)
    >>> field.value
    515
`BinaryMultiValue(class_name, field_map, field_order=None)`: Construct a `SimpleBinary` subclass named `class_name` whose fields are specified by the mapping `field_map`.

The `field_map` is a mapping of field name to parse/transcribe specifications suitable for `pt_spec()`; these are all promoted by `pt_spec` into `AbstractBinary` subclasses.

The `field_order` is an optional ordering of the field names; the default comes from the iteration order of `field_map`.

Note for Python <3.6: if `field_order` is not specified it is constructed by iterating over `field_map`. Prior to Python 3.6, `dict`s do not provide a reliable order and should be accompanied by an explicit `field_order`. From 3.6 onward a `dict` is enough and its insertion order will dictate the default `field_order`.

For a fixed record structure the default `.parse` and `.transcribe` methods will suffice; they parse or transcribe each field in turn. Subclasses with variable records should override the `.parse` and `.transcribe` methods accordingly.

If the class has both `parse_value` and `transcribe_value` methods then the value itself will be directly stored. Otherwise the class is presumed to be a more complex subclass of `AbstractBinary` and the instance is stored.
Here is an example exhibiting various ways of defining each field:
- `n1`: defined with the `*_value` methods of `UInt8`, which return or transcribe the `int` from an unsigned 8 bit value; this stores a `BinarySingleValue` whose `.value` is an `int`
- `n2`: defined from the `UInt8` class, which parses an unsigned 8 bit value; this stores a `UInt8` instance (also a `BinarySingleValue` whose `.value` is an `int`)
- `n3`: like `n2`
- `data1`: defined with the `*_value` methods of `BSData`, which return or transcribe the data `bytes` from a run length encoded data chunk; this stores a `BinarySingleValue` whose `.value` is a `bytes`
- `data2`: defined from the `BSData` class which parses a run length encoded data chunk; this is a `BinarySingleValue` so we store its `bytes` value directly.
    >>> class BMV(BinaryMultiValue("BMV", {
    ...         'n1': (UInt8.parse_value, UInt8.transcribe_value),
    ...         'n2': UInt8,
    ...         'n3': UInt8,
    ...         'nd': ('>H4s', 'short bs'),
    ...         'data1': (
    ...             BSData.parse_value,
    ...             BSData.transcribe_value,
    ...         ),
    ...         'data2': BSData,
    ... })):
    ...     pass
    >>> BMV.FIELD_ORDER
    ['n1', 'n2', 'n3', 'nd', 'data1', 'data2']
    >>> bmv = BMV.from_bytes(b'\x11\x22\x77\x81\x82zyxw\x02AB\x04DEFG')
    >>> bmv.n1  #doctest: +ELLIPSIS
    17
    >>> bmv.n2
    34
    >>> bmv  #doctest: +ELLIPSIS
    BMV(n1=17, n2=34, n3=119, nd=nd('>H4s',short=33154,bs=b'zyxw'), data1=b'AB', data2=b'DEFG')
    >>> bmv.nd  #doctest: +ELLIPSIS
    nd('>H4s',short=33154,bs=b'zyxw')
    >>> bmv.nd.bs
    b'zyxw'
    >>> bytes(bmv.nd)
    b'\x81\x82zyxw'
    >>> bmv.data1
    b'AB'
    >>> bmv.data2
    b'DEFG'
    >>> bytes(bmv)
    b'\x11"w\x81\x82zyxw\x02AB\x04DEFG'
    >>> list(bmv.transcribe_flat())
    [b'\x11', b'"', b'w', b'\x81\x82zyxw', b'\x02', b'AB', b'\x04', b'DEFG']
`BinarySingleStruct(class_name: str, struct_format: str, field_names: Union[str, List[str]] = 'value')`: OBSOLETE BinaryStruct

A class factory for `AbstractBinary` `namedtuple` subclasses built around potentially complex `struct` formats.

Parameters:
- `class_name`: name for the generated class
- `struct_format`: the `struct` format string
- `field_names`: optional field name list, a space separated string or an iterable of strings; the default is `'value'`, intended for single field structs

Example:

    # an "access point" record from the .ap file
    Enigma2APInfo = BinaryStruct('Enigma2APInfo', '>QQ', 'pts offset')

    # a "cut" record from the .cuts file
    Enigma2Cut = BinaryStruct('Enigma2Cut', '>QL', 'pts type')

    >>> UInt16BE = BinaryStruct('UInt16BE', '>H')
    >>> UInt16BE.__name__
    'UInt16BE'
    >>> UInt16BE.format
    '>H'
    >>> UInt16BE.struct   #doctest: +ELLIPSIS
    <_struct.Struct object at ...>
    >>> field = UInt16BE.from_bytes(bytes((2,3)))
    >>> field
    UInt16BE('>H',value=515)
    >>> field.value
    515
Class `BinarySingleValue(AbstractBinary)`: A representation of a single value as the attribute `.value`.

Subclasses must implement:
- `parse` or `parse_value`
- `transcribe` or `transcribe_value`

`BinarySingleValue.__init__(self, value)`: Initialise `self` with `value`.

`BinarySingleValue.parse(bfr: cs.buffer.CornuCopyBuffer)`: Parse an instance from `bfr`. Subclasses must implement this method or `parse_value`.

`BinarySingleValue.parse_value(bfr: cs.buffer.CornuCopyBuffer)`: Parse a value from `bfr` based on this class. Subclasses must implement this method or `parse`.

`BinarySingleValue.parse_value_from_bytes(bs, offset=0, length=None, **kw)`: Parse a value from the bytes `bs` based on this class. Return `(value,offset)`.

`BinarySingleValue.scan_values(bfr: cs.buffer.CornuCopyBuffer, **kw)`: Scan `bfr`, yield values.

`BinarySingleValue.transcribe(self)`: Transcribe this instance as bytes. Subclasses must implement this method or `transcribe_value`.

`BinarySingleValue.transcribe_value(value)`: Transcribe `value` as bytes based on this class. Subclasses must implement this method or `transcribe`.

`BinarySingleValue.value_from_bytes(bs, **from_bytes_kw)`: Decode an instance from `bs` using `.from_bytes` and return the `.value` attribute. Keyword arguments are passed to `cls.from_bytes`.
`BinaryStruct(class_name: str, struct_format: str, field_names: Union[str, List[str]] = 'value')`: OBSOLETE BinaryStruct

A class factory for `AbstractBinary` `namedtuple` subclasses built around potentially complex `struct` formats.

Parameters:
- `class_name`: name for the generated class
- `struct_format`: the `struct` format string
- `field_names`: optional field name list, a space separated string or an iterable of strings; the default is `'value'`, intended for single field structs

Example:

    # an "access point" record from the .ap file
    Enigma2APInfo = BinaryStruct('Enigma2APInfo', '>QQ', 'pts offset')

    # a "cut" record from the .cuts file
    Enigma2Cut = BinaryStruct('Enigma2Cut', '>QL', 'pts type')

    >>> UInt16BE = BinaryStruct('UInt16BE', '>H')
    >>> UInt16BE.__name__
    'UInt16BE'
    >>> UInt16BE.format
    '>H'
    >>> UInt16BE.struct   #doctest: +ELLIPSIS
    <_struct.Struct object at ...>
    >>> field = UInt16BE.from_bytes(bytes((2,3)))
    >>> field
    UInt16BE('>H',value=515)
    >>> field.value
    515
Class `BinaryUTF16NUL(BinarySingleValue)`: A NUL terminated UTF-16 string.

`BinaryUTF16NUL.__init__(self, value: str, *, encoding: str)`

`BinaryUTF16NUL.VALUE_TYPE`

`BinaryUTF16NUL.parse(bfr: cs.buffer.CornuCopyBuffer, *, encoding: str)`: Parse the encoding and value and construct an instance.

`BinaryUTF16NUL.parse_value(bfr: cs.buffer.CornuCopyBuffer, *, encoding: str) -> str`: Read a NUL terminated UTF-16 string from `bfr`, return a `UTF16NULField`. The mandatory parameter `encoding` specifies the UTF16 encoding to use (`'utf_16_be'` or `'utf_16_le'`).

`BinaryUTF16NUL.transcribe(self)`: Transcribe `self.value` in UTF-16 with a terminating NUL.

`BinaryUTF16NUL.transcribe_value(value: str, encoding='utf-16')`: Transcribe `value` in UTF-16 with a terminating NUL.

`BinaryUTF8NUL.VALUE_TYPE`

`BinaryUTF8NUL.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> str`: Read a NUL terminated UTF-8 string from `bfr`, return field.

`BinaryUTF8NUL.transcribe_value(s)`: Transcribe the `value` in UTF-8 with a terminating NUL.
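As a standalone illustration of the NUL terminated UTF-8 layout, working on plain `bytes` rather than a buffer: `parse_utf8nul` and `transcribe_utf8nul` are hypothetical helper names, not functions of this module.

```python
def transcribe_utf8nul(s: str) -> bytes:
    """Encode s as UTF-8 followed by a single NUL byte."""
    return s.encode('utf-8') + b'\x00'

def parse_utf8nul(data: bytes, offset: int = 0):
    """Decode one NUL terminated UTF-8 string from data at offset.
    Return (string, offset_after_the_NUL).
    bytes.index raises ValueError if there is no terminator."""
    end = data.index(b'\x00', offset)
    return data[offset:end].decode('utf-8'), end + 1

packed = transcribe_utf8nul('abc') + transcribe_utf8nul('def')
first, offset = parse_utf8nul(packed)
second, offset = parse_utf8nul(packed, offset)
print(first, second, offset)
```

UTF-8 never produces a zero byte for any character other than NUL itself, which is why a single NUL is a safe terminator here; UTF-16 needs a two byte terminator instead.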
`binclass(*da, **dkw)`: A decorator for `dataclass`-like binary classes.

Example use:

    >>> @binclass
    ... class SomeStruct:
    ...     """A struct containing a count and some flags."""
    ...     count : UInt32BE
    ...     flags : UInt8
    >>> ss = SomeStruct(count=3, flags=0x04)
    >>> ss
    SomeStruct:SomeStruct__dataclass(count=UInt32BE('>L',value=3),flags=UInt8('B',value=4))
    >>> print(ss)
    SomeStruct(count=3,flags=4)
    >>> bytes(ss)
    b'\x00\x00\x00\x03\x04'
    >>> SomeStruct.promote(b'\x00\x00\x00\x03\x04')
    SomeStruct:SomeStruct__dataclass(count=UInt32BE('>L',value=3),flags=UInt8('B',value=4))

Extending an existing `@binclass` class, for example to add the body of a structure to some header part:

    >>> @binclass
    ... class HeaderStruct:
    ...     """A header containing a count and some flags."""
    ...     count : UInt32BE
    ...     flags : UInt8
    >>> @binclass
    ... class Packet(HeaderStruct):
    ...     body_text : BSString
    ...     body_data : BSData
    ...     body_longs : BinaryStruct(
    ...         'longs', '>LL', 'long1 long2'
    ...     )
    >>> packet = Packet(
    ...     count=5, flags=0x03,
    ...     body_text="hello",
    ...     body_data=b'xyzabc',
    ...     body_longs=(10,20),
    ... )
    >>> packet
    Packet:Packet__dataclass(count=UInt32BE('>L',value=5),flags=UInt8('B',value=3),body_text=BSString('hello'),body_data=BSData(b'xyzabc'),body_longs=longs('>LL',long1=10,long2=20))
    >>> print(packet)
    Packet(count=5,flags=3,body_text=hello,body_data=b'xyzabc',body_longs=longs(long1=10,long2=20))
    >>> packet.body_data
    b'xyzabc'
Class `bs(builtins.bytes)`: A `bytes` subclass with a compact `repr()`.

`bs.join(self, chunks)`: `bytes.join` but returning a `bs`.

`bs.promote(obj)`: Promote `bytes` or `memoryview` to a `bs`.

Class `BSData(BinarySingleValue)`: A run length encoded data chunk, with the length encoded as a `BSUInt`.

`BSData.data`: An alias for the `.value` attribute.

`BSData.data_offset`: The length of the length indicator, useful for computing the location of the raw data.

`BSData.data_offset_for(bs) -> int`: Compute the `data_offset` which would obtain for the bytes `bs`.

`BSData.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> bytes`: Parse the data from `bfr`.

`BSData.transcribe_value(data)`: Transcribe the payload length and then the payload.
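A length prefixed chunk of this kind can be sketched on plain `bytes` as below. This is an illustration under stated assumptions rather than the `BSData` code: the helper names `encode_length`, `transcribe_data` and `parse_data` are hypothetical, and the prefix uses the `BSUInt` scheme described elsewhere in this document.

```python
def encode_length(n: int) -> bytes:
    """Encode a length, 7 bits per octet, continuation octets high bit set."""
    octets = [n & 0x7f]
    n >>= 7
    while n > 0:
        octets.append(0x80 | (n & 0x7f))
        n >>= 7
    return bytes(reversed(octets))

def transcribe_data(payload: bytes) -> bytes:
    """The payload length followed by the payload itself."""
    return encode_length(len(payload)) + payload

def parse_data(data: bytes, offset: int = 0):
    """Read a length prefix then that many bytes; return (payload, offset)."""
    length = 0
    while True:
        b = data[offset]
        offset += 1
        length = (length << 7) | (b & 0x7f)
        if not b & 0x80:
            break
    end = offset + length
    if end > len(data):
        raise EOFError("insufficient data for payload")
    return data[offset:end], end

stream = transcribe_data(b'AB') + transcribe_data(b'DEFG')
print(stream)
print(parse_data(stream), parse_data(stream, 3))
```

The number of prefix octets plays the role of `data_offset`: 1 for payloads shorter than 128 bytes, 2 for a 200 byte payload, and so on.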
`BSSFloat.VALUE_TYPE`

`BSSFloat.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> float`: Parse a `BSSFloat` from a buffer and return the `float`.

`BSSFloat.transcribe_value(f)`: Transcribe a `float`.

Class `BSString(BinarySingleValue)`: A run length encoded string, with the length encoded as a `BSUInt`.

`BSString.VALUE_TYPE`

`BSString.parse_value(bfr: cs.buffer.CornuCopyBuffer, encoding='utf-8', errors='strict') -> str`: Parse a run length encoded string from `bfr`.

`BSString.transcribe_value(value: str, encoding='utf-8')`: Transcribe a string.
Class BSUInt(BinarySingleValue): A binary serialised unsigned int.
This uses a big endian byte encoding where continuation octets have their high bit set. The bits contributing to the value are in the low order 7 bits.
BSUInt.VALUE_TYPE
BSUInt.decode_bytes(data, offset=0) -> Tuple[int, int]: Decode an extensible byte serialised unsigned int from data at offset. Return value and new offset.
Continuation octets have their high bit set. The octets are big-endian.
If you just have a bytes instance, this is the go. If you're reading from a stream you're better off with parse or parse_value.
Examples:
>>> BSUInt.decode_bytes(b'\0')
(0, 1)
Note: there is of course the usual AbstractBinary.parse_bytes
but that constructs a buffer to obtain the individual bytes;
this static method will be more performant
if all you are doing is reading this serialisation
and do not already have a buffer.
BSUInt.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse an extensible byte serialised unsigned int from a buffer.
Continuation octets have their high bit set. The value is big-endian.
This is the go for reading from a stream. If you already have a bare bytes instance then the .decode_bytes static method is probably most efficient; there is of course the usual AbstractBinary.parse_bytes but that constructs a buffer to obtain the individual bytes.
BSUInt.transcribe_value(n): Encode an unsigned int as an extensible byte serialised octet sequence for decode. Return the bytes object.
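The encoding described for BSUInt can be sketched in plain Python as follows. This is an illustration written from the description above (7 value bits per octet, most significant group first, high bit set on every octet except the last), not the module's own code; encode_uint and decode_uint are hypothetical names.

```python
from typing import Tuple

def encode_uint(n: int) -> bytes:
    """Encode a non-negative int, most significant 7-bit group first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    octets = [n & 0x7F]                    # final octet: high bit clear
    n >>= 7
    while n > 0:
        octets.append(0x80 | (n & 0x7F))   # continuation octet: high bit set
        n >>= 7
    return bytes(reversed(octets))

def decode_uint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one value starting at offset; return (value, new_offset)."""
    n = 0
    while True:
        octet = data[offset]
        offset += 1
        n = (n << 7) | (octet & 0x7F)
        if not octet & 0x80:               # high bit clear: last octet
            return n, offset

stream = encode_uint(300) + encode_uint(5)
first, offset = decode_uint(stream)
second, offset = decode_uint(stream, offset)
```

Returning the new offset alongside the value lets a caller walk several values packed back to back in one bytes object, as the last three lines do.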
flatten(transcription) -> Iterable[bytes]: Flatten transcription into an iterable of Buffers. None of the Buffers will be empty.
This exists to allow subclass methods to easily return transcribable things (having a .transcribe method), ASCII strings or bytes or iterables or even None, in turn allowing them simply to return their superclass' chunks iterators directly instead of having to unpack them.
The supplied transcription may be any of the following:
None: yield nothing
an object with a .transcribe method: yield from flatten(transcription.transcribe())
a Buffer: yield the Buffer if it is not empty
a str: yield transcription.encode('ascii')
an iterable: yield from flatten(item) for each item in transcription
An example from the cs.iso14496.METABoxBody class:
def transcribe(self):
    yield super().transcribe()
    yield self.theHandler
    yield self.boxes
The binary classes flatten the result of the .transcribe method to obtain bytes instances for the object's binary transcription.
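The dispatch that flatten is described as performing can be approximated with a small recursive generator. This is a sketch written from the description above rather than the module's implementation: flatten_sketch and Header are hypothetical names, "Buffer" is approximated as bytes, bytearray or memoryview, and skipping empty strings is an assumption made here to keep the "no empty chunks" property.

```python
from typing import Iterable

def flatten_sketch(transcription) -> Iterable[bytes]:
    """Yield non-empty bytes-like chunks from a nested transcription."""
    if transcription is None:
        return                                   # None: yield nothing
    if hasattr(transcription, 'transcribe'):
        # transcribable object: flatten whatever its method returns
        yield from flatten_sketch(transcription.transcribe())
    elif isinstance(transcription, (bytes, bytearray, memoryview)):
        if len(transcription) > 0:               # never yield an empty chunk
            yield transcription
    elif isinstance(transcription, str):
        if transcription:                        # assumption: skip empty strings
            yield transcription.encode('ascii')
    else:
        for item in transcription:               # any other iterable: recurse
            yield from flatten_sketch(item)

class Header:
    """Hypothetical transcribable object mixing several return kinds."""
    def transcribe(self):
        yield b'HDR'
        yield None
        yield [b'', 'v1', (b'\x00\x01',)]

chunks = list(flatten_sketch([Header(), b'body']))
joined = b''.join(chunks)
```

Because every branch either yields bytes-like chunks or recurses, a transcribe method can freely yield nested structures and the caller still ends up with a flat sequence suitable for b''.join.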
Class Float64BE(Float64BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>d' and presents the attributes ['value'].
Float64BE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
Float64BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> float: Parse a value from bfr, return the value.
Float64BE.promote(obj): Promote a single value to an instance of cls.
Float64BE.transcribe(self): Transcribe via struct.pack.
Float64BE.transcribe_value(value): Transcribe a value back into bytes.
Class Float64LE(Float64LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<d' and presents the attributes ['value'].
Float64LE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
Float64LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> float: Parse a value from bfr, return the value.
Float64LE.promote(obj): Promote a single value to an instance of cls.
Float64LE.transcribe(self): Transcribe via struct.pack.
Float64LE.transcribe_value(value): Transcribe a value back into bytes.
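The two Float64 classes differ only in their struct format, so the byte-level effect can be seen with the standard library struct module alone. The snippet below is an illustration of the '>d' and '<d' formats, not of the classes themselves.

```python
import struct

value = 1.5
big = struct.pack('>d', value)       # IEEE 754 double, most significant octet first
little = struct.pack('<d', value)    # the same 8 octets in the opposite order

(from_big,) = struct.unpack('>d', big)
(from_little,) = struct.unpack('<d', little)
```

Both forms occupy 8 octets and decode back to the same float; only the octet order on the wire changes.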
Class Int16BE(Int16BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>h' and presents the attributes ['value'].
Int16BE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
Int16BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
Int16BE.promote(obj): Promote a single value to an instance of cls.
Int16BE.transcribe(self): Transcribe via struct.pack.
Int16BE.transcribe_value(value): Transcribe a value back into bytes.
Class Int16LE(Int16LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<h' and presents the attributes ['value'].
Int16LE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
Int16LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
Int16LE.promote(obj): Promote a single value to an instance of cls.
Int16LE.transcribe(self): Transcribe via struct.pack.
Int16LE.transcribe_value(value): Transcribe a value back into bytes.
Class Int32BE(Int32BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>l' and presents the attributes ['value'].
Int32BE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
Int32BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
Int32BE.promote(obj): Promote a single value to an instance of cls.
Int32BE.transcribe(self): Transcribe via struct.pack.
Int32BE.transcribe_value(value): Transcribe a value back into bytes.
Class Int32LE(Int32LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<l' and presents the attributes ['value'].
Int32LE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
Int32LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
Int32LE.promote(obj): Promote a single value to an instance of cls.
Int32LE.transcribe(self): Transcribe via struct.pack.
Int32LE.transcribe_value(value): Transcribe a value back into bytes.
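For the signed integer formats above, the choice of format string decides both the width and how the octets are interpreted. The following standard library snippet (an illustration of the struct formats, not of the classes) reads the same two octets three ways.

```python
import struct

raw = struct.pack('>h', -2)                  # signed 16 bit, big endian
(as_unsigned,) = struct.unpack('>H', raw)    # same octets read as unsigned
(as_little,) = struct.unpack('<h', raw)      # same octets read little endian

size_16 = struct.calcsize('>h')              # width of the 16 bit formats
size_32 = struct.calcsize('>l')              # width of the 32 bit formats
```

Picking the wrong signedness or byte order does not raise an error, it silently yields a different number, which is why the format is fixed per class.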
is_single_value(obj): Test whether obj is a single value binary object.
This currently recognises BinarySingleValue instances and tuple based AbstractBinary instances of length 1.
parse_offsets(*da, **dkw): Decorate parse (usually an AbstractBinary class method) to record the buffer starting offset as self.offset and the buffer post parse offset as self.end_offset.
If the decorator parameter report is true, call bfr.report_offset() with the starting offset at the end of the parse.
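The idea of recording offsets around a parse can be sketched with an ordinary decorator. Everything below is hypothetical scaffolding (TinyBuffer, record_offsets, Pair); it assumes only that the buffer exposes its current position as .offset, and it leaves out the report parameter described above.

```python
import functools

class TinyBuffer:
    """Hypothetical stand-in for a buffer that tracks its read offset."""
    def __init__(self, data: bytes):
        self.data = data
        self.offset = 0

    def take(self, size: int) -> bytes:
        chunk = self.data[self.offset:self.offset + size]
        if len(chunk) != size:
            raise EOFError("not enough data")
        self.offset += size
        return chunk

def record_offsets(parse):
    """Wrap a parse(cls, bfr) function to note where the parse began and ended."""
    @functools.wraps(parse)
    def wrapper(cls, bfr, **kw):
        start = bfr.offset
        instance = parse(cls, bfr, **kw)
        instance.offset = start            # where this object began
        instance.end_offset = bfr.offset   # where the buffer stood afterwards
        return instance
    return wrapper

class Pair:
    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b

    @classmethod
    @record_offsets
    def parse(cls, bfr):
        a, b = bfr.take(2)                 # iterating bytes yields ints
        return cls(a, b)

bfr = TinyBuffer(b'\x00\x01\x02\x03')
bfr.take(1)                                # skip one octet first
pair = Pair.parse(bfr)
```

Keeping both offsets on the instance makes it possible to relate a parsed object back to its position in the original stream without the parse method doing any bookkeeping itself.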
pt_spec(pt, name=None, value_type=None, as_repr=None, as_str=None): Convert a parse/transcribe specification pt into an AbstractBinary subclass.
This is largely used to provide flexibility in the specifications for the BinaryMultiValue factory but can also be used as a factory for other simple classes.
If the specification pt is a subclass of AbstractBinary this is returned directly.
If pt is a (str,str) 2-tuple the values are presumed to be a format string for struct.Struct and field names separated by spaces; a new BinaryStruct class is created from these and returned.
Otherwise two functions f_parse_value(bfr) and f_transcribe_value(value) are obtained and used to construct a new BinarySingleValue class as follows:
If pt has .parse_value and .transcribe_value callable attributes, use those for f_parse_value and f_transcribe_value respectively.
Otherwise, if pt is an int define f_parse_value to obtain exactly that many bytes from a buffer and f_transcribe_value to return those bytes directly.
Otherwise presume pt is a 2-tuple of (f_parse_value,f_transcribe_value).
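The last three cases above amount to choosing a pair of functions from the shape of pt. A minimal sketch of just that selection step follows; spec_functions is a hypothetical name, the AbstractBinary and (str,str) cases are omitted, and io.BytesIO stands in for the buffer so the example runs with the standard library only.

```python
import io

def spec_functions(pt):
    """Return (f_parse_value, f_transcribe_value) chosen from pt's shape."""
    parse = getattr(pt, 'parse_value', None)
    transcribe = getattr(pt, 'transcribe_value', None)
    if callable(parse) and callable(transcribe):
        return parse, transcribe           # pt already supplies both functions
    if isinstance(pt, int):
        def parse_fixed(bfr):
            data = bfr.read(pt)            # exactly pt bytes, or fail
            if len(data) != pt:
                raise EOFError("short read: wanted %d bytes" % pt)
            return data
        def transcribe_fixed(value):
            return value                   # the bytes are their own transcription
        return parse_fixed, transcribe_fixed
    f_parse_value, f_transcribe_value = pt  # presume a 2-tuple of callables
    return f_parse_value, f_transcribe_value

parse4, transcribe4 = spec_functions(4)
chunk = parse4(io.BytesIO(b'abcdef'))

parse_u8, transcribe_u8 = spec_functions(
    (lambda bfr: bfr.read(1)[0], lambda n: bytes([n]))
)
octet = parse_u8(io.BytesIO(b'\x2a'))
```

Once a pair of functions is in hand, building a single-value class around them is the same for every kind of specification, which is presumably why the int and tuple shorthands are accepted.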
Class SimpleBinary(types.SimpleNamespace, AbstractBinary): Abstract binary class based on a SimpleNamespace, thus providing a nice __str__ and a keyword based __init__. Implementors must still define .parse and .transcribe.
To constrain the arguments passed to __init__, define an __init__ which accepts specific keyword arguments and pass through to super().__init__(). Example:
def __init__(self, *, field1=None, field2):
    """ Accept only `field1` (optional)
        and `field2` (mandatory).
    """
    super().__init__(field1=field1, field2=field2)
struct_field_types(struct_format: str, field_names: Union[str, Iterable[str]]) -> Mapping[str, type]: Construct a dict mapping field names to struct return types.
Example:
>>> struct_field_types('>Hs', 'count text_bs')
{'count': <class 'int'>, 'text_bs': <class 'bytes'>}
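One way to obtain such a mapping with only the standard library is to unpack a zero-filled buffer of the right size and inspect the types of the results. This is a sketch of that idea, not necessarily how struct_field_types works internally; field_types_sketch is a hypothetical name.

```python
import struct
from typing import Dict, Iterable, Union

def field_types_sketch(
    struct_format: str, field_names: Union[str, Iterable[str]]
) -> Dict[str, type]:
    """Map each field name to the Python type struct returns for it."""
    if isinstance(field_names, str):
        names = field_names.split()
    else:
        names = list(field_names)
    zeroed = bytes(struct.calcsize(struct_format))   # all-zero input of the right size
    values = struct.unpack(struct_format, zeroed)
    if len(values) != len(names):
        raise ValueError(
            "%d field names for %d struct fields" % (len(names), len(values))
        )
    return {name: type(value) for name, value in zip(names, values)}

types = field_types_sketch('>Hs', 'count text_bs')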
Class UInt16BE(UInt16BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>H' and presents the attributes ['value'].
UInt16BE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
UInt16BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
UInt16BE.promote(obj): Promote a single value to an instance of cls.
UInt16BE.transcribe(self): Transcribe via struct.pack.
UInt16BE.transcribe_value(value): Transcribe a value back into bytes.
Class UInt16LE(UInt16LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<H' and presents the attributes ['value'].
UInt16LE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
UInt16LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
UInt16LE.promote(obj): Promote a single value to an instance of cls.
UInt16LE.transcribe(self): Transcribe via struct.pack.
UInt16LE.transcribe_value(value): Transcribe a value back into bytes.
Class UInt32BE(UInt32BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>L' and presents the attributes ['value'].
UInt32BE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
UInt32BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
UInt32BE.promote(obj): Promote a single value to an instance of cls.
UInt32BE.transcribe(self): Transcribe via struct.pack.
UInt32BE.transcribe_value(value): Transcribe a value back into bytes.
Class UInt32LE(UInt32LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<L' and presents the attributes ['value'].
UInt32LE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
UInt32LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
UInt32LE.promote(obj): Promote a single value to an instance of cls.
UInt32LE.transcribe(self): Transcribe via struct.pack.
UInt32LE.transcribe_value(value): Transcribe a value back into bytes.
Class UInt64BE(UInt64BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>Q' and presents the attributes ['value'].
UInt64BE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
UInt64BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
UInt64BE.promote(obj): Promote a single value to an instance of cls.
UInt64BE.transcribe(self): Transcribe via struct.pack.
UInt64BE.transcribe_value(value): Transcribe a value back into bytes.
Class UInt64LE(UInt64LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<Q' and presents the attributes ['value'].
UInt64LE.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
UInt64LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
UInt64LE.promote(obj): Promote a single value to an instance of cls.
UInt64LE.transcribe(self): Transcribe via struct.pack.
UInt64LE.transcribe_value(value): Transcribe a value back into bytes.
Class UInt8(UInt8, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format 'B' and presents the attributes ['value'].
UInt8.parse(bfr: cs.buffer.CornuCopyBuffer): Parse from bfr via struct.unpack.
UInt8.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int: Parse a value from bfr, return the value.
UInt8.promote(obj): Promote a single value to an instance of cls.
UInt8.transcribe(self): Transcribe via struct.pack.
UInt8.transcribe_value(value): Transcribe a value back into bytes.
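All of the fixed width classes above share one shape: a namedtuple with a single value field tied to a struct format. A standalone sketch of how such a class could be generated is shown below. It is not the module's factory; single_value_class, from_bytes and UInt16BESketch are hypothetical names, and it parses from a bytes object rather than from a CornuCopyBuffer.

```python
import struct
from collections import namedtuple

def single_value_class(class_name: str, struct_format: str):
    """Build a namedtuple class with one 'value' field tied to a struct format."""
    packer = struct.Struct(struct_format)

    class SingleValue(namedtuple(class_name, ['value'])):
        __slots__ = ()

        @classmethod
        def from_bytes(cls, data: bytes, offset: int = 0):
            (value,) = packer.unpack_from(data, offset)   # decode one field
            return cls(value)

        def transcribe(self) -> bytes:
            return packer.pack(self.value)                # encode it again

        def __bytes__(self) -> bytes:
            return self.transcribe()

    SingleValue.__name__ = class_name
    return SingleValue

UInt16BESketch = single_value_class('UInt16BESketch', '>H')
n = UInt16BESketch.from_bytes(b'\x01\x02')
round_trip = bytes(n)
```

Generating the classes from a format string keeps the big endian, little endian, signed and unsigned variants consistent, since the only thing that varies between them is the format.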
Release 20250501:
Release 20240630:
Release 20240422: New _BinaryMultiValue_Base.for_json() method returning a dict containing the fields.
Release 20240316: Fixed release upload artifacts.
Release 20240201: BREAKING CHANGE: drop the long deprecated PacketField related classes.
Release 20231129: BinaryMultiStruct.parse: promote the buffer arguments to a CornuCopyBuffer.
Release 20230401:
bfr parameter may be any object acceptable to CornuCopyBuffer.promote.
Release 20230212:
Release 20221206: Documentation fix.
Release 20220605: BinaryMixin: replace scan_file with scan_fspath, as the former left uncertainty about the amount of the file consumed.
Release 20210316:
Release 20210306: MAJOR RELEASE: The PacketField classes and friends were hard to use; this release supplied a suite of easier to use and more consistent Binary* classes, and ports most of those things based on the old scheme to the new scheme.
Release 20200229:
.length attribute to struct based packet classes providing the data length of the structure (struct.Struct.size).
add_deferred_field method to consume the raw data for a field for parsing later (done automatically if the attribute is accessed).
@deferred_field decorator for the parser for that stashed data.
Release 20191230.3: Docstring tweak.
Release 20191230.2: Documentation updates.
Release 20191230.1: Docstring updates. Semantic changes were in the previous release.
Release 20191230:
skip_fields parameter to omit some field names.
.transcribe_value method which makes a new instance and calls its .transcribe method.
Release 20190220:
Release 20181231: flatten: do not yield zero length bytelike objects, can be misread as EOF on some streams.
Release 20181108:
.value attribute until end of input.
Release 20180823:
Release 20180810.2: Documentation improvements.
Release 20180810.1: Improve module description.
Release 20180810: BytesesField.from_buffer: make use of the buffer's skipto method if discard_data is true.
Release 20180805:
Release 20180801: Initial PyPI release.
We found that cs-binary demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.