
cs-binary
Facilities associated with binary data parsing and transcription. The classes in this module support easy parsing of binary data structures, returning instances with the binary data decoded into attributes and capable of transcribing themselves in binary form (trivially via `bytes(instance)` and also otherwise).
Latest release 20250501:
See cs.iso14496 for an ISO 14496 (eg MPEG4) parser
built using this module.
Note: this module requires Python 3.6+ because various default
behaviours rely on dicts preserving their insert order.
Terminology used below:
buffer: an instance of cs.buffer.CornuCopyBuffer,
which manages an iterable of bytes-like values
and has various useful methods for parsing.
chunk: a collections.abc.Buffer;
almost always a bytes instance or a memoryview,
but in principle also things like bytearray.
The CornuCopyBuffer is the basis for all parsing, as it manages
a variety of input sources such as files, memory, sockets etc.
It also has factory methods to make one from a variety of sources
such as bytes, iterables, binary files, mmapped files,
TCP data streams, etc.
All the binary classes subclass AbstractBinary.
Amongst other things, this means that the binary transcription
can be had simply from bytes(instance),
although there are more transcription methods provided
for when greater flexibility is desired.
It also means that all classes have parse* and scan* methods
for parsing binary data streams.
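As a rough illustration of the parse/transcribe contract described above, here is a standard-library-only sketch. It does not use cs.binary: the class name `UInt16BESketch` and its `parse_bytes` signature are hypothetical stand-ins, and the real classes parse from a CornuCopyBuffer rather than directly from bytes.

```python
import struct


class UInt16BESketch:
    """Hypothetical stand-in for a binary class; not the cs.binary API."""

    STRUCT = struct.Struct('>H')

    def __init__(self, value: int):
        self.value = value

    @classmethod
    def parse_bytes(cls, bs: bytes, offset: int = 0):
        """Parse one instance from bs, return (instance, post_offset)."""
        (value,) = cls.STRUCT.unpack_from(bs, offset)
        return cls(value), offset + cls.STRUCT.size

    def transcribe(self):
        """Yield the chunks making up the binary form."""
        yield self.STRUCT.pack(self.value)

    def __bytes__(self):
        # bytes(instance) is simply the joined transcription
        return b''.join(self.transcribe())


field, offset = UInt16BESketch.parse_bytes(b'\x02\x03rest')
assert (field.value, offset) == (515, 2)
assert bytes(field) == b'\x02\x03'
```

The point of the design is that one transcribe method serves every output path: bytes(instance), writing to a file, or embedding in a larger structure.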
The .parse(cls,bfr) class method reads binary data from a
buffer and returns an instance.
The .transcribe(self) method may be a regular function or a
generator which returns or yields things which can be transcribed
as bytes via the flatten function.
See the AbstractBinary.transcribe docstring for specifics; this might:
return bytes
yield AbstractBinary instances such as each
field (which get transcribed in turn) or an iterable of these
things
There are 6 main ways an implementor might base their data structures:
BinaryStruct: a factory for classes based
on a struct.Struct format string with multiple values;
this also builds a namedtuple subclass
@binclass: a dataclass-like specification of a binary structure
BinarySingleValue: a base class for subclasses
parsing and transcribing a single value, such as UInt8 or
BinaryUTF8NUL
BinaryMultiValue: a factory for subclasses
parsing and transcribing multiple values
with no variation
SimpleBinary: a base class for subclasses
with custom .parse and .transcribe methods,
for structures with variable fields;
this makes a SimpleNamespace subclass
These can all be mixed as appropriate to your needs.
You can also instantiate objects directly; there's no requirement for the source information to be binary.
There are several presupplied subclasses for common basic types
such as UInt32BE (an unsigned 32 bit big endian integer).
BinaryStruct, from cs.iso14496
A simple struct style definition for 9 longs:
Matrix9Long = BinaryStruct(
    'Matrix9Long', '>lllllllll', 'v0 v1 v2 v3 v4 v5 v6 v7 v8'
)
Per the struct.Struct format string, this parses 9 big endian longs
and returns a namedtuple with 9 fields.
Like all the AbstractBinary subclasses, parsing an instance from a
stream can be done like this:
m9 = Matrix9Long.parse(bfr)
print("m9.v3", m9.v3)
and writing its binary form to a file like this:
f.write(bytes(m9))
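For readers who want to see what such a definition amounts to, the following sketch reproduces the same 9 field layout with only the standard struct and collections modules. The names `M9_STRUCT`, `Matrix9LongSketch`, `parse_m9` and `transcribe_m9` are hypothetical and are not part of cs.binary, and the sample values are arbitrary.

```python
import struct
from collections import namedtuple

# Hypothetical stdlib-only analogue of the Matrix9Long definition above.
M9_STRUCT = struct.Struct('>lllllllll')
Matrix9LongSketch = namedtuple(
    'Matrix9LongSketch', 'v0 v1 v2 v3 v4 v5 v6 v7 v8'
)


def parse_m9(bs: bytes, offset: int = 0) -> Matrix9LongSketch:
    """Decode 9 big endian signed 32 bit values starting at offset."""
    return Matrix9LongSketch(*M9_STRUCT.unpack_from(bs, offset))


def transcribe_m9(m9: Matrix9LongSketch) -> bytes:
    """Encode the 9 values back into their 36 byte binary form."""
    return M9_STRUCT.pack(*m9)


# arbitrary sample values, round tripped through the binary form
data = M9_STRUCT.pack(0x10000, 0, 0, 0, 0x10000, 0, 0, 0, 0x40000000)
m9 = parse_m9(data)
print("m9.v4", m9.v4)
assert transcribe_m9(m9) == data
```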
@binclass, also from cs.iso14496
For reasons to do with the larger MP4 parser this uses an extra
decorator @boxbodyclass which is just a shim for the @binclass
decorator with an additional step.
@boxbodyclass
class FullBoxBody2(BoxBody):
    """ A common extension of a basic `BoxBody`, with a version and flags field.
        ISO14496 section 4.2.
    """
    version: UInt8
    flags0: UInt8
    flags1: UInt8
    flags2: UInt8
    @property
    def flags(self):
        """ The flags value, computed from the 3 flag bytes.
        """
        return (self.flags0 << 16) | (self.flags1 << 8) | self.flags2
This has 4 fields, each an unsigned 8 bit value (one byte),
and a property .flags which is the overall flags value for
the box header.
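The arithmetic behind the .flags property can be checked in isolation. This small sketch uses plain functions with hypothetical names (`combine_flags`, `split_flags`); the splitting function is an addition for illustration and is not something the source describes.

```python
def combine_flags(flags0: int, flags1: int, flags2: int) -> int:
    """Combine 3 flag bytes into one 24 bit value, most significant first."""
    return (flags0 << 16) | (flags1 << 8) | flags2


def split_flags(flags: int) -> tuple:
    """Inverse of combine_flags: split a 24 bit value into its 3 bytes."""
    return (flags >> 16) & 0xff, (flags >> 8) & 0xff, flags & 0xff


assert combine_flags(0x01, 0x02, 0x03) == 0x010203
assert split_flags(0x010203) == (1, 2, 3)
```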
You should look at the source code for the TKHDBoxBody from
that module for an example of a @binclass with a variable
collection of fields based on an earlier version field value.
BinarySingleValue, the BSUInt from this module
The BSUInt transcribes an unsigned integer of arbitrary size
as a big endian variable sized sequence of bytes.
I understand this is the same scheme MIDI uses.
You can define a BinarySingleValue with conventional .parse()
and .transcribe() methods but it is usually expedient to instead
provide .parse_value() and .transcribe_value() methods, which
return or transcribe the core value (the unsigned integer in
this case).
class BSUInt(BinarySingleValue, value_type=int):
    """ A binary serialised unsigned `int`.
        This uses a big endian byte encoding where continuation octets
        have their high bit set. The bits contributing to the value
        are in the low order 7 bits.
    """
    @staticmethod
    def parse_value(bfr: CornuCopyBuffer) -> int:
        """ Parse an extensible byte serialised unsigned `int` from a buffer.
            Continuation octets have their high bit set.
            The value is big-endian.
            This is the go for reading from a stream. If you already have
            a bare bytes instance then the `.decode_bytes` static method
            is probably most efficient;
            there is of course the usual `AbstractBinary.parse_bytes`
            but that constructs a buffer to obtain the individual bytes.
        """
        n = 0
        b = 0x80
        while b & 0x80:
            b = bfr.byte0()
            n = (n << 7) | (b & 0x7f)
        return n
    # pylint: disable=arguments-renamed
    @staticmethod
    def transcribe_value(n):
        """ Encode an unsigned int as an extensible byte serialised octet
            sequence for decode. Return the bytes object.
        """
        bs = [n & 0x7f]
        n >>= 7
        while n > 0:
            bs.append(0x80 | (n & 0x7f))
            n >>= 7
        return bytes(reversed(bs))
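To experiment with this encoding without installing the module, the same scheme can be written against plain bytes. This is a sketch following the algorithm shown above; the function names `encode_bsuint` and `decode_bsuint` are hypothetical, and the negative value check is an added assumption.

```python
def encode_bsuint(n: int) -> bytes:
    """Encode a nonnegative int, 7 bits per octet, most significant first."""
    if n < 0:
        raise ValueError("n must be nonnegative")
    octets = [n & 0x7f]  # the final octet has its high bit clear
    n >>= 7
    while n > 0:
        # every earlier octet is a continuation octet: high bit set
        octets.append(0x80 | (n & 0x7f))
        n >>= 7
    return bytes(reversed(octets))


def decode_bsuint(data: bytes, offset: int = 0):
    """Decode one value from data at offset, return (value, new_offset)."""
    n = 0
    while True:
        b = data[offset]
        offset += 1
        n = (n << 7) | (b & 0x7f)
        if not b & 0x80:
            return n, offset


assert encode_bsuint(0) == b'\x00'
assert encode_bsuint(128) == b'\x81\x00'
assert decode_bsuint(b'\x82\x2c') == (300, 2)
```

Values below 128 occupy a single octet, which is why short length prefixes in the other examples are one byte.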
BinaryMultiValue
A BinaryMultiValue is a class factory for making a multi field
AbstractBinary from variable field descriptions.
You're probably better off using @binclass these days.
See the BinaryMultiValue docstring for details and an example.
An MP4 ELST box:
class ELSTBoxBody(FullBoxBody):
    """ An 'elst' Edit List FullBoxBody - section 8.6.6.
    """
    V0EditEntry = BinaryStruct(
        'ELSTBoxBody_V0EditEntry', '>Llhh',
        'segment_duration media_time media_rate_integer media_rate_fraction'
    )
    V1EditEntry = BinaryStruct(
        'ELSTBoxBody_V1EditEntry', '>Qqhh',
        'segment_duration media_time media_rate_integer media_rate_fraction'
    )
    @property
    def entry_class(self):
        """ The class representing each entry.
        """
        return self.V1EditEntry if self.version == 1 else self.V0EditEntry
    @property
    def entry_count(self):
        """ The number of entries.
        """
        return len(self.entries)
    def parse_fields(self, bfr: CornuCopyBuffer):
        """ Parse the fields of an `ELSTBoxBody`.
        """
        super().parse_fields(bfr)
        assert self.version in (0, 1)
        entry_count = UInt32BE.parse_value(bfr)
        self.entries = list(self.entry_class.scan(bfr, count=entry_count))
    def transcribe(self):
        """ Transcribe an `ELSTBoxBody`.
        """
        yield super().transcribe()
        yield UInt32BE.transcribe_value(self.entry_count)
        yield map(self.entry_class.transcribe, self.entries)
An Edit List box comes in a version 0 and version 1 form, differing
in the field sizes in the edit entries. This defines two
flavours of edit entry structure and a property to return the
suitable class based on the version field. The parse_fields()
method is called from the base BoxBody class' parse() method
to collect additional fields for any box. For this box it collects a
32 bit entry_count and then a list of that many edit entries.
The transcription yields corresponding values.
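The version dependent layout described above can be sketched with the standard struct module alone. The two format strings are copied from the class above; everything else (the function names, and the assumption that the payload starts at the entry_count, after the version and flags bytes) is hypothetical and is not the cs.iso14496 implementation.

```python
import struct

# Entry layouts copied from the two BinaryStruct formats above.
V0_ENTRY = struct.Struct('>Llhh')
V1_ENTRY = struct.Struct('>Qqhh')


def parse_elst_entries(version: int, payload: bytes):
    """Parse the entry_count and that many entries, return (entries, offset)."""
    if version not in (0, 1):
        raise ValueError("unsupported version %r" % (version,))
    entry_struct = V1_ENTRY if version == 1 else V0_ENTRY
    (entry_count,) = struct.unpack_from('>L', payload, 0)
    offset = 4
    entries = []
    for _ in range(entry_count):
        entries.append(entry_struct.unpack_from(payload, offset))
        offset += entry_struct.size
    return entries, offset


def transcribe_elst_entries(version: int, entries) -> bytes:
    """Transcribe the entry_count followed by each entry."""
    entry_struct = V1_ENTRY if version == 1 else V0_ENTRY
    return struct.pack('>L', len(entries)) + b''.join(
        entry_struct.pack(*entry) for entry in entries
    )


entries = [(1000, -1, 1, 0), (2000, 0, 1, 0)]
payload = transcribe_elst_entries(0, entries)
assert parse_elst_entries(0, payload) == (entries, 28)
```

A version 0 entry is 12 bytes and a version 1 entry is 20 bytes, so the same two entries occupy 28 or 44 bytes including the count.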
Short summary:
AbstractBinary: Abstract class for all Binary* implementations, specifying the abstract parse and transcribe methods and providing various helper methods.
BinaryBytes: A list of bytes parsed directly from the native iteration of the buffer. Subclasses are initialised with a consume= class parameter indicating how many bytes to consume on parse; the default is ... meaning to consume the entire remaining buffer, but a positive integer can also be supplied to consume exactly that many bytes.
BinaryFixedBytes: Factory for an AbstractBinary subclass matching length bytes of data. The bytes are saved as the attribute .data.
BinaryListValues: A list of values with a common parse specification, such as sample or Boxes in an ISO14496 Box structure.
BinaryMultiStruct: A class factory for AbstractBinary namedtuple subclasses built around potentially complex struct formats.
BinaryMultiValue: Construct a SimpleBinary subclass named class_name whose fields are specified by the mapping field_map.
BinarySingleStruct: OBSOLETE BinaryStruct.
BinarySingleValue: A representation of a single value as the attribute .value.
BinaryStruct: OBSOLETE BinaryStruct.
BinaryUTF16NUL: A NUL terminated UTF-16 string.
BinaryUTF8NUL: A NUL terminated UTF-8 string.
binclass: A decorator for dataclass-like binary classes.
bs: A bytes subclass with a compact repr().
BSData: A run length encoded data chunk, with the length encoded as a BSUInt.
BSSFloat: A float transcribed as a BSString of str(float).
BSString: A run length encoded string, with the length encoded as a BSUInt.
BSUInt: A binary serialised unsigned int.
flatten: Flatten transcription into an iterable of Buffers. None of the Buffers will be empty.
Float64BE: An AbstractBinary namedtuple which parses and transcribes the struct format '>d' and presents the attributes ['value'].
Float64LE: An AbstractBinary namedtuple which parses and transcribes the struct format '<d' and presents the attributes ['value'].
Int16BE: An AbstractBinary namedtuple which parses and transcribes the struct format '>h' and presents the attributes ['value'].
Int16LE: An AbstractBinary namedtuple which parses and transcribes the struct format '<h' and presents the attributes ['value'].
Int32BE: An AbstractBinary namedtuple which parses and transcribes the struct format '>l' and presents the attributes ['value'].
Int32LE: An AbstractBinary namedtuple which parses and transcribes the struct format '<l' and presents the attributes ['value'].
is_single_value: Test whether obj is a single value binary object.
parse_offsets: Decorate parse (usually an AbstractBinary class method) to record the buffer starting offset as self.offset and the buffer post parse offset as self.end_offset. If the decorator parameter report is true, call bfr.report_offset() with the starting offset at the end of the parse.
pt_spec: Convert a parse/transcribe specification pt into an AbstractBinary subclass.
SimpleBinary: Abstract binary class based on a SimpleNamespace, thus providing a nice __str__ and a keyword based __init__. Implementors must still define .parse and .transcribe.
struct_field_types: Construct a dict mapping field names to struct return types.
UInt16BE: An AbstractBinary namedtuple which parses and transcribes the struct format '>H' and presents the attributes ['value'].
UInt16LE: An AbstractBinary namedtuple which parses and transcribes the struct format '<H' and presents the attributes ['value'].
UInt32BE: An AbstractBinary namedtuple which parses and transcribes the struct format '>L' and presents the attributes ['value'].
UInt32LE: An AbstractBinary namedtuple which parses and transcribes the struct format '<L' and presents the attributes ['value'].
UInt64BE: An AbstractBinary namedtuple which parses and transcribes the struct format '>Q' and presents the attributes ['value'].
UInt64LE: An AbstractBinary namedtuple which parses and transcribes the struct format '<Q' and presents the attributes ['value'].
UInt8: An AbstractBinary namedtuple which parses and transcribes the struct format 'B' and presents the attributes ['value'].
Module contents:
Class AbstractBinary(cs.deco.Promotable): Abstract class for all Binary* implementations, specifying the abstract parse and transcribe methods
and providing various helper methods.
Naming conventions:
parse* methods parse a single instance from a buffer
scan* methods are generators yielding successive instances from a buffer
AbstractBinary.__bytes__(self):
The binary transcription as a single bytes object.
AbstractBinary.__len__(self):
Compute the length by running a transcription and measuring it.
AbstractBinary.__str__(self, attr_names=None, attr_choose=None, str_func=None):
The string summary of this object.
If called explicitly rather than via str() the following
optional parameters may be supplied:
attr_names: an iterable of str naming the attributes to include;
the default is the keys of self.__dict__
attr_choose: a callable to select amongst the attribute names;
the default is to choose names which do not start with an underscore
str_func: a callable returning the string form of an attribute value;
the default returns cropped_repr(v) where v is the value's .value
attribute for single value objects otherwise the object itself
AbstractBinary.from_bytes(bs, **parse_bytes_kw):
Factory to parse an instance from the
bytes bs starting at offset.
Returns the new instance.
Raises ValueError if bs is not entirely consumed.
Raises EOFError if bs has insufficient data.
Keyword parameters are passed to the .parse_bytes method.
This relies on the cls.parse method for the parse.
AbstractBinary.load(f):
Load an instance from the file f
which may be a filename or an open file as for AbstractBinary.scan.
Return the instance or None if the file is empty.
AbstractBinary.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse an instance of cls from the buffer bfr.
AbstractBinary.parse_bytes(bs, offset=0, length=None, **parse_kw):
Factory to parse an instance from the
bytes bs starting at offset.
Returns (instance,offset) being the new instance and the post offset.
Raises EOFError if bs has insufficient data.
The parameters offset and length are passed to the
CornuCopyBuffer.from_bytes factory.
Other keyword parameters are passed to the .parse method.
This relies on the cls.parse method for the parse.
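The difference between the two factories is easiest to see side by side: parse_bytes tolerates trailing data and reports where it stopped, while from_bytes insists that everything is consumed. The sketch below mirrors only the documented return values and exceptions for a single 32 bit big endian value; the function names are hypothetical and the real methods go through CornuCopyBuffer.

```python
import struct

UINT32BE = struct.Struct('>L')


def parse_bytes_sketch(bs: bytes, offset: int = 0):
    """Parse one value at offset, return (value, post_offset)."""
    end = offset + UINT32BE.size
    if end > len(bs):
        raise EOFError("insufficient data: need %d bytes" % UINT32BE.size)
    (value,) = UINT32BE.unpack_from(bs, offset)
    return value, end


def from_bytes_sketch(bs: bytes):
    """Parse one value which must account for all of bs."""
    value, offset = parse_bytes_sketch(bs)
    if offset < len(bs):
        raise ValueError("unparsed data at offset %d" % offset)
    return value


assert parse_bytes_sketch(b'\x00\x00\x00\x05xyz') == (5, 4)
assert from_bytes_sketch(b'\x00\x00\x00\x05') == 5
```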
AbstractBinary.save(self, f):
Save this instance to the file f
which may be a filename or an open file.
Return the length of the transcription.
AbstractBinary.scan(bfr: cs.buffer.CornuCopyBuffer, count=None, *, min_count=None, max_count=None, with_offsets=False, **parse_kw):
A generator to scan the buffer bfr for repeated instances of cls
until end of input, and yield them.
Note that if bfr is not already a CornuCopyBuffer
it is promoted to CornuCopyBuffer from several types
such as filenames etc; see CornuCopyBuffer.promote.
Parameters:
bfr: the buffer to scan, or any object suitable for CornuCopyBuffer.promote
count: the required number of instances to scan,
equivalent to setting min_count=count and max_count=count
min_count: the minimum number of instances to scan
max_count: the maximum number of instances to scan
with_offsets: optional flag, default False;
if true yield (pre_offset,obj,post_offset), otherwise just obj
It is an error to specify both count and one of min_count or max_count.
Other keyword arguments are passed to self.parse().
Scanning stops after max_count instances (if specified).
If fewer than min_count instances (if specified) are scanned
a warning is issued.
This is to accommodate nonconformant streams without raising exceptions.
Callers wanting to validate max_count may want to probe bfr.at_eof()
after return.
Callers not wanting a warning over min_count should not specify it,
and instead check the number of instances returned themselves.
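The counting rules above can be sketched as a small generator over fixed size records. This is an illustration only: `scan_sketch` is a hypothetical name, it works on bytes and a struct.Struct rather than a CornuCopyBuffer and an AbstractBinary class, raising ValueError for the conflicting parameters is an assumption, and it simply stops at a trailing partial record.

```python
import struct
import warnings


def scan_sketch(bs, record, count=None, *, min_count=None, max_count=None,
                with_offsets=False):
    """Yield successive records unpacked from bs until the data run out."""
    if count is not None:
        if min_count is not None or max_count is not None:
            raise ValueError(
                "count may not be combined with min_count or max_count"
            )
        min_count = max_count = count
    offset = 0
    scanned = 0
    while ((max_count is None or scanned < max_count)
           and offset + record.size <= len(bs)):
        pre_offset = offset
        values = record.unpack_from(bs, offset)
        offset += record.size
        scanned += 1
        yield (pre_offset, values, offset) if with_offsets else values
    if min_count is not None and scanned < min_count:
        # warn rather than raise, to tolerate nonconformant streams
        warnings.warn(
            "scanned %d records, expected at least %d" % (scanned, min_count)
        )


rec = struct.Struct('>H')
data = b'\x00\x01\x00\x02\x00\x03'
assert list(scan_sketch(data, rec, max_count=2)) == [(1,), (2,)]
assert list(scan_sketch(data, rec, 1, with_offsets=True)) == [(0, (1,), 2)]
```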
AbstractBinary.scan_fspath(fspath: str, *, with_offsets=False, **kw):
Open the file with filesystem path fspath for read
and yield from self.scan(..,**kw) or
self.scan_with_offsets(..,**kw) according to the
with_offsets parameter.
Deprecated; please just call scan with a filesystem pathname.
Parameters:
fspath: the filesystem path of the file to scan
with_offsets: optional flag, default False;
if true then scan with scan_with_offsets instead of
with scan
Other keyword parameters are passed to scan or
scan_with_offsets.
AbstractBinary.scan_with_offsets(bfr: cs.buffer.CornuCopyBuffer, count=None, min_count=None, max_count=None):
Wrapper for scan() which yields (pre_offset,instance,post_offset)
indicating the start and end offsets of the yielded instances.
All parameters are as for scan().
Deprecated; please just call scan with the with_offsets=True parameter.
AbstractBinary.self_check(self, *, field_types=None):
Internal self check. Returns True if passed.
If the structure has a FIELD_TYPES attribute, normally a
class attribute, then check the fields against it.
The FIELD_TYPES attribute is a mapping of field_name to
a specification of required and types. The specification
may take one of 2 forms:
(required,types)
type; this is equivalent to (True,(type,))
Their meanings are as follows:
required: a Boolean. If true, the field must be present
in the packet field_map, otherwise it need not be present.
types: a tuple of acceptable field types
There are some special semantics involved here.
An implementation of a structure may choose to make some
fields plain instance attributes instead of binary objects
in the field_map mapping, particularly variable structures
such as a cs.iso14496.BoxHeader, whose .length may be parsed
directly from its binary form or computed from other fields
depending on the box_size value. Therefore, checking for
a field is first done via the field_map mapping, then by
getattr, and as such the acceptable types may include
nonstructure types such as int.
Here is the cs.iso14496 Box.FIELD_TYPES definition as an example:
FIELD_TYPES = {
    'header': BoxHeader,
    'body': BoxBody,
    'unparsed': list,
    'offset': int,
    'unparsed_offset': int,
    'end_offset': int,
}
Note that length includes some nonstructure types,
and that it is written as a tuple of (True,types) because
it has more than one acceptable type.
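The two specification forms can be illustrated with a small checker. This is a simplified sketch, not the library's self_check: the function name `check_field_types` is hypothetical, it looks fields up with getattr only (the real check consults field_map first), and the `length` entry in the usage is a made-up example of the (required,types) form.

```python
from types import SimpleNamespace


def check_field_types(obj, field_types) -> bool:
    """Check obj's attributes against a FIELD_TYPES style mapping."""
    for field_name, spec in field_types.items():
        if isinstance(spec, tuple):
            required, types = spec           # the (required, types) form
        else:
            required, types = True, (spec,)  # a bare type
        try:
            value = getattr(obj, field_name)
        except AttributeError:
            if required:
                return False
            continue
        if not isinstance(value, types):
            return False
    return True


spec = {
    'offset': int,
    'unparsed': list,
    'length': (False, (int, type(None))),  # optional, 2 acceptable types
}
assert check_field_types(SimpleNamespace(offset=0, unparsed=[]), spec)
assert not check_field_types(SimpleNamespace(offset=0), spec)
```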
AbstractBinary.transcribe(self):
Return or yield bytes, ASCII string, None or iterables
comprising the binary form of this instance.
This aims for maximum convenience when transcribing a data structure.
This may be implemented as a generator, yielding parts of the structure.
This may be implemented as a normal function, returning:
None: no bytes of data,
for example for an omitted or empty structure
a bytes-like object: the full data bytes for the structure
an ASCII compatible string: this is encoded with the 'ascii' encoding to make bytes
an iterable of None, bytes-like objects,
ASCII compatible strings or iterables.
This supports directly returning or yielding the result of a field's
.transcribe method.
AbstractBinary.transcribe_flat(self):
Return a flat iterable of chunks transcribing this field.
AbstractBinary.transcribed_length(self):
Compute the length by running a transcription and measuring it.
AbstractBinary.write(self, file, *, flush=False):
Write this instance to file, a file-like object supporting
.write(bytes) and .flush().
Return the number of bytes written.
Class BinaryBytes(BinarySingleValue): A list of bytes parsed directly from the native iteration of the buffer. Subclasses are initialised with a consume= class parameter indicating how many bytes to consume on parse; the default is ... meaning to consume the entire remaining buffer, but
a positive integer can also be supplied to consume exactly
that many bytes.
BinaryBytes.parse(bfr: cs.buffer.CornuCopyBuffer):
Consume cls.PARSE_SIZE bytes from the buffer and instantiate a new instance.
BinaryBytes.promote(obj):
Promote obj to a BinaryBytes instance.
Other instances of AbstractBinary will be transcribed into the buffers.
Otherwise use BinarySingleValue.promote(obj).
BinaryBytes.transcribe(self):
Transcribe each value.
BinaryBytes.value:
The internal list of bytes instances joined together.
This is a property and may be expensive to compute for a large list.
BinaryFixedBytes(class_name: str, length: int): Factory for an AbstractBinary subclass matching length bytes of data.
The bytes are saved as the attribute .data.
Class BinaryListValues(AbstractBinary): A list of values with a common parse specification,
such as sample or Boxes in an ISO14496 Box structure.
BinaryListValues.parse(bfr: cs.buffer.CornuCopyBuffer, count=None, *, end_offset=None, min_count=None, max_count=None, pt):
Read values from bfr.
Return a BinaryListValues containing the values.
Parameters:
count: optional count of values to read;
if specified, exactly this many values are expected.
end_offset: an optional bounding end offset of the buffer.
min_count: the least acceptable number of values.
max_count: the most acceptable number of values.
pt: a parse/transcribe specification
as accepted by the pt_spec() factory.
The values will be returned by its parse function.
BinaryListValues.transcribe(self):
Transcribe all the values.
BinaryMultiStruct(class_name: str, struct_format: str, field_names: Union[str, List[str]] = 'value'): A class factory for AbstractBinary namedtuple subclasses
built around potentially complex struct formats.
Parameters:
class_name: name for the generated class
struct_format: the struct format string
field_names: optional field name list,
a space separated string or an iterable of strings;
the default is 'value', intended for single field structs
Example:
# an "access point" record from the .ap file
Enigma2APInfo = BinaryStruct('Enigma2APInfo', '>QQ', 'pts offset')
# a "cut" record from the .cuts file
Enigma2Cut = BinaryStruct('Enigma2Cut', '>QL', 'pts type')
>>> UInt16BE = BinaryStruct('UInt16BE', '>H')
>>> UInt16BE.__name__
'UInt16BE'
>>> UInt16BE.format
'>H'
>>> UInt16BE.struct #doctest: +ELLIPSIS
<_struct.Struct object at ...>
>>> field = UInt16BE.from_bytes(bytes((2,3)))
>>> field
UInt16BE('>H',value=515)
>>> field.value
515
BinaryMultiValue(class_name, field_map, field_order=None): Construct a SimpleBinary subclass named class_name
whose fields are specified by the mapping field_map.
The field_map is a mapping of field name
to parse/transcribe specifications suitable for pt_spec();
these are all promoted by pt_spec into AbstractBinary subclasses.
The field_order is an optional ordering of the field names;
the default comes from the iteration order of field_map.
Note for Python <3.6:
if field_order is not specified
it is constructed by iterating over field_map.
Prior to Python 3.6, dicts do not provide a reliable order
and should be accompanied by an explicit field_order.
From 3.6 onward a dict is enough and its insertion order
will dictate the default field_order.
For a fixed record structure
the default .parse and .transcribe methods will suffice;
they parse or transcribe each field in turn.
Subclasses with variable records should override
the .parse and .transcribe methods
accordingly.
If the class has both parse_value and transcribe_value methods
then the value itself will be directly stored.
Otherwise the class is presumed to be a more complex subclass
of AbstractBinary and the instance is stored.
Here is an example exhibiting various ways of defining each field:
n1: defined with the *_value methods of UInt8,
which return or transcribe the int from an unsigned 8 bit value;
this stores a BinarySingleValue whose .value is an int
n2: defined from the UInt8 class,
which parses an unsigned 8 bit value;
this stores an UInt8 instance
(also a BinarySingleValue whose .value is an int)
n3: like n2
data1: defined with the *_value methods of BSData,
which return or transcribe the data bytes
from a run length encoded data chunk;
this stores a BinarySingleValue whose .value is a bytes
data2: defined from the BSData class
which parses a run length encoded data chunk;
this is a BinarySingleValue so we store its bytes value directly.
>>> class BMV(BinaryMultiValue("BMV", {
...         'n1': (UInt8.parse_value, UInt8.transcribe_value),
...         'n2': UInt8,
...         'n3': UInt8,
...         'nd': ('>H4s', 'short bs'),
...         'data1': (
...             BSData.parse_value,
...             BSData.transcribe_value,
...         ),
...         'data2': BSData,
... })):
...     pass
>>> BMV.FIELD_ORDER
['n1', 'n2', 'n3', 'nd', 'data1', 'data2']
>>> bmv = BMV.from_bytes(b'\x11\x22\x77\x81\x82zyxw\x02AB\x04DEFG')
>>> bmv.n1 #doctest: +ELLIPSIS
17
>>> bmv.n2
34
>>> bmv #doctest: +ELLIPSIS
BMV(n1=17, n2=34, n3=119, nd=nd('>H4s',short=33154,bs=b'zyxw'), data1=b'AB', data2=b'DEFG')
>>> bmv.nd #doctest: +ELLIPSIS
nd('>H4s',short=33154,bs=b'zyxw')
>>> bmv.nd.bs
b'zyxw'
>>> bytes(bmv.nd)
b'\x81\x82zyxw'
>>> bmv.data1
b'AB'
>>> bmv.data2
b'DEFG'
>>> bytes(bmv)
b'\x11"w\x81\x82zyxw\x02AB\x04DEFG'
>>> list(bmv.transcribe_flat())
[b'\x11', b'"', b'w', b'\x81\x82zyxw', b'\x02', b'AB', b'\x04', b'DEFG']
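To make the byte layout in this example concrete, the same 17 bytes can be decoded by hand with the standard struct module. This sketch does not use cs.binary; `read_bsuint` and `read_bsdata` are hypothetical helpers which assume the length prefix follows the BSUInt scheme (7 bits per octet, high bit set on continuation octets).

```python
import struct


def read_bsuint(bs: bytes, offset: int):
    """Read a BSUInt style length, return (value, post_offset)."""
    n = 0
    while True:
        b = bs[offset]
        offset += 1
        n = (n << 7) | (b & 0x7f)
        if not b & 0x80:
            return n, offset


def read_bsdata(bs: bytes, offset: int):
    """Read a length prefixed data chunk, return (data, post_offset)."""
    length, offset = read_bsuint(bs, offset)
    return bs[offset:offset + length], offset + length


bs = b'\x11\x22\x77\x81\x82zyxw\x02AB\x04DEFG'
n1, n2, n3 = struct.unpack_from('BBB', bs, 0)     # three single bytes
short, chunk = struct.unpack_from('>H4s', bs, 3)  # the 'nd' field
data1, offset = read_bsdata(bs, 9)
data2, offset = read_bsdata(bs, offset)

assert (n1, n2, n3) == (17, 34, 119)
assert (short, chunk) == (33154, b'zyxw')
assert (data1, data2) == (b'AB', b'DEFG')
assert offset == len(bs)
```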
BinarySingleStruct(class_name: str, struct_format: str, field_names: Union[str, List[str]] = 'value'): OBSOLETE BinaryStruct
A class factory for AbstractBinary namedtuple subclasses
built around potentially complex struct formats.
Parameters:
class_name: name for the generated class
struct_format: the struct format string
field_names: optional field name list,
a space separated string or an iterable of strings;
the default is 'value', intended for single field structs
Example:
# an "access point" record from the .ap file
Enigma2APInfo = BinaryStruct('Enigma2APInfo', '>QQ', 'pts offset')
# a "cut" record from the .cuts file
Enigma2Cut = BinaryStruct('Enigma2Cut', '>QL', 'pts type')
>>> UInt16BE = BinaryStruct('UInt16BE', '>H')
>>> UInt16BE.__name__
'UInt16BE'
>>> UInt16BE.format
'>H'
>>> UInt16BE.struct #doctest: +ELLIPSIS
<_struct.Struct object at ...>
>>> field = UInt16BE.from_bytes(bytes((2,3)))
>>> field
UInt16BE('>H',value=515)
>>> field.value
515
Class BinarySingleValue(AbstractBinary): A representation of a single value as the attribute .value.
Subclasses must implement:
parse or parse_value
transcribe or transcribe_value
BinarySingleValue.__init__(self, value):
Initialise self with value.
BinarySingleValue.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse an instance from bfr.
Subclasses must implement this method or parse_value.
BinarySingleValue.parse_value(bfr: cs.buffer.CornuCopyBuffer):
Parse a value from bfr based on this class.
Subclasses must implement this method or parse.
BinarySingleValue.parse_value_from_bytes(bs, offset=0, length=None, **kw):
Parse a value from the bytes bs based on this class.
Return (value,offset).
BinarySingleValue.scan_values(bfr: cs.buffer.CornuCopyBuffer, **kw):
Scan bfr, yield values.
BinarySingleValue.transcribe(self):
Transcribe this instance as bytes.
Subclasses must implement this method or transcribe_value.
BinarySingleValue.transcribe_value(value):
Transcribe value as bytes based on this class.
Subclasses must implement this method or transcribe.
BinarySingleValue.value_from_bytes(bs, **from_bytes_kw):
Decode an instance from bs using .from_bytes
and return the .value attribute.
Keyword arguments are passed to cls.from_bytes.
BinaryStruct(class_name: str, struct_format: str, field_names: Union[str, List[str]] = 'value'): OBSOLETE BinaryStruct
A class factory for AbstractBinary namedtuple subclasses
built around potentially complex struct formats.
Parameters:
class_name: name for the generated class
struct_format: the struct format string
field_names: optional field name list,
a space separated string or an iterable of strings;
the default is 'value', intended for single field structs
Example:
# an "access point" record from the .ap file
Enigma2APInfo = BinaryStruct('Enigma2APInfo', '>QQ', 'pts offset')
# a "cut" record from the .cuts file
Enigma2Cut = BinaryStruct('Enigma2Cut', '>QL', 'pts type')
>>> UInt16BE = BinaryStruct('UInt16BE', '>H')
>>> UInt16BE.__name__
'UInt16BE'
>>> UInt16BE.format
'>H'
>>> UInt16BE.struct #doctest: +ELLIPSIS
<_struct.Struct object at ...>
>>> field = UInt16BE.from_bytes(bytes((2,3)))
>>> field
UInt16BE('>H',value=515)
>>> field.value
515
Class BinaryUTF16NUL(BinarySingleValue): A NUL terminated UTF-16 string.
BinaryUTF16NUL.__init__(self, value: str, *, encoding: str):
BinaryUTF16NUL.VALUE_TYPE
BinaryUTF16NUL.parse(bfr: cs.buffer.CornuCopyBuffer, *, encoding: str):
Parse the encoding and value and construct an instance.
BinaryUTF16NUL.parse_value(bfr: cs.buffer.CornuCopyBuffer, *, encoding: str) -> str:
Read a NUL terminated UTF-16 string from bfr, return a UTF16NULField.
The mandatory parameter encoding specifies the UTF16 encoding to use
('utf_16_be' or 'utf_16_le').
BinaryUTF16NUL.transcribe(self):
Transcribe self.value in UTF-16 with a terminating NUL.
BinaryUTF16NUL.transcribe_value(value: str, encoding='utf-16'):
Transcribe value in UTF-16 with a terminating NUL.
Class BinaryUTF8NUL(BinarySingleValue): A NUL terminated UTF-8 string.
BinaryUTF8NUL.VALUE_TYPE
BinaryUTF8NUL.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> str:
Read a NUL terminated UTF-8 string from bfr, return field.
BinaryUTF8NUL.transcribe_value(s):
Transcribe the value in UTF-8 with a terminating NUL.
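A NUL terminated UTF-8 string is simple enough to sketch against plain bytes. The function names below are hypothetical and work on bytes rather than a CornuCopyBuffer; raising EOFError for a missing terminator and ValueError for an embedded NUL are assumptions, not documented behaviour.

```python
def parse_utf8nul(bs: bytes, offset: int = 0):
    """Decode a NUL terminated UTF-8 string, return (text, post_offset)."""
    end = bs.find(b'\0', offset)
    if end < 0:
        raise EOFError("no terminating NUL found")
    return bs[offset:end].decode('utf-8'), end + 1


def transcribe_utf8nul(text: str) -> bytes:
    """Encode text as UTF-8 followed by a single NUL byte."""
    if '\0' in text:
        raise ValueError("text may not contain a NUL")
    return text.encode('utf-8') + b'\0'


assert parse_utf8nul(b'abc\0def\0') == ('abc', 4)
assert parse_utf8nul(b'abc\0def\0', 4) == ('def', 8)
assert transcribe_utf8nul('abc') == b'abc\0'
```

The UTF-16 variant cannot use the same byte search, because its terminator is a two byte NUL which must fall on a code unit boundary.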
binclass(*da, **dkw): A decorator for dataclass-like binary classes.
Example use:
>>> @binclass
... class SomeStruct:
... """A struct containing a count and some flags."""
... count : UInt32BE
... flags : UInt8
>>> ss = SomeStruct(count=3, flags=0x04)
>>> ss
SomeStruct:SomeStruct__dataclass(count=UInt32BE('>L',value=3),flags=UInt8('B',value=4))
>>> print(ss)
SomeStruct(count=3,flags=4)
>>> bytes(ss)
b'\x00\x00\x00\x03\x04'
>>> SomeStruct.promote(b'\x00\x00\x00\x03\x04')
SomeStruct:SomeStruct__dataclass(count=UInt32BE('>L',value=3),flags=UInt8('B',value=4))
Extending an existing @binclass class, for example to add
the body of a structure to some header part:
>>> @binclass
... class HeaderStruct:
... """A header containing a count and some flags."""
... count : UInt32BE
... flags : UInt8
>>> @binclass
... class Packet(HeaderStruct):
... body_text : BSString
... body_data : BSData
... body_longs : BinaryStruct(
... 'longs', '>LL', 'long1 long2'
... )
>>> packet = Packet(
... count=5, flags=0x03,
... body_text="hello",
... body_data=b'xyzabc',
... body_longs=(10,20),
... )
>>> packet
Packet:Packet__dataclass(count=UInt32BE('>L',value=5),flags=UInt8('B',value=3),body_text=BSString('hello'),body_data=BSData(b'xyzabc'),body_longs=longs('>LL',long1=10,long2=20))
>>> print(packet)
Packet(count=5,flags=3,body_text=hello,body_data=b'xyzabc',body_longs=longs(long1=10,long2=20))
>>> packet.body_data
b'xyzabc'
Class bs(builtins.bytes): A bytes subclass with a compact repr().
bs.join(self, chunks):
bytes.join but returning a bs.
bs.promote(obj):
Promote bytes or memoryview to a bs.
Class BSData(BinarySingleValue): A run length encoded data chunk, with the length encoded as a BSUInt.
BSData.data:
An alias for the .value attribute.
BSData.data_offset:
The length of the length indicator,
useful for computing the location of the raw data.
BSData.data_offset_for(bs) -> int:
Compute the data_offset which would obtain for the bytes bs.
BSData.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> bytes:
Parse the data from bfr.
BSData.transcribe_value(data):
Transcribe the payload length and then the payload.
Class BSSFloat(BinarySingleValue): A float transcribed as a BSString of str(float).
BSSFloat.VALUE_TYPE
BSSFloat.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> float:
Parse a BSSFloat from a buffer and return the float.
BSSFloat.transcribe_value(f):
Transcribe a float.
Class BSString(BinarySingleValue): A run length encoded string, with the length encoded as a BSUInt.
BSString.VALUE_TYPE
BSString.parse_value(bfr: cs.buffer.CornuCopyBuffer, encoding='utf-8', errors='strict') -> str:
Parse a run length encoded string from bfr.
BSString.transcribe_value(value: str, encoding='utf-8'):
Transcribe a string.
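The length prefixed string and float transcriptions described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that the length prefix counts encoded bytes rather than characters and uses the BSUInt scheme; check the library source before relying on either detail.

```python
def encode_length(n: int) -> bytes:
    """BSUInt style length: 7 bits per octet, high bit marks continuation."""
    octets = [n & 0x7f]
    n >>= 7
    while n > 0:
        octets.append(0x80 | (n & 0x7f))
        n >>= 7
    return bytes(reversed(octets))


def transcribe_string(text: str, encoding: str = 'utf-8') -> bytes:
    """Transcribe the encoded length (assumed: in bytes) then the payload."""
    payload = text.encode(encoding)
    return encode_length(len(payload)) + payload


def transcribe_float(f: float) -> bytes:
    """Transcribe a float as the string form str(f)."""
    return transcribe_string(str(f))


assert transcribe_string('hello') == b'\x05hello'
assert transcribe_float(1.5) == b'\x031.5'
```

Storing floats as their str() form trades compactness for an exact, human readable round trip.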
Class BSUInt(BinarySingleValue): A binary serialised unsigned int.
This uses a big endian byte encoding where continuation octets have their high bit set. The bits contributing to the value are in the low order 7 bits.
BSUInt.VALUE_TYPE
BSUInt.decode_bytes(data, offset=0) -> Tuple[int, int]:
Decode an extensible byte serialised unsigned int from data at offset.
Return value and new offset.
Continuation octets have their high bit set. The octets are big-endian.
If you just have a bytes instance, this is the go. If you're
reading from a stream you're better off with parse or parse_value.
Examples:
>>> BSUInt.decode_bytes(b'\0')
(0, 1)
Note: there is of course the usual AbstractBinary.parse_bytes
but that constructs a buffer to obtain the individual bytes;
this static method will be more performant
if all you are doing is reading this serialisation
and do not already have a buffer.
BSUInt.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse an extensible byte serialised unsigned int from a buffer.
Continuation octets have their high bit set. The value is big-endian.
This is the go for reading from a stream. If you already have
a bare bytes instance then the .decode_bytes static method
is probably most efficient;
there is of course the usual AbstractBinary.parse_bytes
but that constructs a buffer to obtain the individual bytes.
BSUInt.transcribe_value(n):
Encode an unsigned int as an extensible byte serialised octet
sequence for decode. Return the bytes object.
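As an illustration of the encoding described for BSUInt, here is a minimal standard library sketch of an encoder and decoder. The function names `encode_bsuint` and `decode_bsuint` are hypothetical and are not part of the module; the code only assumes the rules stated above (big endian, continuation octets have their high bit set, the value lives in the low 7 bits of each octet).

```python
from typing import Tuple


def encode_bsuint(n: int) -> bytes:
    """Encode a nonnegative int as big endian 7 bit groups (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be >= 0")
    octets = [n & 0x7F]  # final octet: high bit clear
    n >>= 7
    while n > 0:
        octets.append(0x80 | (n & 0x7F))  # continuation octet: high bit set
        n >>= 7
    return bytes(reversed(octets))


def decode_bsuint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a value from data at offset; return (value, new_offset)."""
    n = 0
    while True:
        octet = data[offset]
        offset += 1
        n = (n << 7) | (octet & 0x7F)
        if not octet & 0x80:
            return n, offset


encoded = encode_bsuint(300)
decoded = decode_bsuint(b"junk" + encoded, offset=4)
```

Values below 128 occupy a single octet, so small lengths and counts cost one byte each.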
flatten(transcription) -> Iterable[bytes]: Flatten transcription into an iterable of Buffers.
None of the Buffers will be empty.
This exists to allow subclass methods to easily return
transcribable things (having a .transcribe method), ASCII
strings or bytes or iterables or even None, in turn allowing
them simply to return their superclass' chunks iterators
directly instead of having to unpack them.
The supplied transcription may be any of the following:
None: yield nothing
an object with a .transcribe method: yield from
flatten(transcription.transcribe())
Buffer: yield the Buffer if it is not empty
str: yield transcription.encode('ascii')
an iterable: yield from flatten(item) for each item in transcription
An example from the cs.iso14496.METABoxBody class:
    def transcribe(self):
        yield super().transcribe()
        yield self.theHandler
        yield self.boxes
The binary classes flatten the result of the .transcribe
method to obtain bytes instances for the object's binary
transcription.
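The flattening rules above can be sketched as a small recursive generator. This is an assumption-laden illustration rather than the module's code: `flatten_sketch` and `Leaf` are hypothetical names, "Buffer" is approximated as `bytes`, `bytearray` or `memoryview`, and empty strings are skipped so that no empty chunk is ever yielded.

```python
from typing import Iterable


def flatten_sketch(transcription) -> Iterable[bytes]:
    """Yield nonempty bytes-like chunks from a nested transcription."""
    if transcription is None:
        return
    if hasattr(transcription, "transcribe"):
        # A transcribable object: flatten whatever it transcribes to.
        yield from flatten_sketch(transcription.transcribe())
    elif isinstance(transcription, (bytes, bytearray, memoryview)):
        if len(transcription) > 0:
            yield transcription
    elif isinstance(transcription, str):
        # Must be tested before the generic iterable case.
        if transcription:
            yield transcription.encode("ascii")
    else:
        for item in transcription:
            yield from flatten_sketch(item)


class Leaf:
    """Hypothetical transcribable object used only for the demonstration."""

    def transcribe(self):
        return [b"gh", None]


chunks = list(flatten_sketch([None, b"ab", "cd", [b"", [b"ef"]], Leaf()]))
```

Because nesting is resolved recursively, a `.transcribe` method can yield its superclass' transcription as a single item, as in the METABoxBody example.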
Class Float64BE(Float64BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>d' and presents the attributes ['value'].
Float64BE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
Float64BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> float:
Parse a value from bfr, return the value.
Float64BE.promote(obj):
Promote a single value to an instance of cls.
Float64BE.transcribe(self):
Transcribe via struct.pack.
Float64BE.transcribe_value(value):
Transcribe a value back into bytes.
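The struct based classes listed in this section all follow the same shape: a namedtuple whose fields are parsed with `struct.unpack` and transcribed with `struct.pack`. A rough standard library sketch of that shape follows; `Float64BESketch` and its `parse_bytes` signature are hypothetical and do not reproduce the module's `AbstractBinary` machinery or its buffer type.

```python
import struct
from collections import namedtuple


class Float64BESketch(namedtuple("Float64BESketch", "value")):
    """A namedtuple bound to the struct format '>d' (illustrative only)."""

    __slots__ = ()
    STRUCT = struct.Struct(">d")

    @classmethod
    def parse_bytes(cls, data, offset=0):
        """Parse an instance from data at offset; return (instance, new_offset)."""
        fields = cls.STRUCT.unpack_from(data, offset)
        return cls(*fields), offset + cls.STRUCT.size

    def transcribe(self):
        """Transcribe via struct.pack."""
        return self.STRUCT.pack(*self)

    def __bytes__(self):
        return self.transcribe()


encoded = bytes(Float64BESketch(1.5))
parsed, end_offset = Float64BESketch.parse_bytes(encoded)
```

Swapping the format string and field names gives the other variants, for example '<H' for a little endian unsigned 16 bit value.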
Class Float64LE(Float64LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<d' and presents the attributes ['value'].
Float64LE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
Float64LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> float:
Parse a value from bfr, return the value.
Float64LE.promote(obj):
Promote a single value to an instance of cls.
Float64LE.transcribe(self):
Transcribe via struct.pack.
Float64LE.transcribe_value(value):
Transcribe a value back into bytes.
Class Int16BE(Int16BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>h' and presents the attributes ['value'].
Int16BE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
Int16BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
Int16BE.promote(obj):
Promote a single value to an instance of cls.
Int16BE.transcribe(self):
Transcribe via struct.pack.
Int16BE.transcribe_value(value):
Transcribe a value back into bytes.
Class Int16LE(Int16LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<h' and presents the attributes ['value'].
Int16LE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
Int16LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
Int16LE.promote(obj):
Promote a single value to an instance of cls.
Int16LE.transcribe(self):
Transcribe via struct.pack.
Int16LE.transcribe_value(value):
Transcribe a value back into bytes.
Class Int32BE(Int32BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>l' and presents the attributes ['value'].
Int32BE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
Int32BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
Int32BE.promote(obj):
Promote a single value to an instance of cls.
Int32BE.transcribe(self):
Transcribe via struct.pack.
Int32BE.transcribe_value(value):
Transcribe a value back into bytes.
Class Int32LE(Int32LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<l' and presents the attributes ['value'].
Int32LE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
Int32LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
Int32LE.promote(obj):
Promote a single value to an instance of cls.
Int32LE.transcribe(self):
Transcribe via struct.pack.
Int32LE.transcribe_value(value):
Transcribe a value back into bytes.
is_single_value(obj): Test whether obj is a single value binary object.
This currently recognises BinarySingleValue instances
and tuple based AbstractBinary instances of length 1.
parse_offsets(*da, **dkw): Decorate parse (usually an AbstractBinary class method)
to record the buffer starting offset as self.offset
and the buffer post parse offset as self.end_offset.
If the decorator parameter report is true,
call bfr.report_offset() with the starting offset at the end of the parse.
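A decorator of this kind can be sketched as follows. This is a simplified illustration, not the module's decorator: `record_offsets`, `TinyBuffer` and `Pair` are hypothetical names, the buffer is assumed to expose its current position as `.offset`, and the `report` parameter is omitted.

```python
import functools


class TinyBuffer:
    """Hypothetical stand-in for a buffer which tracks its read offset."""

    def __init__(self, data):
        self.data = data
        self.offset = 0

    def take(self, size):
        chunk = self.data[self.offset:self.offset + size]
        if len(chunk) != size:
            raise EOFError("insufficient data")
        self.offset += size
        return chunk


def record_offsets(parse):
    """Wrap a parse(cls, bfr) function to record start and end offsets."""

    @functools.wraps(parse)
    def parse_wrapper(cls, bfr, **kw):
        start_offset = bfr.offset
        self = parse(cls, bfr, **kw)
        self.offset = start_offset     # where the parse began
        self.end_offset = bfr.offset   # where the parse finished
        return self

    return parse_wrapper


class Pair:
    def __init__(self, first, second):
        self.first = first
        self.second = second

    @classmethod
    @record_offsets
    def parse(cls, bfr):
        first, second = bfr.take(2)
        return cls(first, second)


bfr = TinyBuffer(b"\x00\x01\x02\x03")
bfr.take(1)  # skip one byte so the parse starts at offset 1
pair = Pair.parse(bfr)
```

With both offsets recorded, `pair.end_offset - pair.offset` gives the number of bytes the parse consumed.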
pt_spec(pt, name=None, value_type=None, as_repr=None, as_str=None): Convert a parse/transcribe specification pt
into an AbstractBinary subclass.
This is largely used to provide flexibility
in the specifications for the BinaryMultiValue factory
but can also be used as a factory for other simple classes.
If the specification pt is a subclass of AbstractBinary
this is returned directly.
If pt is a (str,str) 2-tuple
the values are presumed to be a format string for struct.Struct
and field names separated by spaces;
a new BinaryStruct class is created from these and returned.
Otherwise two functions
f_parse_value(bfr) and f_transcribe_value(value)
are obtained and used to construct a new BinarySingleValue class
as follows:
If pt has .parse_value and .transcribe_value callable attributes,
use those for f_parse_value and f_transcribe_value respectively.
Otherwise, if pt is an int
define f_parse_value to obtain exactly that many bytes from a buffer
and f_transcribe_value to return those bytes directly.
Otherwise presume pt is a 2-tuple of (f_parse_value,f_transcribe_value).
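The dispatch order described above can be sketched with the standard library. This illustration makes several assumptions: `spec_functions` and `read_exactly` are hypothetical names, the function returns a `(f_parse_value, f_transcribe_value)` pair in every case instead of building a class, the `AbstractBinary` subclass case is left out, and a binary file object such as `io.BytesIO` stands in for the buffer.

```python
import io
import struct
from collections import namedtuple


def read_exactly(f, size):
    """Read exactly size bytes from the binary file f or raise EOFError."""
    data = f.read(size)
    if len(data) != size:
        raise EOFError("insufficient data")
    return data


def spec_functions(pt):
    """Return (f_parse_value, f_transcribe_value) for the specification pt."""
    # (str, str): a struct format and space separated field names.
    if (isinstance(pt, tuple) and len(pt) == 2
            and all(isinstance(s, str) for s in pt)):
        struct_format, field_names = pt
        packer = struct.Struct(struct_format)
        fields = namedtuple("StructFields", field_names)
        return (
            lambda f: fields(*packer.unpack(read_exactly(f, packer.size))),
            lambda value: packer.pack(*value),
        )
    # An object which already supplies both callables.
    if (callable(getattr(pt, "parse_value", None))
            and callable(getattr(pt, "transcribe_value", None))):
        return pt.parse_value, pt.transcribe_value
    # An int: a fixed length run of bytes, transcribed unchanged.
    if isinstance(pt, int):
        return (lambda f: read_exactly(f, pt)), (lambda value: value)
    # Otherwise presume a 2-tuple of functions.
    f_parse_value, f_transcribe_value = pt
    return f_parse_value, f_transcribe_value


parse_longs, transcribe_longs = spec_functions((">LL", "long1 long2"))
longs = parse_longs(io.BytesIO(b"\x00\x00\x00\x0a\x00\x00\x00\x14"))
parse4, transcribe4 = spec_functions(4)
```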
Class SimpleBinary(types.SimpleNamespace, AbstractBinary): Abstract binary class based on a SimpleNamespace, thus providing a nice __str__ and a keyword based __init__. Implementors must still define .parse and .transcribe.
To constrain the arguments passed to __init__,
define an __init__ which accepts specific keyword arguments
and pass through to super().__init__(). Example:
    def __init__(self, *, field1=None, field2):
        """ Accept only `field1` (optional)
            and `field2` (mandatory).
        """
        super().__init__(field1=field1, field2=field2)
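To show how the pieces fit together, here is a standard library sketch of a namespace based record with a constrained `__init__` and its own parse and transcribe methods. `HeaderSketch`, its fields and its `parse_bytes` method are hypothetical; the class subclasses only `types.SimpleNamespace`, so the `AbstractBinary` behaviour is not reproduced.

```python
import struct
from types import SimpleNamespace


class HeaderSketch(SimpleNamespace):
    """A one byte version followed by a big endian 16 bit length."""

    STRUCT = struct.Struct(">BH")

    def __init__(self, *, version=1, length):
        # Accept only `version` (optional) and `length` (mandatory).
        super().__init__(version=version, length=length)

    @classmethod
    def parse_bytes(cls, data):
        version, length = cls.STRUCT.unpack_from(data)
        return cls(version=version, length=length)

    def transcribe(self):
        return self.STRUCT.pack(self.version, self.length)

    def __bytes__(self):
        return self.transcribe()


header = HeaderSketch(length=5)
encoded = bytes(header)
roundtrip = HeaderSketch.parse_bytes(encoded)
```

Passing an unexpected keyword such as `HeaderSketch(length=5, extra=1)` raises `TypeError`, which is the point of defining the explicit `__init__`.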
struct_field_types(struct_format: str, field_names: Union[str, Iterable[str]]) -> Mapping[str, type]: Construct a dict mapping field names to struct return types.
Example:
>>> struct_field_types('>Hs', 'count text_bs')
{'count': <class 'int'>, 'text_bs': <class 'bytes'>}
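One way such a mapping could be derived, shown here as a sketch and not necessarily how the module does it, is to unpack a block of zero bytes of the right size and inspect the types of the results. The name `field_types_sketch` is hypothetical.

```python
import struct
from typing import Iterable, Mapping, Union


def field_types_sketch(
    struct_format: str,
    field_names: Union[str, Iterable[str]],
) -> Mapping[str, type]:
    """Map field names to the Python types struct returns for struct_format."""
    if isinstance(field_names, str):
        field_names = field_names.split()
    names = list(field_names)
    # Unpacking zero bytes yields one value of the right type per field.
    values = struct.unpack(struct_format, bytes(struct.calcsize(struct_format)))
    if len(values) != len(names):
        raise ValueError(
            "%d field names for %d struct fields" % (len(names), len(values))
        )
    return {name: type(value) for name, value in zip(names, values)}


types_found = field_types_sketch(">Hs", "count text_bs")
```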
Class UInt16BE(UInt16BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>H' and presents the attributes ['value'].
UInt16BE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
UInt16BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
UInt16BE.promote(obj):
Promote a single value to an instance of cls.
UInt16BE.transcribe(self):
Transcribe via struct.pack.
UInt16BE.transcribe_value(value):
Transcribe a value back into bytes.
Class UInt16LE(UInt16LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<H' and presents the attributes ['value'].
UInt16LE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
UInt16LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
UInt16LE.promote(obj):
Promote a single value to an instance of cls.
UInt16LE.transcribe(self):
Transcribe via struct.pack.
UInt16LE.transcribe_value(value):
Transcribe a value back into bytes.
Class UInt32BE(UInt32BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>L' and presents the attributes ['value'].
UInt32BE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
UInt32BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
UInt32BE.promote(obj):
Promote a single value to an instance of cls.
UInt32BE.transcribe(self):
Transcribe via struct.pack.
UInt32BE.transcribe_value(value):
Transcribe a value back into bytes.
Class UInt32LE(UInt32LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<L' and presents the attributes ['value'].
UInt32LE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
UInt32LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
UInt32LE.promote(obj):
Promote a single value to an instance of cls.
UInt32LE.transcribe(self):
Transcribe via struct.pack.
UInt32LE.transcribe_value(value):
Transcribe a value back into bytes.
Class UInt64BE(UInt64BE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '>Q' and presents the attributes ['value'].
UInt64BE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
UInt64BE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
UInt64BE.promote(obj):
Promote a single value to an instance of cls.
UInt64BE.transcribe(self):
Transcribe via struct.pack.
UInt64BE.transcribe_value(value):
Transcribe a value back into bytes.
Class UInt64LE(UInt64LE, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format '<Q' and presents the attributes ['value'].
UInt64LE.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
UInt64LE.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
UInt64LE.promote(obj):
Promote a single value to an instance of cls.
UInt64LE.transcribe(self):
Transcribe via struct.pack.
UInt64LE.transcribe_value(value):
Transcribe a value back into bytes.
Class UInt8(UInt8, AbstractBinary): An AbstractBinary namedtuple which parses and transcribes the struct format 'B' and presents the attributes ['value'].
UInt8.parse(bfr: cs.buffer.CornuCopyBuffer):
Parse from bfr via struct.unpack.
UInt8.parse_value(bfr: cs.buffer.CornuCopyBuffer) -> int:
Parse a value from bfr, return the value.
UInt8.promote(obj):
Promote a single value to an instance of cls.
UInt8.transcribe(self):
Transcribe via struct.pack.
UInt8.transcribe_value(value):
Transcribe a value back into bytes.
Release 20250501:
Release 20240630:
Release 20240422: New _BinaryMultiValue_Base.for_json() method returning a dict containing the fields.
Release 20240316: Fixed release upload artifacts.
Release 20240201: BREAKING CHANGE: drop the long deprecated PacketField related classes.
Release 20231129: BinaryMultiStruct.parse: promote the buffer arguments to a CornuCopyBuffer.
Release 20230401:
bfr parameter may be any object acceptable to CornuCopyBuffer.promote.
Release 20230212:
Release 20221206: Documentation fix.
Release 20220605: BinaryMixin: replace scan_file with scan_fspath, as the former left uncertainty about the amount of the file consumed.
Release 20210316:
Release 20210306: MAJOR RELEASE: The PacketField classes and friends were hard to use; this release supplied a suite of easier to use and more consistent Binary* classes, and ports most of those things based on the old scheme to the new scheme.
Release 20200229:
.length attribute to struct based packet classes providing the data length of the structure (struct.Struct.size).
add_deferred_field method to consume the raw data for a field for parsing later (done automatically if the attribute is accessed).
@deferred_field decorator for the parser for that stashed data.
Release 20191230.3: Docstring tweak.
Release 20191230.2: Documentation updates.
Release 20191230.1: Docstring updates. Semantic changes were in the previous release.
Release 20191230:
skip_fields parameter to omit some field names.
.transcribe_value method which makes a new instance and calls its .transcribe method.
Release 20190220:
Release 20181231: flatten: do not yield zero length bytelike objects, can be misread as EOF on some streams.
Release 20181108:
.value attribute until end of input.
Release 20180823:
Release 20180810.2: Documentation improvements.
Release 20180810.1: Improve module description.
Release 20180810: BytesesField.from_buffer: make use of the buffer's skipto method if discard_data is true.
Release 20180805:
Release 20180801: Initial PyPI release.