
pip install data2objects
or just copy data2objects.py into your project.
The best way to explain the use of data2objects is via an example. Consider the following config.yaml file:
    backbone:
      activation: +torch.nn.SiLU()
      hidden_size: 1024
    readout:
      +torch.nn.Linear:
        in_features: =/backbone/hidden_size
        out_features: 1
Parsing this file using data2objects.from_yaml returns the following:
>>> import data2objects
>>> config = data2objects.from_yaml("config.yaml")
>>> print(config)
{'backbone': {'activation': SiLU(), 'hidden_size': 1024},
'readout': Linear(in_features=1024, out_features=1, bias=True)}
Under the hood, data2objects has done the following:
- found all strings starting with "=" and replaced them with the corresponding values in the nested data structure: =/backbone/hidden_size was replaced with 1024.
- found all strings starting with "+", imported the corresponding objects from the provided modules and:
    - for strings ending with "()", called the imported object with no arguments, i.e. "+torch.nn.SiLU()" created a SiLU object.
    - for keys of single-item mappings, called the imported object with the mapping's value as arguments: "+torch.nn.Linear: {in_features: =/backbone/hidden_size, out_features: 1}" created a Linear object with in_features=1024 and out_features=1.

data2objects exposes two functions, from_dict and from_yaml, which can be used to transform a nested data structure into a set of instantiated Python objects.
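To make the "+" step more concrete, here is a minimal sketch of how such a string could be turned into a Python object with the standard library's importlib. It is not the data2objects implementation: import_object is a hypothetical helper, and the lookup order for unqualified names (provided modules first, then builtins) is an assumption.

```python
import builtins
import importlib
import pathlib


def import_object(spec, modules=()):
    """Resolve a "+"-prefixed string to a Python object (hypothetical helper)."""
    name = spec.removeprefix("+")
    call = name.endswith("()")
    if call:
        name = name[:-2]

    if "." in name:
        # Fully qualified name: import the module, then fetch the attribute.
        module_name, _, attr = name.rpartition(".")
        obj = getattr(importlib.import_module(module_name), attr)
    else:
        # Unqualified name: assumed lookup order is provided modules, then builtins.
        for module in (*modules, builtins):
            if hasattr(module, name):
                obj = getattr(module, name)
                break
        else:
            raise ImportError(f"cannot resolve {name!r}")

    # Trailing "()" means: call the object with no arguments.
    return obj() if call else obj


print(import_object("+Path", modules=[pathlib]))  # the Path class, not an instance
print(import_object("+tuple"))                    # the built-in tuple type
print(import_object("+fractions.Fraction()"))     # an instance, created with no arguments
```

The examples use only standard-library names so the sketch runs without torch installed.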
from_yaml
def from_yaml(thing: str | Path, modules: list[object] | None = None) -> dict:
Load a nested dictionary from a yaml file or string, and parse it using data2objects.from_dict.

If thing points to an existing file, the data in the file is loaded. Otherwise, the string is treated as containing the raw yaml data.

Parameters
thing: str | Path
    The yaml file or string to load.
modules: list[object] | None
    A list of modules to look up non-fully qualified names in.

Returns
dict
    The transformed data.
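The file-or-string behaviour described above can be sketched with pathlib alone. This is an illustration under assumptions, not the library's code: read_yaml_source is a hypothetical helper that only returns the raw YAML text, which would then be handed to a YAML parser and on to from_dict.

```python
import tempfile
from pathlib import Path


def read_yaml_source(thing):
    """Return raw YAML text from a file path or a raw string (hypothetical helper)."""
    try:
        path = Path(thing)
        if path.is_file():
            # `thing` names an existing file: load its contents.
            return path.read_text()
    except OSError:
        # Long or multi-line YAML strings may not be valid file names.
        pass
    # Otherwise treat `thing` itself as the raw YAML data.
    return str(thing)


# Usage: the same call works for a file on disk and for an inline string.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.yaml"
    config_path.write_text("hidden_size: 1024\n")
    from_file = read_yaml_source(config_path)

from_string = read_yaml_source("hidden_size: 1024\n")
print(from_file == from_string)
```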
from_dict
def from_dict(
data: dict[K, V], modules: list[object] | None = None
) -> dict[K, V | Any]:
Transform a nested data structure into instantiated Python objects. This function recursively processes the input data, and applies the following special handling to any str objects:

Reference handling:
Any leaf nodes within data that are strings and start with "=" are interpreted as references to other parts of data. The syntax for these references follows the same rules as unix paths:
- "=/path": resolve path relative to the root of the data structure.
- "=./path": resolve path relative to the current working directory.
- "=../path": resolve path relative to the parent of the current working directory.

Object instantiation:
The following handling is applied to any str objects found within data (either as a key or value) that start with "+":
- attempt to import the Python object specified by the string: e.g. the string "+torch.nn.Tanh" will be converted to the Tanh class (not an instance) from the torch.nn module. If the string is not an absolute path (i.e. does not contain any dots), we attempt to import it from the Python standard library, or any of the provided modules:
    - "+Path" with modules=[pathlib] will be converted to the Path class from the pathlib module.
    - "+tuple" will be converted to the tuple type.
- if the string ends with "()", the resulting object is called with no arguments, e.g. "+my_module.MyClass()" will be converted to an instance of MyClass from my_module. This is equivalent to +my_module.MyClass: {} (see below).
- if the string is found as a key in a mapping with exactly one key-value pair, then:
    - if the value is itself a mapping, the single-item mapping is replaced with the result of calling the imported object with the recursively instantiated values as keyword arguments.
    - otherwise, the single-item mapping is replaced with the result of calling the imported object with the instantiated value as a single positional argument.
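The single-item mapping rule is easiest to see in code. The following is a rough sketch of the rules listed above, not the actual data2objects source: resolve and instantiate are hypothetical names, unqualified names are looked up in builtins only, and a value counts as "a mapping" if it is still a mapping after its own contents have been processed (an assumption).

```python
import builtins
import importlib
from collections import OrderedDict
from collections.abc import Mapping
from datetime import timedelta


def resolve(name):
    """Import a fully qualified name, or fall back to builtins (hypothetical helper)."""
    module_name, _, attr = name.rpartition(".")
    module = importlib.import_module(module_name) if module_name else builtins
    return getattr(module, attr)


def instantiate(data):
    """Recursively apply the "+" rules to nested mappings and lists."""
    if isinstance(data, str) and data.startswith("+"):
        if data.endswith("()"):
            return resolve(data[1:-2])()  # call with no arguments
        return resolve(data[1:])          # return the object itself
    if isinstance(data, list):
        return [instantiate(item) for item in data]
    if isinstance(data, Mapping):
        items = [(instantiate(k), instantiate(v)) for k, v in data.items()]
        raw_key = next(iter(data), None)
        if len(items) == 1 and isinstance(raw_key, str) and raw_key.startswith("+"):
            target, value = items[0]
            if isinstance(value, Mapping):
                return target(**value)    # mapping value: keyword arguments
            return target(value)          # anything else: one positional argument
        # Mappings with several keys are processed but never called.
        return dict(items)
    return data


print(instantiate({"+datetime.timedelta": {"days": 1, "hours": 2}}))
print(instantiate({"+len": [1, 2, 3]}))
print(instantiate({"store": "+collections.OrderedDict()"}))
```

The first call is equivalent to timedelta(days=1, hours=2), and the second to len([1, 2, 3]).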
Parameters
data: dict[K, V]
    The data to transform.
modules: list[object] | None
    A list of modules to look up non-fully qualified names in.

Returns
dict
    The transformed data.

Examples
A basic example:
>>> from_dict({"activation": "+torch.nn.Tanh()"})
{'activation': Tanh()}

Note the importance of trailing parentheses:
>>> from_dict({"activation": "+torch.nn.Tanh"})
{'activation': <class 'torch.nn.modules.activation.Tanh'>}

Alternatively, point from_dict to automatically import from torch.nn:
>>> from_dict({"activation": "+Tanh()"}, modules=[torch.nn])
{'activation': Tanh()}

Use single-item mappings to instantiate classes/call functions with arguments. The following syntax will internally import MyClass from my_module, and call it as MyClass(x=1, y=2) with explicit keyword arguments:
>>> from_dict({
...     "activation": "+torch.nn.ReLU()",
...     "model": {
...         "+my_module.MyClass": {"x": 1, "y": 2}
...     }
... })
{'activation': ReLU(), 'model': MyClass(x=1, y=2)}

In contrast, the following syntax calls the imported object with a single positional argument:
>>> from_dict({"+len": [1, 2, 3]})
3  # i.e. len([1, 2, 3])

Mappings with multiple keys are still processed, but are never used to instantiate classes/call functions:
>>> from_dict({"+len": [1, 2, 3], "+print": "hello"})
{<built-in function len>: [1, 2, 3], <built-in function print>: 'hello'}

from_dict also works with arbitrary nesting:
>>> from_dict({"model": {"activation": "+torch.nn.Tanh()"}})
{'model': {'activation': Tanh()}}

Caution: from_dict can lead to side-effects!
>>> from_dict({"+print": "hello"})
hello

References are resolved before object instantiation, so all of the following will resolve the "length" field to 3:
>>> from_dict({"args": [1, 2, 3], "length": {"+len": "=../args"}})
3
>>> from_dict({"args": [1, 2, 3], "length": {"+len": "=/args"}})
3
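The reference step on its own can be approximated with posixpath, since the "=" syntax follows unix path rules. This is a sketch under assumptions rather than the library's implementation: resolve_references and lookup are hypothetical names, the "current working directory" of a reference is taken to be the container that holds it, and references that point at other references are not followed.

```python
import posixpath


def lookup(root, path):
    """Walk an absolute "/"-separated path through nested dicts and lists."""
    node = root
    for part in path.strip("/").split("/"):
        if part == "":
            continue
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def resolve_references(node, root=None, location="/"):
    """Replace every "="-prefixed string with the value it points to."""
    root = node if root is None else root
    if isinstance(node, dict):
        return {
            key: resolve_references(value, root, posixpath.join(location, str(key)))
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [
            resolve_references(value, root, posixpath.join(location, str(index)))
            for index, value in enumerate(node)
        ]
    if isinstance(node, str) and node.startswith("="):
        # Assumed meaning of "current working directory": the container of this string.
        cwd = posixpath.dirname(location)
        # join() restarts from the root for "=/..." and normpath() collapses "." and "..".
        target = posixpath.normpath(posixpath.join(cwd, node[1:]))
        return lookup(root, target)
    return node


config = {
    "backbone": {"hidden_size": 1024},
    "readout": {"+torch.nn.Linear": {"in_features": "=/backbone/hidden_size"}},
}
print(resolve_references(config))
print(resolve_references({"args": [1, 2, 3], "length": {"+len": "=../args"}}))
```

After this pass the "+len" key holds the list [1, 2, 3], which is what the instantiation step would then call len on.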
FAQs
Transform nested data structures into Python objects
We found that data2objects demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.