`TreeValue` is a generalized tree-based data structure mainly developed by OpenDILab Contributors.
Almost all operations can be expressed on trees in a convenient way, which simplifies structure processing when the computation is tree-based.
When we build a complex nested structure, we need to model it as a tree, and Python's native list and dict are often used for this. However, that approach takes a lot of code and some complex, non-intuitive calculation logic, which makes the related code and data hard to modify and extend, and parallelization is impossible.
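To make the problem concrete, here is a minimal sketch of the hand-written recursion that native dicts require just to apply one function to every leaf. It uses no treevalue API; `map_leaves` is a hypothetical helper introduced only for this illustration.

```python
# Plain-Python baseline: apply a function to every leaf of a nested dict.
# `map_leaves` is a hypothetical helper, not part of treevalue.

def map_leaves(func, node):
    """Return a copy of `node` with `func` applied to every non-dict leaf."""
    if isinstance(node, dict):
        return {key: map_leaves(func, value) for key, value in node.items()}
    return func(node)


nested = {'a': 1, 'b': 2, 'x': {'c': 3, 'd': 4}}
doubled = map_leaves(lambda v: v * 2, nested)
print(doubled)  # {'a': 2, 'b': 4, 'x': {'c': 6, 'd': 8}}
```

Every additional operation (combining two trees, indexing, reducing) needs another helper of this kind, which is the boilerplate the text refers to.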
Therefore, we need a more suitable data container, named `TreeValue`. It is designed to solve the following problems:
`treevalue` has been fully tested on Linux, macOS and Windows with multiple Python versions, and it works properly on all these platforms.
However, `treevalue` currently does not support PyPy, so keep this in mind when using it.
You can install it with the `pip` command line from the official PyPI site.
pip install treevalue
Or install it from the source code on GitHub:
pip install git+https://github.com/opendilab/treevalue.git@main
For more information about installation, you can refer to the installation guide.
After this, you can check whether the installation succeeded with the following code:
from treevalue import __version__
print('TreeValue version is', __version__)
You can easily create a tree value object based on `FastTreeValue`.
from treevalue import FastTreeValue
if __name__ == '__main__':
    t = FastTreeValue({
        'a': 1,
        'b': 2.3,
        'x': {
            'c': 'str',
            'd': [1, 2, None],
            'e': b'bytes',
        }
    })
    print(t)
The result should be
<FastTreeValue 0x7f6c7df00160 keys: ['a', 'b', 'x']>
├── 'a' --> 1
├── 'b' --> 2.3
└── 'x' --> <FastTreeValue 0x7f6c81150860 keys: ['c', 'd', 'e']>
    ├── 'c' --> 'str'
    ├── 'd' --> [1, 2, None]
    └── 'e' --> b'bytes'
And the structure of `t` should look like this (tree diagram not shown here).
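As a rough illustration of how a nested mapping corresponds to the printed tree layout, the following sketch renders a plain dict with the same branch characters. It is not treevalue's printing code; `render` is a hypothetical function, and subtree headers are shown as `<dict keys: ...>` instead of the `FastTreeValue` header with a memory address.

```python
# Hypothetical renderer for plain nested dicts (not treevalue's implementation).

def render(tree, prefix=''):
    """Return a list of lines drawing a nested dict with box-drawing branches."""
    lines = []
    items = list(tree.items())
    for index, (key, value) in enumerate(items):
        last = index == len(items) - 1
        branch = '└── ' if last else '├── '
        if isinstance(value, dict):
            lines.append(f"{prefix}{branch}{key!r} --> <dict keys: {list(value)}>")
            # Children of the last item are indented with spaces, others with a bar.
            child_prefix = prefix + ('    ' if last else '│   ')
            lines.extend(render(value, child_prefix))
        else:
            lines.append(f"{prefix}{branch}{key!r} --> {value!r}")
    return lines


data = {'a': 1, 'b': 2.3, 'x': {'c': 'str', 'd': [1, 2, None], 'e': b'bytes'}}
print('\n'.join(render(data)))
```

Each level of nesting adds one indentation unit, so the depth of a leaf in the output equals the number of keys on its path.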
Not only is a visible tree structure provided, but abundant operations are supported as well.
You can put objects (such as `torch.Tensor`, or any other type) here and call their methods directly, like this:
import torch
from treevalue import FastTreeValue
t = FastTreeValue({
    'a': torch.rand(2, 5),
    'x': {
        'c': torch.rand(3, 4),
    }
})
print(t)
# <FastTreeValue 0x7f8c069346a0>
# ├── a --> tensor([[0.3606, 0.2583, 0.3843, 0.8611, 0.5130],
# │                 [0.0717, 0.1370, 0.1724, 0.7627, 0.7871]])
# └── x --> <FastTreeValue 0x7f8ba6130f40>
#     └── c --> tensor([[0.2320, 0.6050, 0.6844, 0.3609],
#                       [0.0084, 0.0816, 0.8740, 0.3773],
#                       [0.6523, 0.4417, 0.6413, 0.8965]])
print(t.shape)  # property access
# <FastTreeValue 0x7f8c06934ac0>
# ├── a --> torch.Size([2, 5])
# └── x --> <FastTreeValue 0x7f8c069346d0>
#     └── c --> torch.Size([3, 4])
print(t.sin())  # method call
# <FastTreeValue 0x7f8c06934b80>
# ├── a --> tensor([[0.3528, 0.2555, 0.3749, 0.7586, 0.4908],
# │                 [0.0716, 0.1365, 0.1715, 0.6909, 0.7083]])
# └── x --> <FastTreeValue 0x7f8c06934b20>
#     └── c --> tensor([[0.2300, 0.5688, 0.6322, 0.3531],
#                       [0.0084, 0.0816, 0.7669, 0.3684],
#                       [0.6070, 0.4275, 0.5982, 0.7812]])
print(t.reshape((2, -1)))  # method with arguments
# <FastTreeValue 0x7f8c06934b80>
# ├── a --> tensor([[0.3606, 0.2583, 0.3843, 0.8611, 0.5130],
# │                 [0.0717, 0.1370, 0.1724, 0.7627, 0.7871]])
# └── x --> <FastTreeValue 0x7f8c06934b20>
#     └── c --> tensor([[0.2320, 0.6050, 0.6844, 0.3609, 0.0084, 0.0816],
#                       [0.8740, 0.3773, 0.6523, 0.4417, 0.6413, 0.8965]])
print(t[:, 1:-1])  # index operator
# <FastTreeValue 0x7f8ba5c8eca0>
# ├── a --> tensor([[0.2583, 0.3843, 0.8611],
# │                 [0.1370, 0.1724, 0.7627]])
# └── x --> <FastTreeValue 0x7f8ba5c8ebe0>
#     └── c --> tensor([[0.6050, 0.6844],
#                       [0.0816, 0.8740],
#                       [0.4417, 0.6413]])
print(1 + (t - 0.8) ** 2 * 1.5)  # math operators
# <FastTreeValue 0x7fdfa5836b80>
# ├── a --> tensor([[1.6076, 1.0048, 1.0541, 1.3524, 1.0015],
# │                 [1.0413, 1.8352, 1.2328, 1.7904, 1.0088]])
# └── x --> <FastTreeValue 0x7fdfa5836880>
#     └── c --> tensor([[1.1550, 1.0963, 1.3555, 1.2030],
#                       [1.0575, 1.4045, 1.0041, 1.0638],
#                       [1.0782, 1.0037, 1.5075, 1.0658]])
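The behaviour shown above (property access, method calls and operators applied to every leaf) can be pictured with a small toy class. This is only a sketch of the general idea using the standard library; `MiniTree` and its methods are hypothetical and do not reflect how treevalue is actually implemented, and the real library covers far more cases.

```python
# Toy illustration of forwarding operations to the leaves of a tree.
# `MiniTree` is hypothetical and is NOT treevalue's implementation.

class MiniTree:
    def __init__(self, data):
        # Wrap nested dicts recursively; anything else is stored as a leaf.
        self._data = {
            key: MiniTree(value) if isinstance(value, dict) else value
            for key, value in data.items()
        }

    def _map(self, func):
        """Build a new MiniTree by applying `func` to every leaf."""
        result = MiniTree({})
        result._data = {
            key: value._map(func) if isinstance(value, MiniTree) else func(value)
            for key, value in self._data.items()
        }
        return result

    def __getattr__(self, name):
        # Called only when normal lookup fails: forward the attribute to leaves.
        if name.startswith('_'):
            raise AttributeError(name)
        return self._map(lambda leaf: getattr(leaf, name))

    def __call__(self, *args, **kwargs):
        # A tree of bound methods can be called leaf by leaf.
        return self._map(lambda leaf: leaf(*args, **kwargs))

    def __add__(self, other):
        return self._map(lambda leaf: leaf + other)

    def __mul__(self, other):
        return self._map(lambda leaf: leaf * other)

    def to_dict(self):
        """Convert back to a plain nested dict."""
        return {
            key: value.to_dict() if isinstance(value, MiniTree) else value
            for key, value in self._data.items()
        }


words = MiniTree({'a': 'ab', 'x': {'c': 'cd'}})
numbers = MiniTree({'a': 3, 'x': {'c': 8}})

print(words.upper().to_dict())            # method call on every leaf
print(words.replace('a', 'z').to_dict())  # method with arguments
print(numbers.real.to_dict())             # property access
print((numbers * 2 + 1).to_dict())        # math operators
```

The design choice illustrated here is that each operation returns a new tree with the same shape, so calls can be chained exactly as they would be on a single object.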
For more examples, explanations and further usages, take a look at:
Because `treevalue` is often used with libraries like `numpy` and `torch`, we provide an official treevalue-based wrapper for them called DI-treetensor. It is helpful when working in AI fields.
Here is the speed performance of the operations in `FastTreeValue`; the following table compares it with dm-tree.
(In DM-Tree, the `unflatten` operation is different from that in TreeValue; see Comparison Between TreeValue and DM-Tree for more details.)
| | flatten | flatten(with path) | mapping | mapping(with path) |
|---|---|---|---|---|
| treevalue | --- | 511 ns ± 6.92 ns | 3.16 µs ± 42.8 ns | 1.58 µs ± 30 ns |
| | flatten | flatten_with_path | map_structure | map_structure_with_path |
| dm-tree | 830 ns ± 8.53 ns | 11.9 µs ± 358 ns | 13.3 µs ± 87.2 ns | 62.9 µs ± 2.26 µs |
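For readers unfamiliar with the benchmarked operations, the sketch below shows what "flatten (with path)" and "mapping (with path)" compute, using plain nested dicts. It illustrates the semantics only and is neither treevalue nor dm-tree code; both function names and the tuple-of-keys path format are assumptions made for this example.

```python
# Semantic illustration on plain dicts; function names are hypothetical.

def flatten_with_path(tree, path=()):
    """Yield (path, leaf) pairs, where path is a tuple of keys from the root."""
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from flatten_with_path(value, path + (key,))
        else:
            yield path + (key,), value


def mapping_with_path(func, tree, path=()):
    """Return a tree of the same shape with func(leaf, path) at every leaf."""
    return {
        key: (mapping_with_path(func, value, path + (key,))
              if isinstance(value, dict)
              else func(value, path + (key,)))
        for key, value in tree.items()
    }


tree = {'a': 1, 'x': {'c': 3, 'd': 4}}
pairs = list(flatten_with_path(tree))
labelled = mapping_with_path(lambda v, p: f"{'.'.join(p)}={v}", tree)
print(pairs)     # [(('a',), 1), (('x', 'c'), 3), (('x', 'd'), 4)]
print(labelled)  # {'a': 'a=1', 'x': {'c': 'x.c=3', 'd': 'x.d=4'}}
```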
The following 2 tables compare the performance with jax pytree.
| | mapping | mapping(with path) | flatten | unflatten | flatten_values | flatten_keys |
|---|---|---|---|---|---|---|
| treevalue | 2.21 µs ± 32.2 ns | 2.16 µs ± 123 ns | 515 ns ± 7.53 ns | 601 ns ± 5.99 ns | 301 ns ± 12.9 ns | 451 ns ± 17.3 ns |
| | tree_map | (Not Implemented) | tree_flatten | tree_unflatten | tree_leaves | tree_structure |
| jax pytree | 4.67 µs ± 184 ns | --- | 1.29 µs ± 27.2 ns | 742 ns ± 5.82 ns | 1.29 µs ± 22 ns | 1.27 µs ± 16.5 ns |

| | flatten + all | flatten + reduce | flatten + reduce(with init) | rise(given structure) | rise(automatic structure) |
|---|---|---|---|---|---|
| treevalue | 425 ns ± 9.33 ns | 702 ns ± 5.93 ns | 793 ns ± 13.4 ns | 9.14 µs ± 129 ns | 11.5 µs ± 182 ns |
| | tree_all | tree_reduce | tree_reduce(with init) | tree_transpose | (Not Implemented) |
| jax pytree | 1.47 µs ± 37 ns | 1.88 µs ± 27.2 ns | 1.91 µs ± 47.4 ns | 10 µs ± 117 ns | --- |
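The "flatten + all", "flatten + reduce" and "unflatten" columns can be read with the following plain-dict sketch in mind. It is an assumption-laden illustration, not code from treevalue or jax: `leaves` and `unflatten` are hypothetical helpers, and `unflatten` here takes (path, value) pairs, which differs from the signatures of the real libraries.

```python
from functools import reduce
from operator import add

# Hypothetical helpers on plain dicts, for illustration only.

def leaves(tree):
    """Yield the leaf values of a nested dict in insertion order."""
    for value in tree.values():
        if isinstance(value, dict):
            yield from leaves(value)
        else:
            yield value


def unflatten(pairs):
    """Rebuild a nested dict from (path, value) pairs; path is a tuple of keys."""
    tree = {}
    for path, value in pairs:
        node = tree
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return tree


tree = {'a': 1, 'x': {'c': 3, 'd': 4}}
values = list(leaves(tree))                  # flatten to values
all_positive = all(v > 0 for v in values)    # flatten + all
total = reduce(add, values)                  # flatten + reduce
total_with_init = reduce(add, values, 100)   # flatten + reduce (with init)
rebuilt = unflatten([(('a',), 1), (('x', 'c'), 3), (('x', 'd'), 4)])
print(values, all_positive, total, total_with_init, rebuilt)
```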
A chart compares dm-tree, jax-libtree and treevalue on the `flatten` and `mapping` operations; lower values mean less time and faster execution (chart not shown here).
The following table compares the performance with tianshou Batch.
| | get | set | init | deepcopy | stack | cat | split |
|---|---|---|---|---|---|---|---|
| treevalue | 51.6 ns ± 0.609 ns | 64.4 ns ± 0.564 ns | 750 ns ± 14.2 ns | 88.9 µs ± 887 ns | 50.2 µs ± 771 ns | 40.3 µs ± 1.08 µs | 62 µs ± 1.2 µs |
| tianshou Batch | 43.2 ns ± 0.698 ns | 396 ns ± 8.99 ns | 11.1 µs ± 277 ns | 89 µs ± 1.42 µs | 119 µs ± 1.1 µs | 194 µs ± 1.81 µs | 653 µs ± 17.8 µs |
A second chart compares Tianshou Batch and treevalue on the `cat`, `stack` and `split` operations; lower values mean less time and faster execution (chart not shown here).
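Timings in the "mean ± deviation" form used in these tables can be produced with the standard library alone. The sketch below is not the project's benchmark code, and the assumption that the published numbers were gathered in a comparable way is mine; `time_per_call` is a hypothetical helper, and the measured operation is a plain dict lookup chosen only as a stand-in.

```python
import statistics
import timeit

# Hypothetical timing helper; not the benchmark used for the tables above.

def time_per_call(func, number=10_000, repeat=7):
    """Return (mean, stdev) in seconds per call over `repeat` runs of `number` calls."""
    totals = timeit.repeat(func, number=number, repeat=repeat)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


data = {'a': 1, 'x': {'c': 3}}
mean, std = time_per_call(lambda: data['x']['c'])
print(f"{mean * 1e9:.1f} ns ± {std * 1e9:.2f} ns per call")
```

Absolute numbers depend heavily on the machine and Python version, so only comparisons made in the same environment are meaningful.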
Test benchmark code can be found here:
Welcome to OpenDILab community - treevalue!
If you run into a problem or have a brilliant idea, you can file an issue.
You can also add us on WeChat (QR code not shown here), or contact us via Slack or email (opendilab.contact@gmail.com).
Please check the contributing guidelines.
Thanks to the following contributors!
@misc{treevalue,
title={{TreeValue} - Tree-Structure Computing Solution},
author={TreeValue Contributors},
publisher = {GitHub},
howpublished = {\url{https://github.com/opendilab/treevalue}},
year={2021},
}
`treevalue` is released under the Apache 2.0 license. See the LICENSE file for details.