IMPORTANT: The OpenDataLab SDK is a work in progress, and compatibility between the OpenAPI and the SDK is not guaranteed. Please always use the latest version of the SDK.
OpenDataLab Python SDK is a Python library for accessing OpenDataLab and using open datasets.
It provides the odl command-line tool to access open datasets.
Install the SDK with pip:
$ pip3 install opendatalab
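To confirm from Python that the installation succeeded, a minimal sketch using the standard-library importlib.metadata module is shown below. The helper name installed_version is hypothetical; the distribution name opendatalab is the one used in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("opendatalab")
    if version is None:
        print("opendatalab is not installed; run: pip3 install opendatalab")
    else:
        print(f"opendatalab {version} is installed")
```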
An account is needed to access the OpenDataLab platform. Please visit the official website to get an account username and password first.
Show command help
$ odl -h
$ odl --help
Usage: odl [OPTIONS] COMMAND [ARGS]...
You can use `odl <command>` to access open datasets.
Options:
--version Show the version and exit.
-h, --help Show this message and exit.
Commands:
get Get(Download) dataset files into local path.
info Print dataset info.
login Login opendatalab with account.
logout Logout opendatalab account.
ls List files of the dataset.
search Search dataset info.
version Show opendatalab version.
$ odl version
odl version, current: 0.0.6, svc: 1.8
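When a script needs the client and service versions separately, the line printed by odl version can be parsed. The sketch below assumes the output keeps the exact shape shown above (current: ..., svc: ...); the function name parse_odl_version is hypothetical and not part of the SDK.

```python
import re
from typing import Dict

# Matches the "current: X, svc: Y" part of the line printed by `odl version`.
VERSION_LINE = re.compile(r"current:\s*(?P<current>[\w.]+),\s*svc:\s*(?P<svc>[\w.]+)")


def parse_odl_version(line: str) -> Dict[str, str]:
    """Split the `odl version` output line into client and service versions."""
    match = VERSION_LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognized version line: {line!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(parse_odl_version("odl version, current: 0.0.6, svc: 1.8"))
```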
Log in with your OpenDataLab username and password. If you do not have an OpenDataLab account, please register at https://opendatalab.org.cn/
$ odl login
Username []: someone@example.com
Password []:
Login successfully as someone@example.com
or
$ odl login -u someone@example.com
Password[]:
Log out of the current OpenDataLab account
$ odl logout
Do you want to logout? [y/N]: y
someone@example.com logout
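For unattended scripts, the interactive prompts can be avoided by logging in from Python. The sketch below is an illustration under assumptions, not the SDK's documented workflow: the environment variable names ODL_USERNAME and ODL_PASSWORD and both helper names are hypothetical, while implement_login(ctx, account, pw) and ContextInfo are used the same way as in the SDK sample code on this page.

```python
import os
from typing import Mapping, Tuple


def read_credentials(env: Mapping[str, str]) -> Tuple[str, str]:
    """Read the account name and password from a mapping such as os.environ."""
    # ODL_USERNAME and ODL_PASSWORD are hypothetical names chosen for this sketch.
    try:
        return env["ODL_USERNAME"], env["ODL_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable: {missing.args[0]}") from None


def login_from_env() -> None:
    """Log in with credentials taken from the environment."""
    # Imported lazily so read_credentials works without the SDK installed.
    from opendatalab.__version__ import __url__
    from opendatalab.cli.login import implement_login
    from opendatalab.cli.utility import ContextInfo

    account, password = read_credentials(os.environ)
    ctx = ContextInfo(__url__, "")
    implement_login(ctx, account, password)
```

Calling login_from_env() requires the opendatalab package and network access. Keeping the password in the environment rather than in source code avoids committing it by accident.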
List dataset files; a subdirectory prefix is supported
# list all dataset files
$ odl ls MNIST
total: 4, size: 11.1M
+----------------------------+--------------+
| File Name                  | Size         |
+----------------------------+--------------+
| train-labels-idx1-ubyte.gz | 28.2K        |
+----------------------------+--------------+
| train-images-idx3-ubyte.gz | 9.5M         |
+----------------------------+--------------+
| t10k-labels-idx1-ubyte.gz  | 4.4K         |
+----------------------------+--------------+
| t10k-images-idx3-ubyte.gz  | 1.6M         |
+----------------------------+--------------+
# list sub directory files
$ odl ls MNIST/t10k
total: 2, size: 1.6M
+---------------------------+--------------+
| File Name                 | Size         |
+---------------------------+--------------+
| t10k-labels-idx1-ubyte.gz | 4.4K         |
+---------------------------+--------------+
| t10k-images-idx3-ubyte.gz | 1.6M         |
+---------------------------+--------------+
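If the listing needs to be processed by a script, the table printed by odl ls can be parsed into (file name, size) pairs. The sketch below assumes the table layout shown above and treats the K, M, and G suffixes as 1024-based multiples, which the output itself does not confirm; parse_ls_table and size_to_bytes are hypothetical helper names, not SDK functions.

```python
from typing import List, Tuple

# Assumption: K, M, G are binary multiples (1024-based); the CLI output does not say.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

SAMPLE = """\
total: 2, size: 1.6M
+---------------------------+--------------+
| File Name                 | Size         |
+---------------------------+--------------+
| t10k-labels-idx1-ubyte.gz | 4.4K         |
+---------------------------+--------------+
| t10k-images-idx3-ubyte.gz | 1.6M         |
+---------------------------+--------------+
"""


def size_to_bytes(text: str) -> int:
    """Convert a size such as '28.2K' or '1.6M' to an approximate byte count."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return round(float(text[:-1]) * UNITS[text[-1].upper()])
    return round(float(text))


def parse_ls_table(output: str) -> List[Tuple[str, int]]:
    """Extract (file name, size in bytes) pairs from `odl ls` table output."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and the "total: ..." summary
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "File Name":
            continue  # skip the header row
        rows.append((cells[0], size_to_bytes(cells[1])))
    return rows


if __name__ == "__main__":
    for name, size in parse_ls_table(SAMPLE):
        print(f"{name}: about {size} bytes")
```

Because the CLI rounds sizes to one decimal place, the byte counts are approximations suitable for estimating download size, not for integrity checks.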
# download dataset files to a local path
# get all files of dataset
$ odl get MNIST
# get partial files of dataset
$ odl get MNIST/t10k
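The same download can be driven from a Python script by invoking the CLI as a subprocess. This is a minimal sketch that assumes odl is on the PATH and that you are already logged in; build_get_command and download are hypothetical helper names, and only the odl get <dataset>[/<prefix>] form shown above is used.

```python
import subprocess
from typing import List, Optional


def build_get_command(dataset: str, prefix: Optional[str] = None) -> List[str]:
    """Build the argument list for `odl get <dataset>[/<prefix>]`."""
    target = dataset if not prefix else f"{dataset}/{prefix.strip('/')}"
    return ["odl", "get", target]


def download(dataset: str, prefix: Optional[str] = None) -> None:
    """Run `odl get`; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_get_command(dataset, prefix), check=True)
```

For example, download("MNIST", "t10k") runs the equivalent of odl get MNIST/t10k. Passing the arguments as a list avoids shell quoting problems with unusual dataset names.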
Python SDK usage example:
import json

from opendatalab.__version__ import __url__
from opendatalab.cli.get import implement_get
from opendatalab.cli.info import implement_info
from opendatalab.cli.login import implement_login
from opendatalab.cli.ls import implement_ls
from opendatalab.cli.search import implement_search
from opendatalab.cli.utility import ContextInfo

if __name__ == '__main__':
    """
    ContextInfo: default
    please use shell login first, use: opendatalab login
    """
    ctx = ContextInfo(__url__, "")
    client = ctx.get_client()
    odl_api = client.get_api()

    # 0. login with account
    # account = "xxxxx"  # your username
    # pw = "xxxxx"  # your password
    # print('*****' * 8)
    # implement_login(ctx, account, pw)

    # 1. search demo
    res_list = odl_api.search_dataset("coco")
    for index, res in enumerate(res_list):
        print(f"index: {index}, result: {res['name']}")
    # implement_search("coco")
    print('*****' * 8)

    # 2. list demo
    implement_ls(ctx, 'TAO')
    print('*****' * 8)

    # 3. read file online demo
    dataset = client.get_dataset('FB15k')
    with dataset.get('meta/info.json', False) as fd:
        content = json.load(fd)
        print(f"{content}")
    print('*****' * 8)

    # 4. get dataset info
    implement_info(ctx, 'FB15k')

    # 5. download
    # get all files of dataset
    # implement_get(ctx, "MNIST", 4, 0)
    # get partial files of dataset
    implement_get(ctx, "GOT-10k/data/test_data.zip", 4, 0)  # 139, zip 1.16G GOT-10k
    print('*****' * 5)
More information can be found on the documentation site
FAQs
OpenDataLab Python SDK
We found that opendatalab demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 2 open source maintainers collaborating on the project.