Security News
Research
Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Python helper library for the Valohai machine learning platform.
pip install valohai-utils
Run locally
python mycode.py
Run in the cloud
vh yaml step mycode.py
vh exec run -a mystep
valohai.yaml
configuration file based on the source codeValohai parameters are variables & hyper-parameters that are parsed from the command-line. You define parameters in a dictionary:
default_parameters = {
'iterations': 100,
'learning_rate': 0.001
}
The dictionary is fed to valohai.prepare()
method:
The values given are default values. You can override them from the command-line or using the Valohai web UI.
import valohai
default_parameters = {
'iterations': 10,
}
valohai.prepare(step="helloworld", default_parameters=default_parameters)
for i in range(valohai.parameters('iterations').value):
print("Iteration %s" % i)
Valohai inputs are the data files required by the experiment. They are automatically downloaded for you, if the data is from a public source. You define inputs with a dictionary:
default_inputs = {
'input_name': 'http://example.com/1.png'
}
An input can also be a list of URLs or a folder:
default_inputs = {
'input_name': [
'http://example.com/1.png',
'http://example.com/2.png'
],
'input_folder': [
's3://mybucket/images/*',
'azure://mycontainer/images/*',
'gs://mybucket/images/*'
]
}
Or it can be an archive full of files (uncompressed automagically on-demand):
default_inputs = {
'images': 'http://example.com/myimages.zip'
}
The dictionary is fed to valohai.prepare()
method.
The url(s) given are defaults. You can override them from the command-line or using the Valohai web UI.
import csv
import valohai
default_inputs = {
'myinput': 'https://pokemon-images-example.s3-eu-west-1.amazonaws.com/pokemon.csv'
}
valohai.prepare(step="test", default_inputs=default_inputs)
with open(valohai.inputs("myinput").path()) as csv_file:
reader = csv.reader(csv_file, delimiter=',')
Valohai outputs are the files that your step produces an end result.
When you are ready to save your output file, you can query for the correct path from the valohai-utils
.
image = Image.open(in_path)
new_image = image.resize((width, height))
out_path = valohai.outputs('resized').path('resized_image.png')
new_image.save(out_path)
Sometimes there are so many outputs that you may want to compress them into a single file.
In this case, once you have all your outputs saved, you can finalize the output with the compress()
method.
valohai.outputs('resized').compress("*.png", "images.zip", remove_originals=True)
You can log metrics using the Valohai metadata system and then render interactive graphs on the web interface. The valohai-utils
logger will print JSON logs that Valohai will parse as metadata.
It is important for visualization that logs for single epoch are flushed out as a single JSON object.
import valohai
for epoch in range(100):
with valohai.metadata.logger() as logger:
logger.log("epoch", epoch)
logger.log("accuracy", accuracy)
logger.log("loss", loss)
import valohai
logger = valohai.logger()
for epoch in range(100):
logger.log("epoch", epoch)
logger.log("accuracy", accuracy)
logger.log("loss", loss)
logger.flush()
valohai.distributed
contains a toolset for running distributed tasks on Valohai.
import valohai
if valohai.distributed.is_distributed_task():
# `master()` reports the same worker on all contexts
master = valohai.distributed.master()
master_url = f'tcp://{master.primary_local_ip}:1234'
# `members()` contains all workers in the distributed task
member_public_ips = ",".join([
m.primary_public_ip
for m
in valohai.distributed.members()
])
# `me()` has full details about the current worker context
details = valohai.distributed.me()
size = valohai.distributed.required_count
rank = valohai.distributed.rank # 0, 1, 2, etc. depending on run context
This example step will do the following:
resized/images.zip
Valohai output file.import os
import valohai
from PIL import Image
default_parameters = {
"width": 640,
"height": 480,
}
default_inputs = {
"images": [
"https://dist.valohai.com/valohai-utils-tests/Example.jpg",
"https://dist.valohai.com/valohai-utils-tests/planeshark.jpg",
],
}
valohai.prepare(step="resize", default_parameters=default_parameters, default_inputs=default_inputs)
def resize_image(in_path, out_path, width, height, logger):
image = Image.open(in_path)
logger.log("from_width", image.size[0])
logger.log("from_height", image.size[1])
logger.log("to_width", width)
logger.log("to_height", height)
new_image = image.resize((width, height))
new_image.save(out_path)
if __name__ == '__main__':
for image_path in valohai.inputs('images').paths():
with valohai.metadata.logger() as logger:
filename = os.path.basename(image_path)
resize_image(
in_path=image_path,
out_path=valohai.outputs('resized').path(filename),
width=valohai.parameters('width').value,
height=valohai.parameters('height').value,
logger=logger
)
valohai.outputs('resized').compress("*", "images.zip", remove_originals=True)
CLI command:
vh yaml step resize.py
Will produce this valohai.yaml
config:
- step:
name: resize
image: python:3.7
command: python ./resize.py {parameters}
parameters:
- name: width
default: 640
multiple-separator: ','
optional: false
type: integer
- name: height
default: 480
multiple-separator: ','
optional: false
type: integer
inputs:
- name: images
default:
- https://dist.valohai.com/valohai-utils-tests/Example.jpg
- https://dist.valohai.com/valohai-utils-tests/planeshark.jpg
optional: false
There are some environment variables that affect how valohai-utils
works when not running within a Valohai execution context.
VH_FLAT_LOCAL_OUTPUTS
If you wish to further develop valohai-utils
, remember to install development dependencies and write tests for your additions.
Lints are run via pre-commit.
If you want pre-commit to check your commits via git hooks,
pip install pre-commit
pre-commit install
You can also run the lints manually with pre-commit run --all-files
.
pip install -e . -r requirements-dev.txt
pytest
FAQs
Unknown package
We found that valohai-utils demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
Research
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Research
Security News
Attackers used a malicious npm package typosquatting a popular ESLint plugin to steal sensitive data, execute commands, and exploit developer systems.
Security News
The Ultralytics' PyPI Package was compromised four times in one weekend through GitHub Actions cache poisoning and failure to rotate previously compromised API tokens.