The aoa package is available from PyPI:
pip install aoa
By default, the CLI looks for configuration stored in ~/.aoa/config.yaml. Copy the config from ModelOps UI -> Session Details -> CLI Config. This will provide the command to create or update the config.yaml. If required, one can override this configuration at runtime by specifying environment variables (see api_client.py).
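As a rough illustration of the override behaviour described above, the sketch below merges values from a config file with environment variables, letting the environment win. The variable names in ENV_MAP and the helper apply_env_overrides are hypothetical and chosen only for this example; the names the CLI actually reads are defined in api_client.py.

```python
import os

# Hypothetical mapping of config keys to environment variable names.
# The real names are defined in api_client.py and may differ.
ENV_MAP = {
    "aoa_url": "AOA_URL",
    "auth_mode": "AOA_AUTH_MODE",
}


def apply_env_overrides(config, environ=None, env_map=None):
    """Return a copy of config in which mapped environment variables win."""
    environ = os.environ if environ is None else environ
    env_map = ENV_MAP if env_map is None else env_map
    merged = dict(config)
    for key, env_name in env_map.items():
        if env_name in environ:
            merged[key] = environ[env_name]
    return merged


# Values as they might be read from ~/.aoa/config.yaml (placeholders).
file_config = {"aoa_url": "<modelops-endpoint>", "auth_mode": "device_code"}
# A stand-in for os.environ so the example does not depend on the shell.
fake_env = {"AOA_AUTH_MODE": "client_credentials"}
print(apply_env_overrides(file_config, fake_env))
```

Passing the environment in as a plain dictionary keeps the merge logic easy to test without touching the real process environment.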
The CLI supports a number of interactions and guides the user through each of them.
> aoa -h
usage: aoa [-h] [--debug] [--version] {list,add,retire,run,init,clone,link,message,connection} ...
AOA CLI
optional arguments:
-h, --help show this help message and exit
--debug Enable debug logging
--version Display the version of this tool
actions:
valid actions
{list,add,retire,run,init,clone,link,message,connection}
list List projects, models, local models or datasets
add Add model to working dir
run Train and Evaluate model locally
init Initialize model directory with basic structure
clone Clone Project Repository
link Link repo to Project Repository
connection Manage local connections
feature Manage feature statistics
doctor Diagnose configuration issues
The clone command provides a convenient way to perform a git clone of the repository associated with a given project. The command can be run interactively and will allow you to select the project you wish to clone. Note that by default it clones into the current working directory, so either run it from within an empty folder or provide the --path argument.
> aoa clone -h
usage: aoa clone [-h] [--debug] [-id PROJECT_ID] [-p PATH]
optional arguments:
-h, --help show this help message and exit
--debug Enable debug logging
-id PROJECT_ID, --project-id PROJECT_ID
Id of Project to clone
-p PATH, --path PATH Path to clone repository to
When you create a git repository, it's empty by default. The init command allows you to initialize the repository with the structure required by the AOA. It also adds a default README.md and HOWTO.md.
> aoa init -h
usage: aoa init [-h] [--debug]
optional arguments:
-h, --help show this help message and exit
--debug Enable debug logging
The list command lists the aoa resources. When listing models (pushed / committed) and datasets, it prompts the user to select a project before showing the results. When listing local models, it shows both committed and non-committed models.
> aoa list -h
usage: aoa list [-h] [--debug] [-p] [-m] [-lm] [-t] [-d] [-c]
optional arguments:
-h, --help show this help message and exit
--debug Enable debug logging
-p, --projects List projects
-m, --models List registered models (committed / pushed)
-lm, --local-models List local models. Includes registered and non-
registered (non-committed / non-pushed)
-t, --templates List dataset templates
-d, --datasets List datasets
-c, --connections List local connections
All results are shown in the format
[index] (id of the resource) name
for example:
List of models for project Demo:
--------------------------------
[0] (03c9a01f-bd46-4e7c-9a60-4282039094e6) Diabetes Prediction
[1] (74eca506-e967-48f1-92ad-fb217b07e181) IMDB Sentiment Analysis
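If this output needs to be consumed by a script, the documented format can be parsed with a regular expression. The sketch below is only an illustration: parse_list_output is a hypothetical helper, not part of the aoa package, and it assumes the output looks exactly like the example above.

```python
import re

# One result line looks like: [index] (id of the resource) name
RESULT_LINE = re.compile(r"^\[(\d+)\]\s+\(([^)]+)\)\s+(.+)$")


def parse_list_output(text):
    """Return (index, resource_id, name) tuples for every result line."""
    rows = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:  # header and separator lines do not match and are skipped
            rows.append((int(match.group(1)), match.group(2), match.group(3)))
    return rows


sample = """List of models for project Demo:
--------------------------------
[0] (03c9a01f-bd46-4e7c-9a60-4282039094e6) Diabetes Prediction
[1] (74eca506-e967-48f1-92ad-fb217b07e181) IMDB Sentiment Analysis
"""

for row in parse_list_output(sample):
    print(row)
```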
Add a new model to a given repository based on a model template. A model in any other existing ModelOps git repository (specified via -t <giturl>) can be used as the template.
> aoa add -h
usage: aoa add [-h] [--debug] -t TEMPLATE_URL -b BRANCH
optional arguments:
-h, --help show this help message and exit
--debug Enable debug logging
-t TEMPLATE_URL, --template-url TEMPLATE_URL
Git URL for template repository
-b BRANCH, --branch BRANCH
Git branch to pull templates
Example usage
> aoa add -t https://github.com/Teradata/modelops-demo-models -b master
The CLI can be used to validate the model training and evaluation logic locally before committing to git. This simplifies the development lifecycle and allows you to test and validate many options. It also lets you avoid creating the dataset definitions in the AOA UI until you have a finalised version.
> aoa run -h
usage: aoa run [-h] [--debug] [-id MODEL_ID] [-m MODE] [-d DATASET_ID] [-t DATASET_TEMPLATE_ID] [-ld LOCAL_DATASET]
[-lt LOCAL_DATASET_TEMPLATE] [-c CONNECTION]
optional arguments:
-h, --help show this help message and exit
--debug Enable debug logging
-id MODEL_ID, --model-id MODEL_ID
Id of model
-m MODE, --mode MODE Mode (train or evaluate)
-d DATASET_ID, --dataset-id DATASET_ID
Remote datasetId
-t DATASET_TEMPLATE_ID, --dataset-template-id DATASET_TEMPLATE_ID
Remote datasetTemplateId
-ld LOCAL_DATASET, --local-dataset LOCAL_DATASET
Path to local dataset metadata file
-lt LOCAL_DATASET_TEMPLATE, --local-dataset-template LOCAL_DATASET_TEMPLATE
Path to local dataset template metadata file
-c CONNECTION, --connection CONNECTION
Local connection id
You can run all of this as a single command, or interactively by supplying only some of the optional arguments, or none of them.
For example, to run the CLI fully interactively, just run aoa run with no arguments. To run it non-interactively with a given model, mode, and datasetId, run:
> aoa run -id <modelId> -m <mode> -d <datasetId>
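When the non-interactive form is driven from a script, the argument list can be assembled programmatically. The following is a minimal sketch: build_run_command is a hypothetical helper written for this example, and it uses only the -id, -m and -d flags shown in the help output above.

```python
import shlex


def build_run_command(model_id=None, mode=None, dataset_id=None):
    """Build the argument list for `aoa run`.

    Any value left as None is omitted, so the CLI would prompt for it.
    """
    command = ["aoa", "run"]
    if model_id is not None:
        command += ["-id", model_id]
    if mode is not None:
        if mode not in ("train", "evaluate"):
            raise ValueError("mode must be 'train' or 'evaluate'")
        command += ["-m", mode]
    if dataset_id is not None:
        command += ["-d", dataset_id]
    return command


command = build_run_command("<modelId>", "train", "<datasetId>")
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```

Building a list rather than a single string avoids shell quoting problems if the command is later passed to subprocess.run.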
The connection credentials stored in the ModelOps service cannot be accessed remotely through the CLI for security reasons. Instead, users can manage connection information locally for the CLI. These connections are used by other CLI commands which access Vantage.
> aoa connection -h
usage: aoa connection [-h] {list,add,remove,export} ...
optional arguments:
-h, --help show this help message and exit
actions:
valid actions
{list,add,remove}
list List all local connections
add Add a local connection
remove Remove a local connection
export Export a local connection to be used as a shell script
Manage feature metadata by creating and populating feature metadata table(s). The feature metadata tables contain information required when computing statistics during training, scoring etc. This metadata depends on the feature type (categorical or continuous).
As this metadata can contain sensitive profiling information (such as categories), it is recommended to treat this metadata in the same manner as you treat the features for a given use case. That is, the feature metadata should live in a project or use case level database.
> aoa feature -h
usage: aoa feature [-h] {compute-stats,list-stats,create-stats-table,import-stats} ...
optional arguments:
-h, --help show this help message and exit
action:
valid actions
{compute-stats,list-stats,create-stats-table,import-stats}
compute-stats Compute feature statistics
list-stats List available statistics
create-stats-table Create statistics table
import-stats Import column statistics from local JSON file
Compute the feature metadata information required when computing statistics during training, scoring etc. This metadata depends on the feature type (categorical or continuous).
Continuous: the histogram edges
Categorical: the categories
> aoa feature compute-stats -h
usage: aoa feature compute-stats [-h] [--debug] -s SOURCE_TABLE -m METADATA_TABLE [-t {continuous,categorical}] -c
COLUMNS
optional arguments:
-h, --help show this help message and exit
--debug Enable debug logging
-s SOURCE_TABLE, --source-table SOURCE_TABLE
Feature source table/view
-m METADATA_TABLE, --metadata-table METADATA_TABLE
Metadata table for feature stats, including database name
-t {continuous,categorical}, --feature-type {continuous,categorical}
Feature type: continuous or categorical
-c COLUMNS, --columns COLUMNS
List of feature columns
Example usage
aoa feature compute-stats \
-s <feature-db>.<feature-data> \
-m <feature-metadata-db>.<feature-metadata-table> \
-t continuous -c numtimesprg,plglcconc,bloodp,skinthick,twohourserins,bmi,dipedfunc,age
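To give a feel for what this metadata contains, the sketch below computes histogram edges for a continuous column and the distinct categories for a categorical one in plain Python. It is an illustration only: the binning strategy used by aoa feature compute-stats is not described here, so the equal-width bins, the bin count, and both function names are assumptions.

```python
def histogram_edges(values, bins=10):
    """Equal-width bin edges from min(values) to max(values) (assumed strategy)."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    # bins + 1 edges delimit `bins` intervals
    return [low + i * width for i in range(bins + 1)]


def categories(values):
    """Sorted distinct values of a categorical column."""
    return sorted(set(values))


# Made-up sample values for a continuous and a categorical feature.
ages = [21, 30, 45, 50, 61]
flags = ["no", "yes", "no", "no"]

print(histogram_edges(ages, bins=4))
print(categories(flags))
```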
A number of authentication methods are supported for both the CLI and SDK.
When working interactively, the recommended auth method for the CLI is device_code. It will guide you through the auth automatically. For the SDK, use bearer if working interactively.
For both CLI and SDK, if working in an automated service-to-service manner, use client_credentials.
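The recommendations above can be summarised as a small lookup. This is only a sketch of the decision, with recommended_auth_mode being a hypothetical helper; the three mode names are the ones given in the text.

```python
def recommended_auth_mode(tool, interactive):
    """Return the recommended auth mode for 'cli' or 'sdk' usage."""
    if tool not in ("cli", "sdk"):
        raise ValueError("tool must be 'cli' or 'sdk'")
    if not interactive:
        # automated service-to-service usage, for both CLI and SDK
        return "client_credentials"
    return "device_code" if tool == "cli" else "bearer"


print(recommended_auth_mode("cli", interactive=True))
print(recommended_auth_mode("sdk", interactive=True))
print(recommended_auth_mode("sdk", interactive=False))
```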
The SDK for ModelOps allows users to interact with ModelOps APIs from anywhere they can execute Python, such as notebooks, IDEs, etc. It can also be used in DevOps pipelines to automate additional parts of the process and integrate into the wider organization.
By default, creating an instance of the AoaClient looks for configuration stored in ~/.aoa/config.yaml. When working with the SDK, we recommend that you specify (and override) all the necessary configuration as part of the AoaClient invocation.
An example of creating a client using a bearer token for a given project:
from aoa import AoaClient
client = AoaClient(
    aoa_url="<modelops-endpoint>",
    auth_mode="bearer",
    auth_bearer="<bearer-token>",
    project_id="23e1df4b-b630-47a1-ab80-7ad5385fcd8d",
)
To get the values to use for bearer token and aoa_url, go to the ModelOps UI -> Session Details -> SDK Config.
We provide an extensive SDK implementation to interact with the APIs. You can find, create, update, archive, etc. any entity that supports it via the SDK. In addition, most if not all search endpoints are also implemented in the SDK. Here are some examples:
from aoa import AoaClient, DatasetApi, JobApi
import pprint
client = AoaClient(project_id="23e1df4b-b630-47a1-ab80-7ad5385fcd8d")
# list all datasets in the project
dataset_api = DatasetApi(aoa_client=client)
datasets = dataset_api.find_all()
pprint.pprint(datasets)
# look up a single dataset by id
dataset = dataset_api.find_by_id("11e1df4b-b630-47a1-ab80-7ad5385fcd8c")
pprint.pprint(dataset)
# look up a single job by id
job_api = JobApi(aoa_client=client)
job = job_api.find_by_id("21e1df4b-b630-47a1-ab80-7ad5385fcd1c")
pprint.pprint(job)
Let's assume we have a model version 4131df4b-b630-47a1-ab80-7ad5385fcd15 which we want to deploy In-Vantage and schedule to execute once a month, at midnight on the first day of the month, using dataset connection 11e1df4b-b630-47a1-ab80-7ad5385fcd8c and dataset template d8a35d98-21ce-47d0-b9f2-00d355777de1. We can use the SDK as follows to do this.
from aoa import AoaClient, TrainedModelApi, JobApi
client = AoaClient(project_id="23e1df4b-b630-47a1-ab80-7ad5385fcd8d")
trained_model_api = TrainedModelApi(aoa_client=client)
job_api = JobApi(aoa_client=client)
trained_model_id = "4131df4b-b630-47a1-ab80-7ad5385fcd15"
deploy_request = {
    "engineType": "IN_VANTAGE",
    "publishOnly": False,
    "language": "PMML",
    "cron": "0 0 1 * *",
    "byomModelLocation": {
        "database": "<db-name>",
        "table": "<table-name>"
    },
    "datasetConnectionId": "11e1df4b-b630-47a1-ab80-7ad5385fcd8c",
    "datasetTemplateId": "d8a35d98-21ce-47d0-b9f2-00d355777de1",
    "engineTypeConfig": {
        "dockerImage": "",
        "engine": "byom",
        "resources": {
            "memory": "1G",
            "cpu": "1"
        }
    }
}
job = trained_model_api.deploy(trained_model_id, deploy_request)
# wait until the job completes (if the job fails it will raise an exception)
job_api.wait(job['id'])
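The schedule in the deploy request is the "cron" value "0 0 1 * *". Assuming it follows the standard five-field cron layout (minute, hour, day of month, month, day of week), which matches the "midnight of the first day of the month" description, the fields can be spelled out as in the sketch below. describe_cron is a hypothetical helper for illustration and is not part of the SDK.

```python
# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected 5 space-separated cron fields")
    return dict(zip(CRON_FIELDS, parts))


# minute 0, hour 0, day 1 of every month, any day of the week
for name, value in describe_cron("0 0 1 * *").items():
    print(f"{name}: {value}")
```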
Let's assume we have a PMML model which we have trained on another data science platform. We want to import the artefacts for this version (model.pmml and data_stats.json) against a BYOM model f937b5d8-02c6-5150-80c7-1e4ff07fea31.
from aoa import (
    AoaClient,
    ModelApi,
    TrainedModelApi,
    TrainedModelArtefactsApi,
    JobApi
)
import uuid
client = AoaClient(project_id="23e1df4b-b630-47a1-ab80-7ad5385fcd8d")
model_api = ModelApi(aoa_client=client)
trained_model_api = TrainedModelApi(aoa_client=client)
trained_model_artefacts_api = TrainedModelArtefactsApi(aoa_client=client)
job_api = JobApi(aoa_client=client)
artefacts_import_id = uuid.uuid4()
artefacts = ["model.pmml", "data_stats.json"]
# first, upload the artefacts which we want to associate with the BYOM model version
trained_model_artefacts_api.upload_artefacts(artefacts_import_id, artefacts)
import_request = {
    "artefactImportId": str(artefacts_import_id),
    "externalId": "my-byom-version-id"
}
# update with id of your model
model_id = "<model-uuid>"
job = model_api.import_byom(model_id, import_request)
# wait until the job completes (if the job fails it will raise an exception)
job_api.wait(job["id"])
# now you can list the artefacts which were uploaded and linked to the model version
trained_model_id = job["metadata"]["trainedModel"]["id"]
artefacts = trained_model_artefacts_api.list_artefacts(trained_model_id)
device_code grant flow
aoa init and aoa add not working due to refactor in 6.1.3
aoa configure
aoa link for linking project to repo locally
aoa feature create-table
aoa feature support for managing feature metadata
aoa add uses reference git repository for model templates
path argument didn't create .aoa/config.yaml in correct directory
path now uses repository name by default
aoa clone respects project branch
FAQs
Python client for Teradata AnalyticOps Accelerator (AOA)
We found that aoa demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.