# SageMaker Training and Deployment Utilities

This package provides utilities for training and deploying machine learning models on AWS SageMaker. It supports several ML frameworks, including PyTorch, TensorFlow, XGBoost, scikit-learn, and Hugging Face.
## Installation

Install the package's AWS SDK dependencies:

```shell
npm install @aws-sdk/client-sagemaker @aws-sdk/client-s3 @aws-sdk/lib-storage archiver
```
## Usage

### Training

To train a model on SageMaker, use the training class for your framework. Below are examples for PyTorch and TensorFlow.

#### PyTorch Training
```typescript
import { PyTorchTraining } from './sagemaker-framework-extensions';
import { Logger } from './interfaces';

const config = {
  region: 'us-west-2',
  credentials: {
    accessKeyId: 'your-access-key-id',
    secretAccessKey: 'your-secret-access-key',
  },
  bucket: 'your-s3-bucket',
  role: 'your-sagemaker-role',
  service: 'your-service',
  model: 'your-model',
};

const logger: Logger = console;
const sourceDir = './path-to-your-source-code';

const pytorchTraining = new PyTorchTraining(config, sourceDir, logger);

const frameworkConfig = {
  frameworkVersion: '2.1',
  pythonVersion: 'py310',
  imageUri: 'your-custom-image-uri',
};

const resourceConfig = {
  instanceCount: 1,
  instanceType: 'ml.p3.2xlarge',
  volumeSizeGB: 50,
};

const hyperParameters = {
  learningRate: 0.001,
  batchSize: 32,
  epochs: 10,
};

const inputDataS3 = {
  data: 's3://your-bucket/path-to-your-data',
  format: 'application/json',
};

async function trainModel() {
  try {
    const metadata = await pytorchTraining.train(
      frameworkConfig,
      resourceConfig,
      hyperParameters,
      inputDataS3,
      [],
      true
    );
    console.log('Training completed:', metadata);
    const trainingJobName = metadata.trainingJobName;
    console.log('Training job name:', trainingJobName);
  } catch (error) {
    console.error('Training failed:', error);
  }
}

trainModel();
```
#### TensorFlow Training
```typescript
import { TensorFlowTraining } from './sagemaker-framework-extensions';
import { Logger } from './interfaces';

const config = {
  region: 'us-west-2',
  credentials: {
    accessKeyId: 'your-access-key-id',
    secretAccessKey: 'your-secret-access-key',
  },
  bucket: 'your-s3-bucket',
  role: 'your-sagemaker-role',
  service: 'your-service',
  model: 'your-model',
};

const logger: Logger = console;
const sourceDir = './path-to-your-source-code';

const tensorflowTraining = new TensorFlowTraining(config, sourceDir, logger);

const frameworkConfig = {
  frameworkVersion: '2.12',
  pythonVersion: 'py310',
  imageUri: 'your-custom-image-uri',
};

const resourceConfig = {
  instanceCount: 1,
  instanceType: 'ml.p3.2xlarge',
  volumeSizeGB: 50,
};

const hyperParameters = {
  learningRate: 0.001,
  batchSize: 32,
  epochs: 10,
};

const inputDataS3 = {
  data: 's3://your-bucket/path-to-your-data',
  format: 'application/json',
};

async function trainModel() {
  try {
    const metadata = await tensorflowTraining.train(
      frameworkConfig,
      resourceConfig,
      hyperParameters,
      inputDataS3,
      [],
      true
    );
    console.log('Training completed:', metadata);
    const trainingJobName = metadata.trainingJobName;
    console.log('Training job name:', trainingJobName);
  } catch (error) {
    console.error('Training failed:', error);
  }
}

trainModel();
```
### Deployment

To deploy a trained model on SageMaker, use the deployment class for your framework. Below are examples for PyTorch and TensorFlow.

#### PyTorch Deployment
```typescript
import { PyTorchDeployment } from './deploy';
import { Logger } from './interfaces';

const config = {
  region: 'us-west-2',
  credentials: {
    accessKeyId: 'your-access-key-id',
    secretAccessKey: 'your-secret-access-key',
  },
  bucket: 'your-s3-bucket',
  role: 'your-sagemaker-role',
  environmentVariables: {},
};

const logger: Logger = console;
const service = 'your-service';
const model = 'your-model';

const pytorchDeployment = new PyTorchDeployment(config, logger, service, model);

const deployInput = {
  frameworkVersion: '2.1',
  pythonVersion: 'py310',
  entryPoint: 'inference.py',
  trainingJobName: 'your-training-job-name',
  useGpu: true,
};

const serverlessConfig = {
  memorySizeInMb: 2048,
  maxConcurrency: 10,
};

async function deployModel() {
  try {
    const result = await pytorchDeployment.deploy(deployInput, serverlessConfig);
    console.log('Deployment completed:', result);
  } catch (error) {
    console.error('Deployment failed:', error);
  }
}

deployModel();
```
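SageMaker serverless inference imposes hard limits on the values in `serverlessConfig`: memory must be one of 1024–6144 MB in 1024 MB increments, and max concurrency must fall between 1 and 200 (default service quotas; verify against the current SageMaker documentation for your account). A small client-side check like the sketch below, a hypothetical helper rather than part of this package, lets a bad config fail fast before any AWS call is made:

```typescript
// Validate a serverless endpoint config against SageMaker's documented limits:
// memorySizeInMb ∈ {1024, 2048, 3072, 4096, 5120, 6144}, maxConcurrency ∈ [1, 200].
// Returns a list of human-readable errors; an empty list means the config is valid.
function validateServerlessConfig(memorySizeInMb: number, maxConcurrency: number): string[] {
  const errors: string[] = [];
  if (memorySizeInMb < 1024 || memorySizeInMb > 6144 || memorySizeInMb % 1024 !== 0) {
    errors.push(`memorySizeInMb must be one of 1024, 2048, ..., 6144 (got ${memorySizeInMb})`);
  }
  if (maxConcurrency < 1 || maxConcurrency > 200) {
    errors.push(`maxConcurrency must be between 1 and 200 (got ${maxConcurrency})`);
  }
  return errors;
}
```

For example, `validateServerlessConfig(2048, 10)` returns an empty array, while `validateServerlessConfig(1500, 0)` reports both problems at once.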
#### TensorFlow Deployment
```typescript
import { TensorFlowDeployment } from './deploy';
import { Logger } from './interfaces';

const config = {
  region: 'us-west-2',
  credentials: {
    accessKeyId: 'your-access-key-id',
    secretAccessKey: 'your-secret-access-key',
  },
  bucket: 'your-s3-bucket',
  role: 'your-sagemaker-role',
  environmentVariables: {},
};

const logger: Logger = console;
const service = 'your-service';
const model = 'your-model';

const tensorflowDeployment = new TensorFlowDeployment(config, logger, service, model);

const deployInput = {
  frameworkVersion: '2.12',
  pythonVersion: 'py310',
  entryPoint: 'inference.py',
  trainingJobName: 'your-training-job-name',
  useGpu: true,
};

const serverlessConfig = {
  memorySizeInMb: 2048,
  maxConcurrency: 10,
};

async function deployModel() {
  try {
    const result = await tensorflowDeployment.deploy(deployInput, serverlessConfig);
    console.log('Deployment completed:', result);
  } catch (error) {
    console.error('Deployment failed:', error);
  }
}

deployModel();
```
## License

This project is licensed under the MIT License.