Aana SDK is a powerful framework for building multimodal applications. It facilitates the large-scale deployment of machine learning models, including those for vision, audio, and language, and supports Retrieval-Augmented Generation (RAG) systems. This enables the development of advanced applications such as search engines, recommendation systems, and data insights platforms.
The SDK is designed according to the following principles:
The SDK is still in development, and not all features are fully implemented. We are constantly working on improving the SDK, and we welcome any feedback or suggestions.
Check out the documentation for more information.
Nowadays, it is getting easier to experiment with machine learning models and build prototypes. However, deploying these models at scale and integrating them into real-world applications is still a challenge.
Aana SDK simplifies this process by providing a framework that allows:
Model Deployment:
API Generation:
Predefined Types:
Documentation Generation:
Streaming Support:
Task Queue Support:
Integrations:
To install Aana SDK via PyPI, you can use the following command:
pip install aana
For optimal performance, install a PyTorch version >=2.1 appropriate for your system. If you skip this step, a default version is installed that may not make optimal use of your system's resources, for example, a GPU or even some SIMD operations. We therefore recommend choosing your PyTorch package carefully and installing it manually.
Some models use Flash Attention. Install the Flash Attention library for better performance. See the Flash Attention installation instructions for more details and supported GPUs.
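To see which PyTorch build is currently active, a small check like the one below can help. This is a minimal sketch and not part of Aana SDK: `describe_torch` is a hypothetical helper name, and it relies only on `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch() -> dict:
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    # find_spec avoids an ImportError when PyTorch is not installed yet.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch())
```

If `cuda_available` is false on a machine with a GPU, the installed PyTorch package is probably a CPU-only build.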
To install from GitHub instead, clone the repository:
git clone https://github.com/mobiusml/aana_sdk.git
For optimal performance, install a PyTorch version >=2.1 appropriate for your system. You can continue directly to the next step, but a default version is then installed that may not make optimal use of your system's resources, for example, a GPU or even some SIMD operations. We therefore recommend choosing your PyTorch package carefully and installing it manually.
Some models use Flash Attention. Install the Flash Attention library for better performance. See the Flash Attention installation instructions for more details and supported GPUs.
The project is managed with Poetry. See the Poetry installation instructions for how to install it on your system.
The following command installs the package and all dependencies in a virtual environment:
sh install.sh
You can quickly develop multimodal applications using Aana SDK's intuitive APIs and components.
If you want to start building a new application, you can use the following GitHub template: Aana App Template. It will help you get started with the Aana SDK and provide you with a basic structure for your application and its dependencies.
Let's create a simple application that transcribes a video. The application will download a video from YouTube, extract the audio, and transcribe it using an ASR model.
Aana SDK already provides a deployment for ASR (Automatic Speech Recognition) based on the Whisper model. We will use this deployment in the example.
from aana.api.api_generation import Endpoint
from aana.core.models.video import VideoInput
from aana.deployments.aana_deployment_handle import AanaDeploymentHandle
from aana.deployments.whisper_deployment import (
    WhisperComputeType,
    WhisperConfig,
    WhisperDeployment,
    WhisperModelSize,
    WhisperOutput,
)
from aana.integrations.external.yt_dlp import download_video
from aana.processors.remote import run_remote
from aana.processors.video import extract_audio
from aana.sdk import AanaSDK

# Define the model deployments.
asr_deployment = WhisperDeployment.options(
    num_replicas=1,
    ray_actor_options={"num_gpus": 0.25},  # Remove this line if you want to run Whisper on a CPU.
    user_config=WhisperConfig(
        model_size=WhisperModelSize.MEDIUM,
        compute_type=WhisperComputeType.FLOAT16,
    ).model_dump(mode="json"),
)
deployments = [{"name": "asr_deployment", "instance": asr_deployment}]


# Define the endpoint to transcribe the video.
class TranscribeVideoEndpoint(Endpoint):
    """Transcribe video endpoint."""

    async def initialize(self):
        """Initialize the endpoint."""
        await super().initialize()
        self.asr_handle = await AanaDeploymentHandle.create("asr_deployment")

    async def run(self, video: VideoInput) -> WhisperOutput:
        """Transcribe video."""
        video_obj = await run_remote(download_video)(video_input=video)
        audio = extract_audio(video=video_obj)
        transcription = await self.asr_handle.transcribe(audio=audio)
        return transcription


endpoints = [
    {
        "name": "transcribe_video",
        "path": "/video/transcribe",
        "summary": "Transcribe a video",
        "endpoint_cls": TranscribeVideoEndpoint,
    },
]

aana_app = AanaSDK(name="transcribe_video_app")

for deployment in deployments:
    aana_app.register_deployment(**deployment)

for endpoint in endpoints:
    aana_app.register_endpoint(**endpoint)

if __name__ == "__main__":
    aana_app.connect(host="127.0.0.1", port=8000, show_logs=False)  # Connects to the Ray cluster or starts a new one.
    aana_app.migrate()  # Runs the migrations to create the database tables.
    aana_app.deploy(blocking=True)  # Deploys the application.
You have a few options to run the application:
- Copy the code above into a file, for example app.py, and run it as a Python script: python app.py
- Copy the code above into a file, for example app.py, and run it using the Aana CLI: aana deploy app:aana_app --host 127.0.0.1 --port 8000 --hide-logs
Once the application is running, you will see the message "Deployed successfully." in the logs. You can now send a request to the application to transcribe a video.
To get an overview of the Ray cluster, you can use the Ray Dashboard. The Ray Dashboard is available at http://127.0.0.1:8265 by default. You can see the status of the Ray cluster, the resources used, running applications and deployments, logs, and more. It is a useful tool for monitoring and debugging your applications. See Ray Dashboard documentation for more information.
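Before sending requests, it can be handy to check from a script that the application and the dashboard are listening. The following is a minimal sketch using only the Python standard library. `is_port_open` is a hypothetical helper, and the ports are the defaults mentioned above (8000 for the API, 8265 for the Ray Dashboard). An open port only shows that something is listening; it does not prove that the deployment finished successfully.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("API (8000):", is_port_open("127.0.0.1", 8000))
    print("Ray Dashboard (8265):", is_port_open("127.0.0.1", 8265))
```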
Let's transcribe Gordon Ramsay's perfect scrambled eggs tutorial using the application.
curl -X POST http://127.0.0.1:8000/video/transcribe -Fbody='{"video":{"url":"https://www.youtube.com/watch?v=VhJFyyukAzA"}}'
This will return the full transcription of the video, the transcription for each segment, and transcription info such as the identified language. You can also use the Swagger UI to send the request.
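The same request can be sent from Python. The sketch below mirrors the curl command by sending one multipart form field named `body` that holds the JSON payload. It assumes the third-party `requests` library is installed and that the response is JSON. `build_form_fields` and `transcribe` are hypothetical helper names, not part of Aana SDK, and the timeout value is arbitrary.

```python
import json


def build_form_fields(video_url: str) -> dict:
    """Build the single form field named "body" that the curl command sends."""
    payload = {"video": {"url": video_url}}
    return {"body": json.dumps(payload)}


def transcribe(
    video_url: str,
    endpoint: str = "http://127.0.0.1:8000/video/transcribe",
):
    """Send the request and return the parsed JSON response."""
    # Third-party dependency, imported lazily so build_form_fields works without it.
    import requests

    fields = build_form_fields(video_url)
    # A (None, value) tuple makes requests send a plain multipart form field,
    # which is what curl -F does.
    response = requests.post(
        endpoint,
        files={name: (None, value) for name, value in fields.items()},
        timeout=600,
    )
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    print(transcribe("https://www.youtube.com/watch?v=VhJFyyukAzA"))
```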
We provide a few example applications that demonstrate the capabilities of Aana SDK.
See the README files of the applications for more information on how to install and run them.
The full list of example applications is available in the Aana Examples repository. You can use these examples as a starting point for building your own applications.
There are three main components in Aana SDK: deployments, endpoints, and AanaSDK.
Deployments are the building blocks of Aana SDK. They represent the machine learning models that you want to deploy. Aana SDK comes with a set of predefined deployments that you can use, or you can define your own. See Integrations for more information about predefined deployments.
Each deployment has a main class that defines it and a configuration class that allows you to specify the deployment parameters.
For example, we have a predefined deployment for the Whisper model that allows you to transcribe audio. You can define the deployment like this:
from aana.deployments.whisper_deployment import (
    WhisperComputeType,
    WhisperConfig,
    WhisperDeployment,
    WhisperModelSize,
)

asr_deployment = WhisperDeployment.options(
    num_replicas=1,
    ray_actor_options={"num_gpus": 0.25},
    user_config=WhisperConfig(
        model_size=WhisperModelSize.MEDIUM,
        compute_type=WhisperComputeType.FLOAT16,
    ).model_dump(mode="json"),
)
See Model Hub for a collection of configurations for different models that can be used with the predefined deployments.
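In the deployment definition above, the configuration object is turned into a plain dictionary with `model_dump(mode="json")` before it is passed as `user_config`. The idea of a typed configuration class that is flattened to JSON-friendly values can be illustrated without Aana SDK. The sketch below uses only the standard library; `ModelSize`, `ComputeType`, `DemoConfig`, and `to_user_config` are hypothetical stand-ins, not the real Aana classes, and the enum values are made up for illustration.

```python
from dataclasses import asdict, dataclass
from enum import Enum


class ModelSize(Enum):
    """Hypothetical stand-in for a model size enum."""

    MEDIUM = "medium"


class ComputeType(Enum):
    """Hypothetical stand-in for a compute type enum."""

    FLOAT16 = "float16"


@dataclass
class DemoConfig:
    """Typed configuration, analogous in spirit to a deployment config class."""

    model_size: ModelSize
    compute_type: ComputeType


def to_user_config(config: DemoConfig) -> dict:
    """Flatten the config into a dict that holds only JSON-friendly values."""
    return {
        key: value.value if isinstance(value, Enum) else value
        for key, value in asdict(config).items()
    }


if __name__ == "__main__":
    print(to_user_config(DemoConfig(ModelSize.MEDIUM, ComputeType.FLOAT16)))
```

Keeping the typed class on the caller's side catches invalid values early, while the flattened dictionary contains only strings and numbers and can be stored or transmitted as JSON.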
Endpoints define the functionality of your application. They allow you to connect multiple deployments (models) to each other and define the input and output of your application.
Each endpoint is defined as a class that inherits from the Endpoint class. The class has two main methods: initialize and run.
For example, you can define an endpoint that transcribes a video like this:
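from aana.api.api_generation import Endpoint
from aana.core.models.video import VideoInput
from aana.deployments.aana_deployment_handle import AanaDeploymentHandle
from aana.deployments.whisper_deployment import WhisperOutput
from aana.integrations.external.yt_dlp import download_video
from aana.processors.remote import run_remote
from aana.processors.video import extract_audio


class TranscribeVideoEndpoint(Endpoint):
    """Transcribe video endpoint."""

    async def initialize(self):
        """Initialize the endpoint."""
        await super().initialize()
        self.asr_handle = await AanaDeploymentHandle.create("asr_deployment")

    async def run(self, video: VideoInput) -> WhisperOutput:
        """Transcribe video."""
        video_obj = await run_remote(download_video)(video_input=video)
        audio = extract_audio(video=video_obj)
        transcription = await self.asr_handle.transcribe(audio=audio)
        return transcription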
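The split between initialize and run can be tried out without Aana SDK or a GPU. The sketch below is a library-free imitation of the pattern, written under the assumption that initialize performs one-time setup before any request is handled and that run is invoked once per request. `FakeAsrHandle`, `DemoEndpoint`, and `serve_requests` are hypothetical names, and the fake handle returns canned text instead of calling a model.

```python
import asyncio


class FakeAsrHandle:
    """Hypothetical stand-in for a deployment handle; returns canned text."""

    async def transcribe(self, audio: str) -> dict:
        return {"text": f"transcript of {audio}"}


class DemoEndpoint:
    """Mimics the initialize/run split of an endpoint without using Aana SDK."""

    def __init__(self) -> None:
        self.asr_handle = None

    async def initialize(self) -> None:
        # One-time setup: acquire the handle that every request will reuse.
        self.asr_handle = FakeAsrHandle()

    async def run(self, audio: str) -> dict:
        # Per-request work: call the model through the stored handle.
        return await self.asr_handle.transcribe(audio=audio)


async def serve_requests(inputs: list[str]) -> list[dict]:
    """Initialize the endpoint once, then handle each input in turn."""
    endpoint = DemoEndpoint()
    await endpoint.initialize()
    return [await endpoint.run(item) for item in inputs]


if __name__ == "__main__":
    print(asyncio.run(serve_requests(["a.wav", "b.wav"])))
```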
AanaSDK is the main class that you use to build your application. It allows you to deploy the deployments and endpoints you defined and start the application.
For example, you can define an application that transcribes a video like this:
from aana.sdk import AanaSDK

aana_app = AanaSDK(name="transcribe_video_app")

aana_app.register_deployment(name="asr_deployment", instance=asr_deployment)
aana_app.register_endpoint(
    name="transcribe_video",
    path="/video/transcribe",
    summary="Transcribe a video",
    endpoint_cls=TranscribeVideoEndpoint,
)

aana_app.connect()  # Connects to the Ray cluster or starts a new one.
aana_app.migrate()  # Runs the migrations to create the database tables.
aana_app.deploy()  # Deploys the application.
All you need to do is define the deployments and endpoints you want to use in your application, and Aana SDK will take care of the rest.
Serve Config Files are the recommended way to deploy and update your applications in production. Aana SDK provides a way to build Serve Config Files for Aana applications. See the Serve Config Files documentation for how to build and deploy applications with them.
You can deploy example applications using Docker. See the documentation on how to run Aana SDK with Docker.
For more information on how to use Aana SDK, see the documentation.
Aana SDK is licensed under the Apache License 2.0. Commercial licensing options are also available.
We welcome contributions from the community to enhance Aana SDK's functionality and usability. Feel free to open issues for bug reports and feature requests, or to submit pull requests with code improvements.
Check out the Development Documentation for more information on how to contribute.
We have adopted the Contributor Covenant as our code of conduct.