
🍱 Build model inference APIs and multi-model serving systems with any open-source or custom AI models. 👉 Join our Slack community!
BentoML is a Python library for building online serving systems optimized for AI apps and model inference.
Install BentoML:
# Requires Python≥3.9
pip install -U bentoml
Define APIs in a service.py file.
import bentoml

@bentoml.service(
    image=bentoml.images.Image(python_version="3.11").python_packages("torch", "transformers"),
)
class Summarization:
    def __init__(self) -> None:
        import torch
        from transformers import pipeline

        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipeline = pipeline('summarization', device=device)

    @bentoml.api(batchable=True)
    def summarize(self, texts: list[str]) -> list[str]:
        results = self.pipeline(texts)
        return [item['summary_text'] for item in results]
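The `batchable=True` flag in the service above relies on `summarize` taking a list and returning a list of the same length in the same order, so that inputs from several requests can be handled by one model call. The following is only a plain-Python sketch of that contract, not BentoML's implementation; `fake_summarize` and `run_batched` are hypothetical names used for illustration.

```python
from typing import Callable


def fake_summarize(texts: list[str]) -> list[str]:
    # Hypothetical stand-in for the model: keep the first three words of each text.
    return [" ".join(text.split()[:3]) for text in texts]


def run_batched(
    requests: list[list[str]],
    fn: Callable[[list[str]], list[str]],
) -> list[list[str]]:
    # Merge every request's inputs into one flat batch.
    merged = [text for request in requests for text in request]
    # One call handles the whole batch; output i must correspond to input i.
    outputs = fn(merged)
    if len(outputs) != len(merged):
        raise ValueError("a batchable function must return one output per input")
    # Split the flat outputs back into per-request slices.
    results, start = [], 0
    for request in requests:
        results.append(outputs[start:start + len(request)])
        start += len(request)
    return results


responses = run_batched(
    [["one two three four"], ["a b c d e", "x y"]],
    fake_summarize,
)
print(responses)
```

If a method returned fewer or more items than it received, the split step could not map outputs back to requests, which is why the list-in, list-out signature matters for a batchable endpoint.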
Install the PyTorch and Transformers packages in your Python virtual environment.
pip install torch transformers # additional dependencies for local run
Run the service code locally (serving at http://localhost:3000 by default):
bentoml serve
You should expect to see the following output.
[INFO] [cli] Starting production HTTP BentoServer from "service:Summarization" listening on http://localhost:3000 (Press CTRL+C to quit)
[INFO] [entry_service:Summarization:1] Service Summarization initialized
Now you can run inference from your browser at http://localhost:3000 or with a Python script:
import bentoml

with bentoml.SyncHTTPClient('http://localhost:3000') as client:
    summarized_text: str = client.summarize([bentoml.__doc__])[0]
    print(f"Result: {summarized_text}")
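The same call can also be made without the BentoML client. The sketch below uses only the standard library and assumes the service exposes a `POST /summarize` route that accepts a JSON object keyed by the method's parameter name (`texts`) and returns a JSON list; confirm the route and schema in the browser UI at http://localhost:3000 before relying on it. `build_summarize_request` and `summarize_over_http` are hypothetical helper names.

```python
import json
import urllib.request


def build_summarize_request(texts, base_url="http://localhost:3000"):
    # Assumption: the route is named after the API method and the JSON body
    # is keyed by the parameter name ("texts").
    body = json.dumps({"texts": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/summarize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_over_http(texts, base_url="http://localhost:3000", timeout=60.0):
    # Sends the request; requires `bentoml serve` to be running.
    request = build_summarize_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server running:
# print(summarize_over_http(["Some long text to summarize."])[0])
```

Separating request construction from sending keeps the payload easy to inspect or log without a running server.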
Run bentoml build to package the necessary code, models, and dependency configs into a Bento, the standardized deployable artifact in BentoML:
bentoml build
Ensure Docker is running. Generate a Docker container image for deployment:
bentoml containerize summarization:latest
Run the generated image:
docker run --rm -p 3000:3000 summarization:latest
BentoCloud provides compute infrastructure for rapid and reliable GenAI adoption. It speeds up your BentoML development by leveraging cloud compute resources, and it simplifies how you deploy, scale, and operate BentoML in production.
Sign up for BentoCloud for personal access; for enterprise use cases, contact our team.
# After signup, run the following command to create an API token:
bentoml cloud login
# Deploy from current directory:
bentoml deploy
For detailed explanations, read the Hello World example.
Check out the full list for more sample code and usage.
See Documentation for more tutorials and guides.
Get involved and join our Community Slack 💬, where thousands of AI/ML engineers help each other, contribute to the project, and talk about building AI products.
To report a bug or suggest a feature request, use GitHub Issues.
There are many ways to contribute to the project, including sharing feedback in the #bentoml-contributors channel on Slack.

Thanks to all of our amazing contributors!
The BentoML framework collects anonymous usage data that helps our community improve the product. Only BentoML's internal API calls are reported. This excludes any sensitive information, such as user code, model data, model names, or stack traces. Here's the code used for usage tracking. You can opt out of usage tracking with the --do-not-track CLI option:
bentoml [command] --do-not-track
Or by setting the environment variable:
export BENTOML_DO_NOT_TRACK=True
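For scripts or notebooks where exporting a shell variable is inconvenient, the same variable can be set from Python. This is a minimal sketch that assumes BentoML reads `BENTOML_DO_NOT_TRACK` from the process environment, so the variable is set before `bentoml` is imported to be safe.

```python
import os

# Set the opt-out flag before importing bentoml (assumed to be read from the
# process environment, matching the `export` form of the same setting).
os.environ["BENTOML_DO_NOT_TRACK"] = "True"

# import bentoml  # import only after the variable is set
```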