ElevenLabs Haystack Integration
This package integrates ElevenLabs' Text-to-Speech API with Haystack pipelines. It converts text to speech through the ElevenLabs API and can optionally save the generated audio to AWS S3.
Installation
pip install elevenlabs_haystack
Setting Up API Keys
ElevenLabs API Key
To access the ElevenLabs API, you need to create an account and obtain an API key.
- Go to the ElevenLabs website and sign up for an account.
- Once logged in, navigate to the Profile section.
- In the API section, generate a new API key.
- Copy the API key.
AWS Credentials
To store generated audio files on AWS S3, you need AWS credentials (Access Key ID and Secret Access Key) and the region of your bucket.
- If you don’t have an AWS account, sign up at AWS.
- Create a new IAM user and assign the necessary permissions to allow the user to upload files to S3. The AmazonS3FullAccess policy is sufficient for this example.
- Once the IAM user is created, download or note the AWS Access Key ID and Secret Access Key.
- Identify the AWS Region where your S3 bucket resides (e.g., us-east-1). This information can be found in the AWS Management Console.
- Finally, create or identify the S3 bucket where the generated audio files will be saved.
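AmazonS3FullAccess grants far more than an upload requires. As a rough sketch, a narrower policy document could be generated as shown here. The helper name build_upload_policy, the bucket name, and the prefix are placeholders, and the assumption that the component only needs s3:PutObject (upload) and s3:GetObject (to sign a download URL) has not been checked against the package's source.

```python
import json


def build_upload_policy(bucket_name: str, prefix: str = "") -> dict:
    """Return an IAM policy document limited to one bucket and an optional key prefix.

    Hypothetical helper for illustration; it is not part of elevenlabs_haystack.
    """
    cleaned = prefix.strip("/")
    key_pattern = f"{cleaned}/*" if cleaned else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed minimum: upload objects and read them back via a presigned URL.
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/{key_pattern}",
            }
        ],
    }


# Print the JSON so it can be pasted into the IAM console as an inline policy.
print(json.dumps(build_upload_policy("your_s3_bucket_name", "s3_files"), indent=2))
```

If uploads fail with an access error under a policy like this, falling back to AmazonS3FullAccess for the example is the simplest way to confirm that permissions are the cause.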
Create a .env file in the root directory with the following content (replace with your actual credentials):
ELEVENLABS_API_KEY=sk_your_elevenlabs_api_key_here
AWS_ACCESS_KEY_ID=your_aws_access_key_id
AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key
AWS_REGION_NAME=us-east-1
AWS_S3_BUCKET_NAME=your_s3_bucket_name
These variables will be automatically loaded using dotenv and used to access ElevenLabs and AWS services securely.
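Before running the examples, it can help to confirm that all five variables are actually visible to the Python process. The following is a minimal sketch using only the standard library; the helper name missing_env_vars is made up for illustration. If your script loads the .env file itself (for example with python-dotenv's load_dotenv()), run the check after that call.

```python
import os

# Variable names taken from the .env template above.
REQUIRED_VARS = [
    "ELEVENLABS_API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION_NAME",
    "AWS_S3_BUCKET_NAME",
]


def missing_env_vars(env=None):
    """Return the names from REQUIRED_VARS that are unset or empty.

    Hypothetical helper; `env` defaults to os.environ and can be any mapping.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```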
Usage
Basic Text-to-Speech Example
This example shows how to use the ElevenLabsTextToSpeech component to convert text to speech and save the generated audio file locally or in an AWS S3 bucket. It uses environment variables to access sensitive credentials.
from haystack.utils import Secret
from elevenlabs_haystack import ElevenLabsTextToSpeech
tts = ElevenLabsTextToSpeech(
    elevenlabs_api_key=Secret.from_env_var("ELEVENLABS_API_KEY"),
    output_folder="audio_files",
    voice_id="Xb7hH8MSUJpSbSDYk0k2",
    aws_s3_bucket_name=Secret.from_env_var("AWS_S3_BUCKET_NAME"),
    aws_s3_output_folder="s3_files",
    aws_access_key_id=Secret.from_env_var("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=Secret.from_env_var("AWS_SECRET_ACCESS_KEY"),
    aws_region_name=Secret.from_env_var("AWS_REGION_NAME"),
    voice_settings={
        "stability": 0.75,
        "similarity_boost": 0.75,
        "style": 0.5,
        "use_speaker_boost": True,
    },
)
result = tts.run("Hello, world!")
print(result)
"""
{
    "id": "elevenlabs-id",
    "file_name": "audio_files/elevenlabs-id.mp3",
    "s3_file_name": "s3_files/elevenlabs-id.mp3",
    "s3_bucket_name": "test-bucket",
    "s3_presigned_url": "https://test-bucket.s3.amazonaws.com/s3_files/elevenlabs-id.mp3"
}
"""
Example Using Haystack Pipeline
This example demonstrates how to integrate the ElevenLabsTextToSpeech component into a Haystack pipeline. Additionally, we define a WelcomeTextGenerator component that generates a personalized welcome message.
from haystack import component, Pipeline
from haystack.utils import Secret
from elevenlabs_haystack import ElevenLabsTextToSpeech
@component
class WelcomeTextGenerator:
    """
    A component generating a personal welcome message and making it upper case.
    """

    @component.output_types(welcome_text=str, note=str)
    def run(self, name: str):
        return {
            "welcome_text": f'Hello {name}, welcome to Haystack!'.upper(),
            "note": "welcome message is ready"
        }
text_pipeline = Pipeline()
text_pipeline.add_component(
    name="welcome_text_generator",
    instance=WelcomeTextGenerator()
)
text_pipeline.add_component(
    name="tts",
    instance=ElevenLabsTextToSpeech(
        elevenlabs_api_key=Secret.from_env_var("ELEVENLABS_API_KEY"),
        output_folder="audio_files",
        voice_id="Xb7hH8MSUJpSbSDYk0k2",
        aws_s3_bucket_name=Secret.from_env_var("AWS_S3_BUCKET_NAME"),
        aws_s3_output_folder="s3_files",
        aws_access_key_id=Secret.from_env_var("AWS_ACCESS_KEY_ID"),
        aws_secret_access_key=Secret.from_env_var("AWS_SECRET_ACCESS_KEY"),
        aws_region_name=Secret.from_env_var("AWS_REGION_NAME"),
        voice_settings={
            "stability": 0.75,
            "similarity_boost": 0.75,
            "style": 0.5,
            "use_speaker_boost": True,
        },
    ),
)
text_pipeline.connect(sender="welcome_text_generator.welcome_text", receiver="tts")
result = text_pipeline.run({"welcome_text_generator": {"name": "Bilge"}})
print(result)
"""
{
    "id": "elevenlabs-id",
    "file_name": "audio_files/elevenlabs-id.mp3",
    "s3_file_name": "s3_files/elevenlabs-id.mp3",
    "s3_bucket_name": "test-bucket",
    "s3_presigned_url": "https://test-bucket.s3.amazonaws.com/s3_files/elevenlabs-id.mp3"
}
"""
License
This project is licensed under the MIT License.