
GPTSpeak is a tool designed to convert text into high-quality audio files using OpenAI's Text-to-Speech (TTS) API. It can be used both as a command-line interface (CLI) tool and as a Python module. GPTSpeak provides functionality to play the generated audio files directly and concatenate multiple audio files, making it an ideal solution for developers, writers, educators, and content creators who need to transform written content into spoken audio efficiently.
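With GPTSpeak, you can convert direct text input or text files to audio, play audio files, concatenate multiple audio files, and configure a default model and voice.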
It's as simple as:
gptspeak "Hello, world!"
Install GPTSpeak using pip:
pip install gptspeak
Ensure you have the following installed:
ffmpeg is required for audio processing. You can check whether it is already installed by running ffmpeg -version in your terminal or command prompt.
macOS (with Homebrew):
brew install ffmpeg
Ubuntu/Debian:
sudo apt-get install ffmpeg
Windows:
Add the bin/ directory from the extracted ffmpeg folder to your system PATH, then run ffmpeg -version to verify the installation.
After installing ffmpeg, you should be ready to use GPTSpeak.
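The ffmpeg check described above can also be done from Python before any conversion starts. This is a minimal sketch using only the standard library; the helper name ffmpeg_available is hypothetical and is not part of GPTSpeak.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on the system PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found on PATH")
    else:
        print("ffmpeg not found; install it before using GPTSpeak")
```

Running this early gives a clearer message than an audio-processing error later in a script.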
Here's how you can quickly get started with GPTSpeak:
# Set your API key
export OPENAI_API_KEY="your-openai-api-key"
# Convert direct text input to audio
gptspeak "Hello, world!" -o hello.mp3
# Play direct text input without saving to a file (turn your audio up)
gptspeak "Hello, world!"
# Convert a text file to audio
gptspeak convert input.txt -o output.mp3
# Play the generated audio file
gptspeak play output.mp3
# Concatenate multiple audio files
gptspeak concat -o combined.mp3 file1.mp3 file2.mp3
Before using GPTSpeak, set up the API key for OpenAI by setting the appropriate environment variable:
export OPENAI_API_KEY="your-openai-api-key"
You can set this variable in your shell profile (~/.bashrc, ~/.zshrc, etc.) or include it in your Python script before importing GPTSpeak.
Note: Keep your API key secure and do not expose it in code repositories.
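One way to keep the key out of source files is to read it from the environment and fail early when it is missing. The sketch below assumes only the OPENAI_API_KEY variable named above; the function require_api_key is a hypothetical helper, not a GPTSpeak API.

```python
import os


def require_api_key(env=None) -> str:
    """Return the OpenAI API key from the environment or raise a clear error."""
    # Allow a mapping to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running GPTSpeak"
        )
    return key
```

Calling require_api_key() at the top of a script surfaces a missing key before any request is made.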
Here's an example of how to use GPTSpeak in your Python code:
import os
from pathlib import Path
from gptspeak.core.converter import convert_text_to_speech, convert_text_to_speech_direct
from gptspeak.core.player import play_audio
# Set the OpenAI API key
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
# Convert text file to speech
input_file = Path("input.txt")
output_file = Path("output.mp3")
model = "tts-1"
voice = "alloy"
convert_text_to_speech(input_file, output_file, model, voice)
print(f"Audio file created: {output_file}")
# Convert direct text input to speech
text = "Hello, world!"
direct_output_file = Path("hello.mp3")
convert_text_to_speech_direct(text, direct_output_file, model, voice)
print(f"Audio file created: {direct_output_file}")
# Play the generated audio
play_audio(output_file)
GPTSpeak allows you to configure default settings for the TTS model and voice. You can do this interactively or by specifying the options directly:
# Interactive configuration
gptspeak configure
# Set default model and voice directly
gptspeak configure -m tts-1-hd -v nova
The configuration is saved in ~/.gptspeak.ini and will be used for future conversions unless overridden by command-line options.
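The precedence described here (command-line option first, then saved configuration, then built-in default) can be sketched as follows. The layout of ~/.gptspeak.ini is not documented in this README, so the section name gptspeak and the keys model and voice are assumptions made for illustration, as is the function resolve_settings.

```python
import configparser

# Built-in defaults, matching the documented CLI defaults (tts-1, alloy).
FALLBACK = {"model": "tts-1", "voice": "alloy"}


def resolve_settings(ini_text: str, cli_model=None, cli_voice=None):
    """Resolve (model, voice): CLI option, then saved config, then default."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Assumed layout: a [gptspeak] section with "model" and "voice" keys.
    section = parser["gptspeak"] if parser.has_section("gptspeak") else {}
    model = cli_model or section.get("model", FALLBACK["model"])
    voice = cli_voice or section.get("voice", FALLBACK["voice"])
    return model, voice


if __name__ == "__main__":
    saved = "[gptspeak]\nmodel = tts-1-hd\nvoice = nova\n"
    print(resolve_settings(saved, cli_voice="echo"))
```

To try it against a real file, pass the result of Path.home().joinpath(".gptspeak.ini").read_text() as ini_text, and adjust the section and key names to whatever the file actually contains.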
When using the command-line interface, ensure you've set the appropriate environment variable for the OpenAI API key.
To convert direct text input to audio:
gptspeak "Hello, world!" -o hello.mp3 -m tts-1 -v alloy
To convert a text file to audio:
gptspeak convert input.txt -o output.mp3 -m tts-1 -v alloy
To concatenate multiple audio files:
gptspeak concat -o combined.mp3 file1.mp3 file2.mp3 file3.mp3
To play an audio file:
gptspeak play output.mp3
For direct text input and the convert command:
-o, --output: Output audio file path (default: speech.mp3)
-m, --model: TTS model to use (default: tts-1)
-v, --voice: Voice to use for speech (default: alloy)
For the play command:
For the concat command:
-o, --output: Output audio file path (default: concatenated.mp3)
For the configure command:
-m, --model: Set the default TTS model
-v, --voice: Set the default voice
GPTSpeak supports the following models and voices from OpenAI's TTS API:
Models:
tts-1: Standard TTS model
tts-1-hd: High-definition TTS model
Voices:
alloy: Neutral voice
echo: Soft and gentle voice
fable: Expressive and dynamic voice
onyx: Deep and authoritative voice
nova: Warm and friendly voice
shimmer: Clear and energetic voice
To use a specific model and voice, specify them in the command:
gptspeak "Hello, world!" -o hello.mp3 -m tts-1-hd -v nova
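When model and voice names come from user input or a config file, it can help to validate them against the lists above before calling the converter. This is a small sketch; validate_choice is a hypothetical helper, and the sets simply mirror the models and voices named in this README.

```python
# Models and voices as listed in this README.
MODELS = {"tts-1", "tts-1-hd"}
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


def validate_choice(model: str, voice: str) -> None:
    """Raise ValueError if the model or voice is not one of the listed names."""
    if model not in MODELS:
        raise ValueError(f"Unknown model {model!r}; expected one of {sorted(MODELS)}")
    if voice not in VOICES:
        raise ValueError(f"Unknown voice {voice!r}; expected one of {sorted(VOICES)}")


if __name__ == "__main__":
    validate_choice("tts-1-hd", "nova")  # passes silently
    print("model and voice are valid")
```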
gptspeak "Welcome to GPTSpeak!" -o welcome.mp3 -v fable
This command will convert the text "Welcome to GPTSpeak!" into an audio file named welcome.mp3 using the "fable" voice.
gptspeak convert story.txt -o narration.mp3 -v echo
This command will convert the content of story.txt into an audio file named narration.mp3 using the "echo" voice.
gptspeak convert long_story.txt -o long_narration.mp3 -v nova
This command will convert the content of long_story.txt into an audio file named long_narration.mp3 using the "nova" voice. If the text is longer than the API's character limit, GPTSpeak will automatically split it into chunks, process each chunk separately, and combine the results into a single audio file.
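The README does not say how GPTSpeak performs this split, so the following is only one possible approach, not the library's implementation. It breaks text at sentence boundaries and hard-splits any single sentence that is too long. The function name split_text is hypothetical, and the 4096-character default is an assumption about the API limit that should be checked against OpenAI's current documentation.

```python
import re


def split_text(text: str, max_chars: int = 4096) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split a single sentence that exceeds the limit on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk and start anew.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_text("One. Two. Three.", max_chars=10):
        print(repr(chunk))
```

Each chunk could then be converted separately and the resulting files joined with gptspeak concat.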
gptspeak concat -o full_audiobook.mp3 chapter1.mp3 chapter2.mp3 chapter3.mp3
This command will combine chapter1.mp3, chapter2.mp3, and chapter3.mp3 into a single audio file named full_audiobook.mp3.
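For a folder with many chapters, the same concat command can be assembled from a script. This sketch builds the argument list shown above and orders files by chapter number so that chapter10.mp3 comes after chapter2.mp3. The helper names and the chapter*.mp3 naming pattern are assumptions for illustration; only the gptspeak concat -o invocation comes from this README.

```python
import re
import subprocess
from pathlib import Path


def chapter_number(path) -> int:
    """Extract the first number in the file name, or 0 if there is none."""
    match = re.search(r"(\d+)", Path(path).stem)
    return int(match.group(1)) if match else 0


def build_concat_command(parts, output) -> list:
    """Build the argument list for: gptspeak concat -o OUTPUT PART..."""
    return ["gptspeak", "concat", "-o", str(output), *[str(p) for p in parts]]


def concat_chapters(directory, output="full_audiobook.mp3") -> None:
    """Concatenate every chapter*.mp3 in a directory, in numeric order."""
    parts = sorted(Path(directory).glob("chapter*.mp3"), key=chapter_number)
    if not parts:
        raise FileNotFoundError(f"No chapter*.mp3 files found in {directory}")
    # check=True raises CalledProcessError if gptspeak exits with an error.
    subprocess.run(build_concat_command(parts, output), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.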
gptspeak play narration.mp3
This command will play the narration.mp3 file.
Contributions to GPTSpeak are welcome! If you'd like to contribute, please ensure that your code adheres to the existing style conventions and passes all tests.
GPTSpeak is licensed under the Apache-2.0 License. See LICENSE for more information.