Convert any provided text input into voice using OpenAI or Google text-to-speech models
AI-Read-It is a Node.js module that utilizes text-to-speech models, including OpenAI's, to convert text input into natural-sounding voice. This module is designed for easy integration into projects requiring text-to-speech functionality and supports multiple providers.
To use AI-Read-It in your project, follow these simple steps:
Install Node.js on your machine if you haven't already.
Install the module using npm:
npm install ai-read-it
The library offers two interfaces: functional and object-oriented.
In both cases, start by configuring the library with your API key and, optionally, a provider (the default is OpenAI).
The functional interface works as a singleton. It is suitable when your application connects to a single provider with a single API key. This covers most use cases; a simple example where it does not work is an application that lets its clients supply their own API keys and/or choose providers.
The functional interface is simpler to use: import the library, initialize it, and call one of the provided functions to convert your text to speech. The following example uses the smallTextToSpeech function:
const aiReadIt = require('ai-read-it');
const textToConvert = "Hello, world! This is AI-Read-It in action.";
// Initialize with provider name (OpenAI is the default provider)
aiReadIt.init(process.env.API_KEY); // or pass "Google" as the second argument for Google Cloud Text-to-Speech
aiReadIt.smallTextToSpeech(textToConvert)
  .then(audioBuffer => {
    // Handle the audio buffer (e.g., play it or save it to a file)
  })
  .catch(error => {
    console.error("Error:", error);
  });
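The "save it to a file" step mentioned in the comment above could look roughly like the sketch below. It assumes the promise resolves with a Node.js Buffer, as the API signatures in this README state. `saveAudio` is a hypothetical helper, not part of ai-read-it, and the demo buffer stands in for real audio so the snippet runs without an API key.

```javascript
const fs = require('node:fs/promises');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper (not part of ai-read-it): write an audio Buffer to disk.
async function saveAudio(audioBuffer, filePath) {
  if (!Buffer.isBuffer(audioBuffer)) {
    throw new TypeError('Expected a Buffer with audio data');
  }
  await fs.writeFile(filePath, audioBuffer);
  return filePath;
}

// Stand-in for the Buffer that smallTextToSpeech() resolves with.
const demoBuffer = Buffer.from('demo bytes, not real audio');
const demoPath = path.join(os.tmpdir(), 'ai-read-it-demo.mp3');

saveAudio(demoBuffer, demoPath)
  .then(savedPath => console.log('Saved audio to', savedPath))
  .catch(error => console.error('Error:', error));
```

In the example above, the helper would be called inside the `.then` handler, for instance `saveAudio(audioBuffer, 'speech.mp3')`.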
The object-oriented interface lets you create multiple instances of the AiReadIt class in your application.
The following example creates a Google provider and uses mediumTextToSpeech to handle a larger text:
const { AiReadIt, createProvider } = require('ai-read-it');
const textToConvert = "Hello, world! This is AI-Read-It in action. ".repeat(150);
// Create an instance backed by the Google provider
const aiReadIt = new AiReadIt(createProvider("Google", process.env.GOOGLE_APPLICATION_CREDENTIALS_JSON)); // or "OpenAI" for OpenAI Text-to-Speech
aiReadIt.mediumTextToSpeech(textToConvert)
  .then(audioBuffer => {
    // Handle the audio buffer (e.g., play it or save it to a file)
  })
  .catch(error => {
    console.error("Error:", error);
  });
You can create a provider with the createProvider factory function. This approach works unless you need to pass custom configuration options to the provider (on top of the API key). In that case, consider importing the providers directly from ai-read-it/providers/***Provider.
:warning: Note the different import syntax in the examples above. With the functional interface, the whole library is referenced through the aiReadIt variable, whereas with the object-oriented interface we import the AiReadIt class directly, along with the createProvider factory function.
// Functional interface
const aiReadIt = require('ai-read-it');
// Object-oriented interface
const { AiReadIt, createProvider } = require('ai-read-it');
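For the case mentioned earlier, where clients of your application supply their own API keys or choose providers, one possible arrangement is to keep one instance per client configuration. The sketch below is only an illustration: `createInstanceCache` is a hypothetical helper, and the factory passed to it is a stand-in so the code runs without the library or real credentials.

```javascript
// Hypothetical helper (not part of ai-read-it): cache one instance per
// (provider, apiKey) pair so each client reuses its own configured object.
function createInstanceCache(factory) {
  const cache = new Map();
  return function getInstance(provider, apiKey) {
    const key = JSON.stringify([provider, apiKey]);
    if (!cache.has(key)) {
      cache.set(key, factory(provider, apiKey));
    }
    return cache.get(key);
  };
}

// Stand-in factory so the sketch runs on its own.
// With the library, this might instead be:
//   (provider, apiKey) => new AiReadIt(createProvider(provider, apiKey))
const getInstance = createInstanceCache((provider, apiKey) => ({ provider, apiKey }));

const clientA = getInstance('OpenAI', 'key-of-client-a');
const clientB = getInstance('Google', 'credentials-of-client-b');
console.log(clientA.provider, clientB.provider);
```

Repeated calls with the same provider and key return the same object, so an instance is constructed only once per client.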
Besides the smallTextToSpeech and mediumTextToSpeech functions/methods, both interfaces provide a largeTextToSpeech function that converts larger texts in streaming mode: the text is processed in chunks, so the first audio arrives sooner instead of waiting until the whole text is processed. See the API details below.
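Since largeTextToSpeech returns an async generator, it can be consumed with `for await`. The sketch below shows that pattern with a stand-in generator, so it runs without the library. It assumes each yielded value is a Buffer of audio, which this README does not state explicitly. `fakeLargeTextToSpeech` and `consumeAudioStream` are hypothetical names.

```javascript
// Stand-in for largeTextToSpeech(). ASSUMPTION: each yielded value is a
// Buffer holding the audio for one text chunk.
async function* fakeLargeTextToSpeech(text, options = {}) {
  const chunkSize = options.chunkSize ?? 4096;
  for (let i = 0; i < text.length; i += chunkSize) {
    yield Buffer.from(text.slice(i, i + chunkSize)); // pretend this is audio
  }
}

// Hand each chunk to a callback as soon as it arrives and count the bytes.
async function consumeAudioStream(generator, onChunk) {
  let totalBytes = 0;
  for await (const chunk of generator) {
    onChunk(chunk);
    totalBytes += chunk.length;
  }
  return totalBytes;
}

const received = [];
consumeAudioStream(
  fakeLargeTextToSpeech('Hello, world! '.repeat(10), { chunkSize: 32 }),
  chunk => received.push(chunk)
)
  .then(total => console.log(`Received ${received.length} chunks, ${total} bytes`))
  .catch(error => console.error('Error:', error));
```

In a real application the callback might write each chunk to a file stream or an HTTP response, so playback can begin before the full text is converted.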
A CLI (command-line interface) tool for text-to-speech conversion is included. It reads text from standard input, processes it, and writes the converted audio to standard output. Specify the provider with the --provider or -p flag (the default provider is OpenAI).
cat text-to-read.txt | ./bin/ai-read-it-cli.js --provider OpenAI > tts-audio.mp3
smallTextToSpeech(text: string, options = {}): Promise<Buffer>
mediumTextToSpeech(text: string, options = {}): Promise<Buffer>
largeTextToSpeech(text: string, options = {}): AsyncGenerator
For all of these functions, options is an object with the following properties:
options.model - The model to use for the conversion (default: 'tts-1'): tts-1, tts-1-hd
options.voice - The voice to use for the conversion (default: 'fable'): alloy, echo, fable, onyx, nova, shimmer
options.response_format - The format of the response audio (default: 'mp3'): mp3, opus, aac, flac
options.speed - The speed of the speech (default: 1.0): 0.25 .. 4.0
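The defaults and allowed values listed above can be checked before a request is made. The sketch below merges caller overrides with those defaults and rejects values outside the listed ranges. `buildOptions` is a hypothetical helper, not part of ai-read-it, and whether the library performs similar validation itself is not stated here.

```javascript
// Defaults and allowed values as listed in this README.
const DEFAULTS = { model: 'tts-1', voice: 'fable', response_format: 'mp3', speed: 1.0 };
const ALLOWED = {
  model: ['tts-1', 'tts-1-hd'],
  voice: ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'],
  response_format: ['mp3', 'opus', 'aac', 'flac'],
};

// Hypothetical helper: merge overrides with defaults and validate the result.
function buildOptions(overrides = {}) {
  const options = { ...DEFAULTS, ...overrides };
  for (const [key, allowed] of Object.entries(ALLOWED)) {
    if (!allowed.includes(options[key])) {
      throw new RangeError(`Unsupported ${key}: ${options[key]}`);
    }
  }
  if (typeof options.speed !== 'number' || options.speed < 0.25 || options.speed > 4.0) {
    throw new RangeError(`speed must be between 0.25 and 4.0, got ${options.speed}`);
  }
  return options;
}

const ttsOptions = buildOptions({ voice: 'nova', speed: 1.25 });
console.log(ttsOptions);
```

The resulting object could then be passed as the second argument, for example `aiReadIt.smallTextToSpeech(text, ttsOptions)`.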
largeTextToSpeech() additionally supports options.chunkSize, an integer from 1 to 4096 that sets the maximum number of characters in each text chunk processed. Note that smaller chunks lead to quicker initial responses but increase the number of requests sent per minute to the OpenAI API.
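To get a feel for the trade-off between chunk size and request count, the sketch below splits text into chunks of at most `chunkSize` characters, preferring to cut after a space. This is an illustrative algorithm only; the way ai-read-it actually splits text is not described here, and `splitIntoChunks` is a hypothetical name.

```javascript
// Hypothetical splitter: chunks of at most chunkSize characters,
// cut after the last space in the window when one exists.
function splitIntoChunks(text, chunkSize = 4096) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1 || chunkSize > 4096) {
    throw new RangeError('chunkSize must be an integer from 1 to 4096');
  }
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + chunkSize, text.length);
    if (end < text.length) {
      // Look backwards from the end of the window for a space to cut after.
      const lastSpace = text.lastIndexOf(' ', end - 1);
      if (lastSpace >= start) end = lastSpace + 1;
    }
    chunks.push(text.slice(start, end));
    start = end;
  }
  return chunks;
}

const sample = 'Hello, world! This is AI-Read-It in action. '.repeat(150);
for (const size of [256, 1024, 4096]) {
  // One chunk corresponds to one request in this simplified model.
  console.log(`chunkSize=${size}: ${splitIntoChunks(sample, size).length} chunks`);
}
```

Smaller values produce more chunks, which in this simplified model means more requests for the same text.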
Check out the main.js file in the project repository for a simple example of using AI-Read-It.
You can run the example:
OPENAI_API_KEY="___PUT_YOUR_OPENAI_API_KEY_HERE___" node main.js --provider OpenAI
Alternatively, save your key into a .env file:
OPENAI_API_KEY="___PUT_YOUR_OPENAI_API_KEY_HERE___"
Then get the key from the file and export it:
source .env
export OPENAI_API_KEY
Run the application:
node main.js --provider OpenAI
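If the variable was not exported, the key will be undefined inside the script. A small guard such as the one sketched below can fail early with a clear message. `requireEnv` is a hypothetical helper, not something the library or main.js is known to include, and the demo uses an explicit object in place of the real `process.env`.

```javascript
// Hypothetical helper: read a required variable and fail early if it is absent.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Environment variable ${name} is not set`);
  }
  return value;
}

// Demo with an explicit object instead of the real process.env.
const demoEnv = { OPENAI_API_KEY: 'sk-demo-not-a-real-key' };
const apiKey = requireEnv('OPENAI_API_KEY', demoEnv);
console.log('Key loaded, length:', apiKey.length);
```

In an application, the returned value would be what gets passed to the library's initialization call.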
AI-Read-It currently supports two text-to-speech providers:
Google's Text-to-Speech service converts text into natural-sounding speech using advanced deep learning techniques. It offers a wide range of voices and languages to choose from, allowing for highly customizable speech synthesis.
Configuration Options: Google Text-to-Speech supports various configuration options, including voice selection, speaking rate, and pitch adjustment.
Voices: A diverse set of voices across languages and dialects, including WaveNet voices for natural-sounding speech.
More Details: For a comprehensive overview of the supported configuration options and voices, please visit the Google Cloud Text-to-Speech Documentation.
OpenAI's text-to-speech capabilities are designed to generate human-like speech from text inputs. It provides options to customize the voice, speed, and other aspects of speech synthesis to fit various applications.
Configuration Options: OpenAI allows customization of the voice model, speaking speed, and response format, among others.
Voices: OpenAI offers a selection of voices for different styles and use cases, ensuring versatility in speech generation.
More Details: For detailed information on the configuration options and available voices, please refer to the OpenAI API Documentation.
If you encounter any issues or have suggestions for improvements, please open an issue. Contributions are also welcome!
This project is licensed under the MIT License.
The npm package ai-read-it receives a total of 6 weekly downloads. As such, ai-read-it popularity was classified as not popular.
We found that ai-read-it demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.