IMG.LY AI Audio Generation for Web
A plugin for integrating AI audio generation capabilities into CreativeEditor SDK.
Overview
The @imgly/plugin-ai-audio-generation-web package lets users generate audio content with AI directly within CreativeEditor SDK. The bundled providers use the ElevenLabs platform for text-to-speech and sound effect generation.
Features include:
- Text-to-speech generation with multiple voices
- Sound effect generation from text descriptions
- Voice selection interface
- Speed adjustment
- Automatic history tracking
- Seamless integration with CreativeEditor SDK
Installation
npm install @imgly/plugin-ai-audio-generation-web
Usage
Basic Configuration
To use the plugin, import it and configure it with your preferred providers:
import CreativeEditorSDK from '@cesdk/cesdk-js';
import AudioGeneration from '@imgly/plugin-ai-audio-generation-web';
import Elevenlabs from '@imgly/plugin-ai-audio-generation-web/elevenlabs';

CreativeEditorSDK.create(domElement, {
  license: 'your-license-key'
}).then(async (cesdk) => {
  cesdk.addPlugin(
    AudioGeneration({
      text2speech: Elevenlabs.ElevenMultilingualV2({
        proxyUrl: 'https://your-elevenlabs-proxy.example.com'
      }),
      text2sound: Elevenlabs.ElevenSoundEffects({
        proxyUrl: 'https://your-elevenlabs-proxy.example.com'
      }),
      debug: false,
      dryRun: false
    })
  );
});
Providers
The plugin comes with two pre-configured providers for ElevenLabs:
1. ElevenMultilingualV2 (Text-to-Speech)
A versatile text-to-speech engine that supports multiple languages and voices:
text2speech: Elevenlabs.ElevenMultilingualV2({
  proxyUrl: 'https://your-elevenlabs-proxy.example.com'
})
Key features:
- Multiple voice options
- Multilingual support
- Adjustable speaking speed
- Natural-sounding speech
2. ElevenSoundEffects (Text-to-Sound)
A sound effect generator that creates audio from text descriptions:
text2sound: Elevenlabs.ElevenSoundEffects({
  proxyUrl: 'https://your-elevenlabs-proxy.example.com'
})
Key features:
- Generate sound effects from text descriptions
- Create ambient sounds, effects, and music
- Seamless integration with CreativeEditor SDK
- Automatic thumbnails and duration detection
Configuration Options
The plugin accepts the following configuration options:
Option | Type | Description | Default |
text2speech | Provider | Provider for text-to-speech generation | undefined |
text2sound | Provider | Provider for sound effect generation | undefined |
debug | boolean | Enable debug logging | false |
dryRun | boolean | Simulate generation without API calls | false |
middleware | Function | Custom middleware for the generation process | undefined |
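The table does not spell out what a middleware function looks like. As a rough sketch only, assuming a conventional wrapping pattern in which the middleware receives the generation input and a next callback, a logging and validation middleware could be written as below. The AudioInput, AudioOutput, and GenerationMiddleware shapes here are hypothetical stand-ins for illustration, not the plugin's actual types; check the package's type definitions for the real signature.

```typescript
// Hypothetical shapes for illustration only; the plugin's real types may differ.
type AudioInput = { prompt: string };
type AudioOutput = { url: string };
type Next = (input: AudioInput) => Promise<AudioOutput>;
type GenerationMiddleware = (input: AudioInput, next: Next) => Promise<AudioOutput>;

// Builds a middleware that rejects overly long prompts and logs how long
// the wrapped generation step took.
function createLoggingMiddleware(
  log: (message: string) => void,
  maxPromptLength = 500
): GenerationMiddleware {
  return async (input, next) => {
    if (input.prompt.length > maxPromptLength) {
      throw new Error(`Prompt exceeds ${maxPromptLength} characters`);
    }
    const startedAt = Date.now();
    // Hand over to the next step (ultimately the provider's generation call).
    const output = await next(input);
    log(`Generated audio in ${Date.now() - startedAt} ms`);
    return output;
  };
}
```

Under that assumption, the result of createLoggingMiddleware(console.log) would be what gets passed as the middleware option.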
Using a Proxy
To keep your ElevenLabs API key out of client-side code, route API requests to ElevenLabs through a proxy server. Each provider requires a proxyUrl in its configuration:
text2speech: Elevenlabs.ElevenMultilingualV2({
  proxyUrl: 'https://your-elevenlabs-proxy.example.com'
})
You'll need to implement a proxy server that forwards requests to ElevenLabs and handles authentication.
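The core of such a proxy can be sketched as a pure function that maps an incoming request to the request sent upstream. This is a minimal illustration, not a complete server: IncomingRequest, UpstreamRequest, and toUpstreamRequest are hypothetical names, and the sketch assumes the plugin appends the regular ElevenLabs API path to the proxy URL. The upstream base URL and the xi-api-key header are the ones ElevenLabs uses for its public API.

```typescript
// Hypothetical request shapes for illustration; adapt them to your server framework.
interface IncomingRequest {
  method: string;
  path: string; // e.g. '/v1/sound-generation'
  headers: Record<string, string>;
  body?: string;
}

interface UpstreamRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body?: string;
}

const ELEVENLABS_BASE_URL = 'https://api.elevenlabs.io';

// Rewrites a request received by the proxy into the request sent to ElevenLabs.
function toUpstreamRequest(req: IncomingRequest, apiKey: string): UpstreamRequest {
  const headers: Record<string, string> = {};
  for (const name of Object.keys(req.headers)) {
    const lower = name.toLowerCase();
    // Drop the proxy's own host header and any credentials sent by the browser.
    if (lower === 'host' || lower === 'cookie' || lower === 'xi-api-key') continue;
    headers[lower] = req.headers[name];
  }
  // The secret key is attached on the server and never reaches the client.
  headers['xi-api-key'] = apiKey;
  return {
    url: ELEVENLABS_BASE_URL + req.path,
    method: req.method,
    headers,
    body: req.body
  };
}
```

A real deployment would read the key from server-side configuration, send the resulting request with an HTTP client, and stream the response back to the browser, ideally with CORS restrictions and rate limiting in front of it.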
API Reference
Main Plugin
AudioGeneration(options: PluginConfiguration): EditorPlugin
Creates and returns a plugin that can be added to CreativeEditor SDK.
Plugin Configuration
interface PluginConfiguration {
  text2speech?: AiAudioProvider;
  text2sound?: AiAudioProvider;
  debug?: boolean;
  dryRun?: boolean;
  middleware?: GenerationMiddleware;
}
ElevenLabs Providers
ElevenMultilingualV2
Elevenlabs.ElevenMultilingualV2(config: {
  proxyUrl: string;
  debug?: boolean;
}): AiAudioProvider
ElevenSoundEffects
Elevenlabs.ElevenSoundEffects(config: {
  proxyUrl: string;
  debug?: boolean;
}): AiAudioProvider
UI Integration
The plugin automatically registers the following UI components:
- Speech Generation Panel: A sidebar panel for text-to-speech generation
- Sound Generation Panel: A sidebar panel for generating sound effects
- Voice Selection Panel: A panel for choosing different voice options
- History Library: Displays previously generated audio clips
Panel IDs
- Main speech panel: ly.img.ai/elevenlabs/monolingual/v1
- Main sound panel: ly.img.ai/elevenlabs/sound-generation
- Voice selection panel: ly.img.ai/audio-generation/speech/elevenlabs.voiceSelection
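These IDs can be used to open or close the panels programmatically, for example from a custom button. The sketch below assumes the editor instance exposes ui.openPanel, ui.closePanel, and ui.isPanelOpen; PanelHost is a hypothetical minimal type covering only those calls, and toggleAudioPanel is an illustrative helper, not part of the plugin.

```typescript
// Hypothetical minimal type: only the UI methods this helper relies on.
interface PanelHost {
  ui: {
    openPanel(panelId: string): void;
    closePanel(panelId: string): void;
    isPanelOpen(panelId: string): boolean;
  };
}

const SPEECH_PANEL_ID = 'ly.img.ai/elevenlabs/monolingual/v1';
const SOUND_PANEL_ID = 'ly.img.ai/elevenlabs/sound-generation';

// Toggles the requested generation panel and returns true if it is open afterwards.
function toggleAudioPanel(cesdk: PanelHost, kind: 'speech' | 'sound'): boolean {
  const panelId = kind === 'speech' ? SPEECH_PANEL_ID : SOUND_PANEL_ID;
  if (cesdk.ui.isPanelOpen(panelId)) {
    cesdk.ui.closePanel(panelId);
    return false;
  }
  cesdk.ui.openPanel(panelId);
  return true;
}
```

With a real editor instance, toggleAudioPanel(cesdk, 'speech') would be called after the plugin has been added.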
Asset History
Generated audio files are automatically stored in asset sources with the following IDs:
- Text-to-Speech: elevenlabs/monolingual/v1.history
- Sound Effects: elevenlabs/sound-generation.history
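Because the history lives in ordinary asset sources, it can be queried like any other source. The following sketch assumes the engine's asset API offers findAssets(sourceId, query) with page and perPage fields; AssetQueryHost and HistoryAsset are hypothetical minimal types, and listGeneratedAudio is an illustrative helper.

```typescript
// Hypothetical minimal types: only the fields this helper reads.
interface HistoryAsset {
  id: string;
  label?: string;
}

interface AssetQueryHost {
  engine: {
    asset: {
      findAssets(
        sourceId: string,
        query: { page: number; perPage: number }
      ): Promise<{ assets: HistoryAsset[]; total: number }>;
    };
  };
}

const SPEECH_HISTORY_SOURCE = 'elevenlabs/monolingual/v1.history';
const SOUND_HISTORY_SOURCE = 'elevenlabs/sound-generation.history';

// Fetches the first page of both history sources, speech entries first.
async function listGeneratedAudio(
  cesdk: AssetQueryHost,
  perPage = 10
): Promise<HistoryAsset[]> {
  const query = { page: 0, perPage };
  const [speech, sound] = await Promise.all([
    cesdk.engine.asset.findAssets(SPEECH_HISTORY_SOURCE, query),
    cesdk.engine.asset.findAssets(SOUND_HISTORY_SOURCE, query)
  ]);
  return speech.assets.concat(sound.assets);
}
```

If only one provider is configured, the other history source may not exist, so a production version would query only the sources it expects.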
License
This plugin is part of the IMG.LY plugin ecosystem for CreativeEditor SDK. Please refer to the license terms in the package.