Sofya Transcription
Sofya Transcription is a JavaScript library for real-time audio transcription. It transcribes audio streams, integrates into web applications, and can also capture audio from media elements.
Features
- Real-Time Transcription: Transcribe audio streams in real time with high accuracy.
- Flexible Integration: Seamlessly integrates with your web applications.
- Media Element Audio Capture: Capture audio from media elements such as <video> and <audio>.
- Multiple Provider Support: Support for Sofya Compliance and Sofya as Service transcription providers.
- Type-Safe Configuration: TypeScript definitions for provider-specific configurations.
Installation
To install Sofya Transcription, you can use npm:
npm install sofya.transcription
Usage
Here's a basic example of how to use Sofya Transcription in your project:
- Import the Library:
import { MediaElementAudioCapture, SofyaTranscriber } from 'sofya.transcription';
- Create a Transcription Service Instance:
// Option 1: connect with an API key
const transcriber = new SofyaTranscriber({
  apiKey: 'YOUR_API_KEY',
  config: {
    language: 'en-US'
  }
});
// Option 2: connect to the Sofya Compliance provider instead
// (create one instance or the other, not both)
const complianceTranscriber = new SofyaTranscriber({
  provider: 'sofya_compliance',
  endpoint: 'YOUR_ENDPOINT',
  config: {
    language: 'en-US',
    token: 'YOUR_TOKEN',
    compartmentId: 'YOUR_COMPARTMENT_ID',
    region: 'YOUR_REGION'
  }
});
- Initialize and Start Transcription:
transcriber.on('ready', () => {
  navigator.mediaDevices.getUserMedia({ audio: true })
    .then(mediaStream => {
      transcriber.startTranscription(mediaStream);
    })
    .catch(error => {
      console.error('Error accessing microphone:', error);
    });
});
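For code that prefers async/await over nested callbacks, the 'ready' event can be wrapped in a Promise. The following is a minimal sketch, not part of the SDK: `whenReady` is a hypothetical helper, and it assumes that an 'error' event fired before 'ready' means startup failed.

```javascript
// Hypothetical helper (not provided by sofya.transcription): wraps the
// documented 'ready' and 'error' events in a Promise.
function whenReady(transcriber) {
  return new Promise((resolve, reject) => {
    // Resolve with the transcriber once the service reports it is ready.
    transcriber.on('ready', () => resolve(transcriber));
    // Assumption: an 'error' before 'ready' means startup failed.
    // A Promise settles only once, so errors after 'ready' are ignored
    // here and still need their own 'error' listener.
    transcriber.on('error', (error) => reject(error));
  });
}
```

Inside an async function this could be used as `const ready = await whenReady(transcriber);` followed by `ready.startTranscription(mediaStream);`.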
- Handle Transcription Events:
transcriber.on('recognizing', (text) => {
  console.log('Recognizing: ' + text);
});
transcriber.on('recognized', (text) => {
  console.log('Recognized: ' + text);
});
transcriber.on('error', (error) => {
  console.error('Transcription error:', error);
});
transcriber.on('stopped', () => {
  console.log('Transcription stopped');
});
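To show a running transcript rather than log each event, the two text events can feed a small buffer. This is a sketch under an assumption the README does not confirm: that 'recognizing' delivers interim text for the segment in progress and 'recognized' delivers the final text for that one segment (not the cumulative transcript). `createTranscriptBuffer` is a hypothetical helper name.

```javascript
// Hypothetical helper: keeps finalized segments separate from the
// interim text of the segment currently being recognized.
function createTranscriptBuffer() {
  const finalSegments = [];
  let interim = '';
  return {
    // Call from the 'recognizing' handler: replace the interim text.
    onRecognizing(text) {
      interim = text;
    },
    // Call from the 'recognized' handler: commit the segment, clear interim.
    // Assumption: each event carries one finalized segment.
    onRecognized(text) {
      if (text) finalSegments.push(text);
      interim = '';
    },
    // Finalized segments followed by whatever is still in progress.
    text() {
      return [...finalSegments, interim].filter(Boolean).join(' ');
    },
  };
}
```

The handlers would then become `transcriber.on('recognizing', (t) => buffer.onRecognizing(t))` and `transcriber.on('recognized', (t) => buffer.onRecognized(t))`, with `buffer.text()` read whenever the display needs updating.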
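- Control Transcription:
transcriber.pauseTranscription();   // temporarily suspend transcription
transcriber.resumeTranscription();  // continue after a pause
await transcriber.stopTranscription(); // end the session (inside an async function or module)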
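The documented API has no method for querying whether transcription is paused, so a UI with a single pause/resume button has to track that itself. A minimal sketch, assuming `pauseTranscription()` and `resumeTranscription()` take effect immediately; `createPauseToggle` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: tracks paused state on the caller's side and
// calls the matching documented method on each toggle.
function createPauseToggle(transcriber) {
  let paused = false;
  return {
    // Flip between paused and running; returns the new paused state.
    toggle() {
      if (paused) {
        transcriber.resumeTranscription();
      } else {
        transcriber.pauseTranscription();
      }
      paused = !paused;
      return paused;
    },
    isPaused() {
      return paused;
    },
  };
}
```

A button handler could call `toggle()` and use the returned boolean to pick its label.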
API
SofyaTranscriber
- constructor(connection: Connection): Creates a new instance of the transcription service from a connection object.
- startTranscription(mediaStream: MediaStream): void: Starts the transcription process with a given MediaStream.
- stopTranscription(): void: Stops the transcription process.
- pauseTranscription(): void: Pauses the transcription process.
- resumeTranscription(): void: Resumes the transcription process.
- on(event: string, callback: Function): this: Registers an event handler for transcription events. Possible events include:
  - recognizing: Fired while transcription is in progress. The callback receives the recognized text.
  - recognized: Fired when transcription is complete. The callback receives the recognized text.
  - error: Fired when an error occurs. The callback receives the error.
  - ready: Fired when the transcription service is ready to start.
  - stopped: Fired when the transcription process is stopped.
  - connected: Fired when the transcription service is connected to the provider.
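Because `on` accepts any string, a misspelled event name would fail silently. One way to catch that early is to register handlers through a small wrapper that checks names first. This sketch assumes the six events listed above are the complete set, which the list does not state outright; `bindHandlers` and `KNOWN_EVENTS` are hypothetical names.

```javascript
// Event names taken from the list above. Assumption: the list is complete.
const KNOWN_EVENTS = [
  'recognizing', 'recognized', 'error', 'ready', 'stopped', 'connected',
];

// Hypothetical helper: validates every entry, then registers them all.
function bindHandlers(transcriber, handlers) {
  const entries = Object.entries(handlers);
  // First pass: validate, so nothing is registered if any entry is bad.
  for (const [event, callback] of entries) {
    if (!KNOWN_EVENTS.includes(event)) {
      throw new Error(`Unknown transcription event: ${event}`);
    }
    if (typeof callback !== 'function') {
      throw new TypeError(`Handler for "${event}" must be a function`);
    }
  }
  // Second pass: register through the documented on() method.
  for (const [event, callback] of entries) {
    transcriber.on(event, callback);
  }
  return transcriber;
}
```

For example, `bindHandlers(transcriber, { ready: onReady, error: onError })` registers both handlers, while a key such as `recognised` throws before anything is attached.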
Connection Types
The SDK supports different connection modes based on the provider:
API Key Connection
{
  apiKey: string;
  config?: BaseConfig;
}
Sofya Compliance Provider Connection
{
  provider: "sofya_compliance";
  endpoint: string;
  config: SofyaComplianceConfig;
}
Sofya As Service Provider Connection
{
  provider: "sofya_as_service";
  endpoint: string;
  config: SofyaSpeechConfig;
}
STT WVAD Provider Connection
{
  provider: "stt_wvad";
  endpoint: string;
  config: SofyaSpeechConfig;
}
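The four shapes above can be told apart at runtime by checking for `apiKey` first and then the `provider` value. The following is an illustrative sketch for plain JavaScript callers, not how the SDK itself resolves connections; `connectionKind` and the `'api_key'` label are hypothetical.

```javascript
// Provider strings taken from the connection types above.
const PROVIDERS = ['sofya_compliance', 'sofya_as_service', 'stt_wvad'];

// Hypothetical helper: returns 'api_key' or the provider name, and
// throws if the object matches none of the documented shapes.
function connectionKind(connection) {
  if (connection && typeof connection.apiKey === 'string') {
    return 'api_key';
  }
  if (connection && PROVIDERS.includes(connection.provider)) {
    // Every provider connection above requires an endpoint string.
    if (typeof connection.endpoint !== 'string') {
      throw new Error(`Connection for "${connection.provider}" needs an endpoint`);
    }
    return connection.provider;
  }
  throw new Error('Unrecognized connection shape');
}
```

Running a check like this before calling `new SofyaTranscriber(connection)` gives an early, readable error when a connection object is assembled from environment variables or user input.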
Configuration Types
BaseConfig
interface BaseConfig {
  language: string;
}
SofyaComplianceConfig
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}
SofyaSpeechConfig
interface SofyaSpeechConfig extends BaseConfig {}
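TypeScript enforces these interfaces at compile time, but plain JavaScript callers get no such check. A rough runtime equivalent is sketched below; it derives the required fields from the interfaces above and assumes every field must be a non-empty string. `missingConfigFields` is a hypothetical helper, and it covers only the three provider connections (the API key connection's `config` is optional).

```javascript
// Required fields per provider, derived from the interfaces above.
const REQUIRED_FIELDS = new Map([
  ['sofya_compliance', ['language', 'token', 'compartmentId', 'region']],
  ['sofya_as_service', ['language']],
  ['stt_wvad', ['language']],
]);

// Hypothetical helper: returns the names of missing or empty fields.
function missingConfigFields(provider, config = {}) {
  const required = REQUIRED_FIELDS.get(provider);
  if (!required) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  // Assumption: a valid field is a non-empty string.
  return required.filter(
    (field) => typeof config[field] !== 'string' || config[field] === ''
  );
}
```

An empty array means the config has every field its interface requires.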
React Example
import React from 'react'
import { SofyaTranscriber } from 'sofya.transcription'

const App = () => {
  const transcriberRef = React.useRef<SofyaTranscriber | null>(null)
  const [transcription, setTranscription] = React.useState('')
  const transcriptionRef = React.useRef('')

  const getMediaStream = async () => {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
    return stream
  }

  const startTranscription = async () => {
    try {
      const stream = await getMediaStream()
      const transcriber = new SofyaTranscriber({
        apiKey: 'your_api_key',
        config: {
          language: 'en-US'
        }
      })
      transcriberRef.current = transcriber
      transcriber.on('ready', () => {
        transcriber.startTranscription(stream)
      })
      transcriber.on('recognizing', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })
      transcriber.on('recognized', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })
      transcriber.on('error', (error: Error) => {
        console.error('Transcription error:', error)
      })
    } catch (error) {
      console.error('Error starting transcription:', error)
    }
  }

  const stopTranscription = async () => {
    if (transcriberRef.current) {
      await transcriberRef.current.stopTranscription()
    }
  }

  return (
    <div>
      <button onClick={startTranscription}>Start Transcription</button>
      <button onClick={stopTranscription}>Stop Transcription</button>
      <div>
        <h3>Transcription:</h3>
        <p>{transcription}</p>
      </div>
    </div>
  )
}

export default App
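The example above obtains the microphone stream with `getUserMedia` but never releases it, and this README does not say whether `stopTranscription()` does so. Assuming it does not, the caller can stop the tracks itself using the standard `MediaStream.getTracks()` and `MediaStreamTrack.stop()` browser APIs. `releaseStream` is a hypothetical helper name.

```javascript
// Hypothetical helper: stops every track on a MediaStream so the browser
// can release the microphone. Returns the number of tracks stopped.
function releaseStream(stream) {
  // getTracks() returns all audio and video tracks of the stream.
  const tracks = stream.getTracks();
  // stop() permanently ends a track; a stopped track cannot be restarted.
  tracks.forEach((track) => track.stop());
  return tracks.length;
}
```

To use it in the component, the stream would need to be kept in a ref alongside the transcriber and passed to `releaseStream` after `stopTranscription()` resolves.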
License
This project is licensed under the MIT License - see the LICENSE file for details.