This package is based on a fork of web-speech-cognitive-services.
The primary goal is to use the SpeechSynthesizer class from microsoft-cognitiveservices-speech-sdk in the TTS part of the package, so that boundaries and visemes are received during speech synthesis, which overcomes existing issues of the original package.
npm install @davi-ai/web-speech-cognitive-services-davi
Changes compared to original package
In order to use speech synthesis, you still need to use the original process :
- create a speechSynthesisPonyfill with your credentials, containing a speechSynthesis object
- wait for the voices to be loaded
- create a SpeechSynthesisUtterance
- attach events to the utterance
- play the utterance
Use the imports from the new package with :
import { createSpeechSynthesisPonyfill } from '@davi-ai/web-speech-cognitive-services-davi/lib/SpeechServices'
import type { SpeechSynthesisUtterance } from '@davi-ai/web-speech-cognitive-services-davi/lib/SpeechServices'
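The "wait for the voices to be loaded, then play the utterance" part of the process can be sketched as below. This is a minimal sketch, not the package's own code: the SpeechSynthesisLike interface and the speakWhenVoicesReady helper are hypothetical names, and the sketch assumes that ponyfill.speechSynthesis exposes getVoices(), speak() and a 'voiceschanged' event in the same way as the standard Web Speech API.

```typescript
// Hypothetical minimal shape of the parts of ponyfill.speechSynthesis used here.
// Assumption: it mirrors the standard Web Speech API surface.
interface SpeechSynthesisLike<U> {
  getVoices(): unknown[]
  speak(utterance: U): void
  addEventListener(type: 'voiceschanged', listener: () => void): void
  removeEventListener(type: 'voiceschanged', listener: () => void): void
}

// Hypothetical helper: speaks immediately if voices are already loaded,
// otherwise waits for the first 'voiceschanged' event that brings voices.
function speakWhenVoicesReady<U>(synthesis: SpeechSynthesisLike<U>, utterance: U): void {
  if (synthesis.getVoices().length > 0) {
    synthesis.speak(utterance)
    return
  }

  const onVoicesChanged = (): void => {
    // Ignore events fired while the voice list is still empty.
    if (synthesis.getVoices().length === 0) return
    // Detach the listener so the utterance is only played once.
    synthesis.removeEventListener('voiceschanged', onVoicesChanged)
    synthesis.speak(utterance)
  }

  synthesis.addEventListener('voiceschanged', onVoicesChanged)
}
```

In practice, synthesis would be ponyfill.speechSynthesis and utterance a SpeechSynthesisUtterance with its callbacks already attached, so that no event is missed once playback starts.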
You can now listen to the following events by attaching callbacks to the utterance :
- onsynthesisstart : fired when the synthesis starts
- onsynthesiscompleted : fired when the synthesis is completed
- onboundary : receive an event with the following data
name: string,
elapsedTime: number,
duration: number,
boundaryType: 'WordBoundary' | 'PunctuationBoundary' | 'Viseme'
This event is fired for each boundary and each viseme in the synthesis.
- onmark : receive an event with the following data
name: string,
elapsedTime: number
- onviseme : receive an event with the following data
name: string,
elapsedTime: number,
duration: 0,
boundaryType: 'Viseme'
This event is fired for each viseme in the synthesis.
(Viseme id documentation here)
- examples :
utterance.onsynthesisstart = (): void => { console.log('Synthesis started !') }
utterance.onsynthesiscompleted = (): void => { console.log('Synthesis ended !') }
utterance.onboundary = (event): void => { console.log('Boundary data : ', event.boundaryType, event.name, event.elapsedTime, event.duration) }
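Beyond logging, the boundary data can be collected for later use, for example to drive lip-sync or word highlighting. The following is a minimal sketch under stated assumptions: BoundaryEventData, Timeline and createBoundaryRecorder are hypothetical names that are not exported by the package, and the event shape is copied from the field list above.

```typescript
// Event payload as described in the field list above; the type names are hypothetical.
type BoundaryType = 'WordBoundary' | 'PunctuationBoundary' | 'Viseme'

interface BoundaryEventData {
  name: string
  elapsedTime: number
  duration: number
  boundaryType: BoundaryType
}

interface Timeline {
  words: BoundaryEventData[]
  visemes: BoundaryEventData[]
}

// Returns a handler that can be assigned to utterance.onboundary,
// together with the timeline object that the handler fills.
function createBoundaryRecorder(): {
  timeline: Timeline
  onboundary: (event: BoundaryEventData) => void
} {
  const timeline: Timeline = { words: [], visemes: [] }

  const onboundary = (event: BoundaryEventData): void => {
    // Keep only the documented fields, in case the event carries more.
    const { name, elapsedTime, duration, boundaryType } = event
    const entry: BoundaryEventData = { name, elapsedTime, duration, boundaryType }

    switch (boundaryType) {
      case 'WordBoundary':
        timeline.words.push(entry)
        break
      case 'Viseme':
        timeline.visemes.push(entry)
        break
      case 'PunctuationBoundary':
        // Punctuation is ignored in this sketch.
        break
    }
  }

  return { timeline, onboundary }
}
```

A recorder created this way would be attached with utterance.onboundary = recorder.onboundary before the utterance is played, and recorder.timeline read once onsynthesiscompleted has fired.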
Using the SpeechSynthesizer class leads to several improvements in functionality :
- the start event is now linked to the oncanplaythrough event of the AudioElement used by the AudioContext. This allows better synchronisation at the beginning of the speech.
- you can call mute() and unmute() on the ponyfill.speechSynthesis object at any time
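As an illustration of mute() and unmute(), a toggle for a UI button could look like the sketch below. MutableSynthesis and createMuteToggle are hypothetical names; the sketch assumes only that the object has the two methods named above and that playback starts unmuted.

```typescript
// Hypothetical minimal shape: only the two methods mentioned above.
interface MutableSynthesis {
  mute(): void
  unmute(): void
}

// Returns a function that flips the mute state on each call
// and reports the new state (true = muted).
// Assumption: the synthesis object starts unmuted.
function createMuteToggle(synthesis: MutableSynthesis): () => boolean {
  let muted = false

  return (): boolean => {
    muted = !muted
    if (muted) {
      synthesis.mute()
    } else {
      synthesis.unmute()
    }
    return muted
  }
}
```

The returned function would typically be created once with ponyfill.speechSynthesis and called from a click handler.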