azure-speech-utilities
Provides a convenient abstraction layer over the Microsoft Cognitive Services Speech SDK, simplifying the integration of speech-to-text and text-to-speech functionality into client/browser applications. Using this package, developers can quickly integrate basic STT and TTS capabilities into their applications without the need to write intricate code.
Using npm:

```sh
npm install azure-speech-utilities
```
CreateRecognizer: Creates a new speech recognizer instance.
Parameter | Type | Default Value | Description |
---|---|---|---|
cogSvcSubKey | string | "" | The Cognitive Services subscription key for Speech Services. (Required) |
cogSvcRegion | string | "" | The region of the Cognitive Services subscription. (Required) |
recognitionLang | string[] | ["en-US"] | An array of language recognition codes. (Optional, default is ["en-US"]) |
RecognizeOnceAsync: Used for single-shot recognition, which recognizes a single utterance. The end of a single utterance is determined by listening for silence at the end or until a maximum of 15 seconds of audio is processed.
Parameter | Type | Default Value | Description |
---|---|---|---|
recognizer | sdk.SpeechRecognizer \| undefined | undefined | The speech recognizer instance to use. |
RecognizeOnceAsync performs single-shot recognition of a single utterance. In contrast, ContinuousRecognitionAsync gives you a real-time stream of recognized text. Call StopContinuousRecognitionAsync() at some point to stop recognition.
Parameter | Type | Default Value | Description |
---|---|---|---|
recognizer | sdk.SpeechRecognizer \| undefined | undefined | The speech recognizer instance to use. |
callbackRecognized | (value: string) => void | (value) => console.log(value) | A callback function called with recognized text. |
callbackRecognizing | (value: string) => void | (value) => console.log(value) | A callback function called while speech is being recognized. |
StopContinuousRecognitionAsync: Stops ongoing continuous speech recognition.
Parameter | Type | Default Value | Description |
---|---|---|---|
recognizer | sdk.SpeechRecognizer \| undefined | undefined | The speech recognizer instance to use. |
Note: Pass the same recognizer instance that you used for ContinuousRecognitionAsync() as the argument to this function.
CreateSynthesizer: Creates a new speech synthesizer instance.
Parameter | Type | Default Value | Description |
---|---|---|---|
cogSvcSubKey | string | "" | The Cognitive Services subscription key for Speech Services. (Required) |
cogSvcRegion | string | "" | The region of the Cognitive Services subscription. (Required) |
synthesisLang | string | "" | The language code for the speech synthesizer. (Required) |
synthesisVoiceName | string | "" | The name of the voice to use for speech synthesis. (Optional, default is "") |
createAudioConfig | boolean | false | Whether to create an audio config for speech output. (Optional, default is false) |
Note: If `createAudioConfig` is `false` (the default), the audio is not played automatically on the current active output device.

Note: The voice that speaks is determined in order of priority as follows:

- If you set only `synthesisLang`, the default voice for the specified locale speaks.
- If both `synthesisVoiceName` and `synthesisLang` are set, the `synthesisLang` setting is ignored; the voice that you specify with `synthesisVoiceName` speaks.
- If the voice is specified in SSML input, the `synthesisVoiceName` and `synthesisLang` settings are ignored.

SpeakAsync: Performs speech synthesis and returns the result (the synthesized audio) as an ArrayBuffer.
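The voice-priority rules can be sketched as a small helper. This is an illustrative, hypothetical function (not part of the package API) that returns which voice setting wins under each combination:

```javascript
// Hypothetical helper (not part of azure-speech-utilities) that models
// the voice-selection priority described above.
function resolveVoice(synthesisVoiceName, synthesisLang, usesSsmlVoice) {
  // A voice chosen inside SSML input overrides both settings.
  if (usesSsmlVoice) return "ssml-voice";
  // An explicit voice name wins; synthesisLang is ignored.
  if (synthesisVoiceName) return synthesisVoiceName;
  // Only the language is set: the default voice for that locale speaks.
  if (synthesisLang) return `default voice for ${synthesisLang}`;
  // Neither is set: the service default voice speaks.
  return "service default voice";
}

console.log(resolveVoice("en-US-JennyNeural", "en-US", false)); // en-US-JennyNeural
console.log(resolveVoice("", "hi-IN", false)); // default voice for hi-IN
```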
Parameter | Type | Default Value | Description |
---|---|---|---|
synthesizer | sdk.SpeechSynthesizer \| undefined | undefined | The speech synthesizer instance to use. |
inputString | string | "I'm excited to try text to speech" | The text to be synthesized. |
inputType | string | "text" | The format of the input text. (Optional, default is "text" ) |
callback | (result: sdk.SynthesisResult, error?: Error) => void | (result, error) => {} | A callback function called with the synthesis result or an error. |
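Because the synthesis result carries the audio as an ArrayBuffer, you can wrap it for saving or playback. A minimal sketch, using a dummy buffer in place of the real `result.audioData`:

```javascript
// Sketch: wrapping an ArrayBuffer of synthesized audio for saving.
// A dummy 16-byte buffer stands in for result.audioData from SpeakAsync.
const audioData = new ArrayBuffer(16);

// In Node.js you could persist it to a .wav file:
const bytes = Buffer.from(audioData);
// require("fs").writeFileSync("speech.wav", bytes);

// In the browser you would instead build a Blob URL (as the Speak Async
// example in this README does) and assign it to an <audio> element's src.
console.log(bytes.length); // 16
```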
Recognize Once

```javascript
import { CreateRecognizer, RecognizeOnceAsync } from "azure-speech-utilities"

const CGV_KEY = "AZURE_SPEECH_SERVICE_KEY"
const CGV_REGION = "AZURE_SPEECH_SERVICE_REGION"

async function recognizeSpeech() {
  const recognizer = CreateRecognizer(CGV_KEY, CGV_REGION, ["hi-IN"])
  try {
    const recognizedText = await RecognizeOnceAsync(recognizer)
    if (recognizedText.type === "text") {
      console.log(recognizedText.message)
    } else {
      console.error(recognizedText.message)
    }
  } catch (error) {
    console.error(error)
  }
}
```
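The example distinguishes success from failure by the result's `type` field. Assuming the resolved value has the `{ type, message }` shape shown in this README's examples (an assumption, not a documented contract), the handling can be factored into a small helper:

```javascript
// Hypothetical helper that routes a { type, message } result, based on
// the shape used in this README's examples (assumed, not documented).
function handleRecognitionResult(result) {
  if (result.type === "text") {
    // Successful recognition: message holds the recognized text.
    return `recognized: ${result.message}`
  }
  // Any other type is treated as an error or status message.
  return `error: ${result.message}`
}

console.log(handleRecognitionResult({ type: "text", message: "hello" }))
// recognized: hello
```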
Continuous Recognition

```javascript
import { CreateRecognizer, ContinuousRecognitionAsync, StopContinuousRecognitionAsync } from "azure-speech-utilities"

const CGV_KEY = "AZURE_SPEECH_SERVICE_KEY"
const CGV_REGION = "AZURE_SPEECH_SERVICE_REGION"

// With two or more recognition languages ("hi-IN" and "en-US"), recognition is multilingual.
const recognizer = CreateRecognizer(CGV_KEY, CGV_REGION, ["hi-IN", "en-US"])

function callbackRecognized(text) {
  console.log("RECOGNIZED: ", text)
}

function callbackRecognizing(text) {
  console.log("RECOGNIZING: ", text)
}

async function recognizeSpeech() {
  try {
    const response = await ContinuousRecognitionAsync(recognizer, callbackRecognized, callbackRecognizing)
    if (response.type === "success") {
      console.log(response.message)
    } else {
      console.error(response.message)
    }
  } catch (error) {
    console.error(error)
  }
}

function stopContinuousRecognition() {
  StopContinuousRecognitionAsync(recognizer)
}
```
Speak Async

```javascript
import { CreateSynthesizer, SpeakAsync } from "azure-speech-utilities"

const CGV_KEY = "AZURE_SPEECH_SERVICE_KEY"
const CGV_REGION = "AZURE_SPEECH_SERVICE_REGION"
const SYNTHESIS_LANGUAGE = "en-US"
const SYNTHESIS_VOICE_NAME = "en-US-JennyNeural"

function handleSpeak() {
  // By default, the input type is "text". If you change the input type to "ssml", the input string should be in the following SSML format:
  // const ssml = `
  //   <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${SYNTHESIS_LANGUAGE}">
  //     <voice name="${SYNTHESIS_VOICE_NAME}">
  //       When you're on the freeway, it's a good idea to use a GPS.
  //     </voice>
  //   </speak>
  // `
  const text = "When you're on the freeway, it's a good idea to use a GPS."
  // Note that 'createAudioConfig' is set to false, so audio will not play automatically on the currently active output device.
  const synthesizer = CreateSynthesizer(CGV_KEY, CGV_REGION, SYNTHESIS_LANGUAGE, SYNTHESIS_VOICE_NAME, false)
  SpeakAsync(synthesizer, text, "text", (result, error) => {
    if (error) {
      console.error(error)
    } else {
      console.log(result)
      const audioBlob = new Blob([result.audioData], { type: "audio/wav" })
      // Use this URL as an audio source (e.g. an <audio> element), which gives the user control such as starting, stopping, and resetting playback.
      console.log(URL.createObjectURL(audioBlob))
    }
  })
}

const stopSpeaking = () => {
  // audioRef is assumed to reference the <audio> element playing the Blob URL created above.
  audioRef.current.pause()
}
```
Note: If you do not wish to play audio through an audio source, you can set `createAudioConfig` to `true`. This will cause the audio to play on the current active output device by default. However, this method does not give the user the ability to play, pause, or reset the audio.
This project welcomes contributions and suggestions.