augnito-rn-sdk
Use the Augnito React Native SDK to add Speech To Text and voice commands to a React Native application.
The SDK requires microphone access (recording permission) in order to work.
Add the following line to AndroidManifest.xml:
<uses-permission android:name="android.permission.RECORD_AUDIO"/>
Add the following to Info.plist:
<key>NSMicrophoneUsageDescription</key>
<string>Microphone access required</string>
Install the Augnito React Native SDK by running:
npm install augnito-rn-sdk
cd ios
pod install
Note for M1-family processors: refer to the environment setup to resolve compatibility issues with pods.
Most of the functionality goes through the DictationManager class, which reports back via callbacks.
To instantiate a DictationManager object, provide an AugnitoConfig with valid authorization details and keys.
import {
DictationManager,
AugnitoApiServer,
AugnitoConfig,
TextUtils,
ActionRecipe,
} from 'augnito-rn-sdk';
const augnitoConfig: AugnitoConfig = new AugnitoConfig(
'<your server>',
'<your accountcode>',
'<your accesskey>',
'<your lmid>',
'<your usertag>',
'<your sourceapp>'
);
Then, in an initialization method:
const dictationManager: DictationManager = useMemo(
() => DictationManager.fromConfig(augnitoConfig),
[]
);
Alternatively it can be initialized with a provided connection URL:
const dictationManager: DictationManager = useMemo(
() => DictationManager.fromCustomUrl('custom url'),
[]
);
The DictationManager is used to start, receive, and stop communication with Augnito's server in order to provide Speech To Text and Commands support in a React Native app.
It works through methods and callbacks.
Logging can be enabled via the DictationManager:
const enableLogging = true;
...
const dictationManager: DictationManager = useMemo(
() => DictationManager.fromConfig(augnitoConfig, enableLogging),
[]
);
method | notes |
---|---|
toggleDictation | Turns on or off the audio processing and server communication. |
property | notes |
---|---|
isBusy | Indicates the Dictation Manager is currently performing an operation. |
Callbacks are used to receive the output of the SDK as well as errors that may occur. The callbacks are provided via the DictationManager constructor as named optional parameters.
callback | notes |
---|---|
onConnected | Invoked when the connection to the server has been established. This does not guarantee that the audio stream has started. |
onDisconnected | Invoked when the server closes the connection. It can be triggered on purpose by the DictationManager under certain conditions, such as being unable to start audio streaming. |
onError | Invoked when an error occurs. Not all errors terminate the connection. |
onPartialResult | Invoked when a hypothesis or unprocessed final text has been obtained. |
onFinalResult | Invoked when a final text output (not command) has been processed. |
onCommandResult | Invoked when a command has been processed. |
Example usage
dictationManager.onConnected = useCallback(() => {
setIsRecording(true);
setIsLoading(false);
setTitle(titleListening);
}, []);
dictationManager.onDisconnected = useCallback(() => {
setIsRecording(false);
setIsLoading(false);
setTitle(titleStart);
}, []);
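The partial and final result callbacks can feed a small buffer that keeps committed text separate from the in-flight hypothesis. The sketch below is standalone and illustrative only: `TranscriptBuffer` is a hypothetical helper, not part of the SDK, and it assumes that onPartialResult and onFinalResult each deliver a plain string, which the callback table does not confirm.

```typescript
// Hypothetical helper, not part of augnito-rn-sdk.
// Assumption: onPartialResult and onFinalResult each deliver a plain string.
class TranscriptBuffer {
  private committed: string[] = [];
  private hypothesis = '';

  // Call from onPartialResult: replaces the current hypothesis.
  handlePartial(text: string): void {
    this.hypothesis = text;
  }

  // Call from onFinalResult: commits the text and clears the hypothesis.
  handleFinal(text: string): void {
    const trimmed = text.trim();
    if (trimmed.length > 0) {
      this.committed.push(trimmed);
    }
    this.hypothesis = '';
  }

  // Text to render: committed segments followed by the current hypothesis.
  display(): string {
    return this.committed
      .concat([this.hypothesis])
      .filter((part) => part.length > 0)
      .join(' ');
  }
}

const buffer = new TranscriptBuffer();
buffer.handlePartial('patient presents');
buffer.handlePartial('patient presents with fever');
buffer.handleFinal('Patient presents with fever.');
buffer.handlePartial('no known');
```

Keeping the hypothesis out of the committed list means a later partial result can overwrite it without touching text that has already been finalized.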
The DictationManager reports errors via the onError callback. The callback receives an object with further details about the error.
The errorType property is the best way to determine what is happening underneath. Possible types:
property | notes |
---|---|
noNetworkConnection | |
lowBandwidth | |
serviceDown | |
noDictationStopMic | |
invalidAuthCredentials | |
socketDisconnect | |
socketConnectionError | |
audioRecorder | |
audioRecorderCouldNotInitializePermission | |
audioRecorderCouldNotInitializeGeneric | |
unknown | |
Further information is provided in the errorMessage property when possible.
A note on noDictationStopMic: this does not terminate the connection, but it is an indicator that the microphone is open but idle. The consumer app may want to warn the user or terminate the session if this keeps happening.
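One way to act on errorType is a small policy function in the consumer app. The sketch below is an assumption-laden illustration: it treats the error types as string literals named exactly as in the table (the SDK may expose an enum instead), and both `ErrorAction` and `decideErrorAction` are hypothetical names. The threshold of three idle warnings is arbitrary.

```typescript
// Error type names copied from the table above.
// Assumption: they can be represented as string literals.
type AugnitoErrorType =
  | 'noNetworkConnection'
  | 'lowBandwidth'
  | 'serviceDown'
  | 'noDictationStopMic'
  | 'invalidAuthCredentials'
  | 'socketDisconnect'
  | 'socketConnectionError'
  | 'audioRecorder'
  | 'audioRecorderCouldNotInitializePermission'
  | 'audioRecorderCouldNotInitializeGeneric'
  | 'unknown';

// Hypothetical consumer-side reactions; the SDK does not prescribe these.
type ErrorAction = 'warnUser' | 'endSession' | 'requestPermission' | 'retryLater';

function decideErrorAction(
  errorType: AugnitoErrorType,
  idleWarningsSoFar: number
): ErrorAction {
  switch (errorType) {
    case 'noDictationStopMic':
      // The connection stays open; warn first, end the session if it repeats.
      return idleWarningsSoFar >= 3 ? 'endSession' : 'warnUser';
    case 'audioRecorderCouldNotInitializePermission':
      return 'requestPermission';
    case 'invalidAuthCredentials':
      return 'endSession';
    case 'noNetworkConnection':
    case 'lowBandwidth':
    case 'serviceDown':
    case 'socketDisconnect':
    case 'socketConnectionError':
      return 'retryLater';
    default:
      return 'warnUser';
  }
}
```

The counter passed as `idleWarningsSoFar` would be kept by the app, for example in component state, and reset when dictation produces text again.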
Commands are represented by the ActionRecipe object. It includes the information required to be processed by the consumer app.
Commands are returned by the SDK via the onCommandResult callback (DictationManager).
As stated, the result of a command (either Static or Dynamic) will be an instance of an ActionRecipe.
When a command is returned by the SDK, the consumer needs to analyze it and determine what to do based on the properties of the ActionRecipe instance.
For example, the ActionRecipe resulting from a Static Command will always have its isStaticCommand property set to true.
Other fields such as searchText, chooseNumber or selectFor are used to determine what to do with the command and in what context. For instance, a Replace X with Y dynamic command will produce an ActionRecipe like:
// replace oranges with apples
{
name: 'replace', // AugnitoCommands.replace entry
isCommand: true,
searchText: 'oranges',
selectFor: 'apples'
...
}
property | notes |
---|---|
name | Command name. Can be matched against one of the entries of AugnitoCommands class. |
action | |
chooseNumber | Unit count: on commands that affect items, it indicates how many. |
searchText | On most select/action commands this represents the item to be searched. For example paragraph, line, word, etc. On some commands this indicates what is being searched for. |
selectFor | Usually used to determine an action on most commands. For example delete, underline, etc. |
isCommand | Indicates the ActionRecipe is a command. |
isStaticCommand | Indicates this ActionRecipe is a static command. |
nextPrevious | Direction on how this command should affect the items. |
receivedText | Original received text. |
receivedTextWithoutSpace | Original text without spaces. |
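The properties above are enough to route a command in the consumer app. The following sketch uses a local stand-in interface, `ActionRecipeLike`, whose field types are assumptions inferred from the table and the examples, and a hypothetical `describeRecipe` function. The string values used for `name` are placeholders apart from `'replace'`, which appears in the example above.

```typescript
// Local stand-in for the SDK's ActionRecipe; field types are assumptions.
interface ActionRecipeLike {
  name: string;
  isCommand: boolean;
  isStaticCommand?: boolean;
  chooseNumber?: number;
  searchText?: string;
  selectFor?: string;
  nextPrevious?: string;
}

// Hypothetical router: turns a recipe into a short description of what
// the consumer app would do with it.
function describeRecipe(recipe: ActionRecipeLike): string {
  if (!recipe.isCommand) {
    return 'not a command';
  }
  if (recipe.isStaticCommand) {
    return 'static: ' + recipe.name;
  }
  if (recipe.name === 'replace') {
    return 'replace "' + recipe.searchText + '" with "' + recipe.selectFor + '"';
  }
  // Dynamic command: action, direction, count and unit, when present.
  const action = recipe.selectFor || 'select';
  const parts = [action, recipe.nextPrevious, recipe.chooseNumber, recipe.searchText]
    .filter((part) => part !== undefined && part !== '');
  return 'dynamic: ' + parts.join(' ');
}

const replaceDescription = describeRecipe({
  name: 'replace',
  isCommand: true,
  searchText: 'oranges',
  selectFor: 'apples',
});
```

In a real app each branch would call into the editor (delete, select, format) instead of returning a string.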
Commands are meant to be used on their own and not in conjunction with regular dictation.
For example:
Dictating Select next 3 words will produce a command
Dictating Patient presents with fever and chills select last 2 words will only produce a transcription
Commands can be matched with entries in the AugnitoCommands dictionary. In the earlier replace example, the name equals AugnitoCommands.replace.
Commands that have no effect on specific units or items are called static commands.
Microphone Control Static Commands |
---|
stopMic |
Selection Static Commands |
---|
selectAll |
selectChar |
selectWord |
selectLine |
selectNextLine |
selectPreviousLine |
selectParagraph |
Lists Static Commands |
---|
startNumberList |
startBulletList |
stopBulletList |
stopNumberList |
stopList |
Action Static Commands |
---|
undoIt |
redoIt |
Text Modification Commands |
---|
startBoldText |
stopBoldText |
startBulletText |
stopBulletText |
startItalicText |
stoptItalicText |
Modification Commands |
---|
deleteIt |
pasteIt |
copyIt |
cutIt |
headerIt |
underlineIt |
boldIt |
italicizeIt |
capitalizeIt |
bulletIt |
Navigation Static Commands |
---|
moveUp |
moveDown |
moveLeft |
moveRight |
goToLineStart |
goToLineEnd |
giveSpace |
backspace |
goToDocumentEnd |
goToDocumentStart |
goToNextPage |
goToPreviousPage |
goToNextParagraph |
goTo |
Field Navigation Static Commands |
---|
nextField |
previousField |
Dynamic commands are more complex commands built based on the speech input and may contain extra information (always within the ActionRecipe itself).
Commands that affect a line or paragraph number via an action command such as select, bold, delete, etc.
^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(the)?(line|para|paragraph|\n\n|\n)(number[ed]?)(.*?)(\.)?$
A correctly parsed Line and Paragraph number will have a structure similar to:
{
name: AugnitoCommands.goToLineNumber // or AugnitoCommands.goToParaNumber
isCommand: true,
chooseNumber: 10,
searchText: "line",
selectFor: "goToStart"
...
}
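To see which parts of an utterance the pattern captures, it can be run locally. The sketch below copies the expression verbatim and applies it to text with spaces removed. That preprocessing step is an assumption based on the pattern containing no whitespace between groups and on the receivedTextWithoutSpace property; `matchLineNumber` is a hypothetical helper, not an SDK function.

```typescript
// Pattern copied verbatim from the section above.
const lineOrParagraphNumber =
  /^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(the)?(line|para|paragraph|\n\n|\n)(number[ed]?)(.*?)(\.)?$/;

interface LineNumberMatch {
  action: string;
  unit: string;
  rawNumber: string;
}

// Hypothetical helper. Assumption: matching runs on text without spaces.
function matchLineNumber(spoken: string): LineNumberMatch | null {
  const compact = spoken.replace(/ /g, '');
  const match = lineOrParagraphNumber.exec(compact);
  if (match === null) {
    return null;
  }
  // Group 1 is the action, group 3 the unit, group 5 whatever follows "number".
  return { action: match[1], unit: match[3], rawNumber: match[5] };
}

const goToLine = matchLineNumber('goto line number 10');
const withoutNumberWord = matchLineNumber('goto line 10');
```

Note that the `number` group is not optional, so an utterance that omits the word "number" does not match this particular pattern.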
Example voice command
Simple Go To new line or new paragraph command.
{
name: AugnitoCommands.selectLine // or AugnitoCommands.selectParagraph
isCommand: true,
searchText: "line",
selectFor: "gotoend"
...
}
Example voice command
Action against the active line, paragraph, space, etc.
^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(the)?(line|para|paragraph|\n\n|\n)(number[ed]?)(.*?)(\.)?$
Example voice command
{
name: AugnitoCommands.selectActiveChar
isCommand: true,
...
}
{
name: AugnitoCommands.selectActiveWord
isCommand: true,
...
}
{
name: AugnitoCommands.selectActiveLine
isCommand: true,
...
}
{
name: AugnitoCommands.selectActiveParagraph
isCommand: true,
...
}
Processes a direction plus a combination of item and distance.
^(last|previous|next|down)(.*?)(word[s]?|line[s]?|sentence[s]?|paragraph[s]?|para[s]?|char[s]?|character[s]?|space|\n\n|\n)$
Example voice commands
Processor to identify movement with direction and unit.
^(go|goto|gotothe|move|moveto|movethe)(last|previous|next|down)(.*?)(word[s]?|line[s]?|sentence[s]?|paragraph[s]?|para[s]?|char[s]?|character[s]?|space|\n\n|\n)$
Example voice commands
// go last 4 word
{
name: AugnitoCommands.selectWord
isCommand: true,
chooseNumber: 4,
nextPrevious: 'last',
selectFor: 'gotostart'
...
}
// go to last space
{
name: AugnitoCommands.selectChar
isCommand: true,
chooseNumber: 0,
nextPrevious: 'last',
selectFor: 'gotostart'
...
}
// move the previous 4 words
{
name: AugnitoCommands.selectWord
isCommand: true,
chooseNumber: 4,
nextPrevious: 'previous',
selectFor: 'gotostart'
...
}
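The three examples above can be reproduced by running the movement pattern locally. This is a sketch under assumptions: the input is matched after removing spaces, the amount is taken to be digits, and an empty amount maps to 0 as in the "go to last space" example. `parseMovement` and `MovementMatch` are hypothetical names.

```typescript
// Movement pattern copied verbatim from the section above.
const movementPattern =
  /^(go|goto|gotothe|move|moveto|movethe)(last|previous|next|down)(.*?)(word[s]?|line[s]?|sentence[s]?|paragraph[s]?|para[s]?|char[s]?|character[s]?|space|\n\n|\n)$/;

interface MovementMatch {
  verb: string;
  nextPrevious: string;
  chooseNumber: number;
  unit: string;
}

// Hypothetical helper. Assumption: matching runs on text without spaces.
function parseMovement(spoken: string): MovementMatch | null {
  const compact = spoken.replace(/ /g, '');
  const match = movementPattern.exec(compact);
  if (match === null) {
    return null;
  }
  const amount = match[3];
  return {
    verb: match[1],
    nextPrevious: match[2],
    // Non-digit or empty amounts fall back to 0 in this sketch.
    chooseNumber: /^\d+$/.test(amount) ? parseInt(amount, 10) : 0,
    unit: match[4],
  };
}

const lastFourWords = parseMovement('go last 4 word');
const lastSpace = parseMovement('go to last space');
const previousFourWords = parseMovement('move the previous 4 words');
```

The verb alternatives overlap (`go`, `goto`, `gotothe`), so the regex engine backtracks to a longer verb when the shorter one is not followed by a direction word.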
Identifies an action on an item that includes direction and unit amount.
^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(the)?(last|previous|next)(.*?)(word[s]?|line[s]?|sentence[s]?|paragraph[s]?|para[s]?|char[s]?|character[s]?|space|\n\n|\n\ns|\n|\ns)$
Example voice commands
// choose the next 10 characters
{
name: AugnitoCommands.selectChar
isCommand: true,
chooseNumber: 10,
nextPrevious: 'next',
searchText: 'characters',
selectFor: ''
...
}
// underline previous new line
{
name: AugnitoCommands.selectLine
isCommand: true,
chooseNumber: 1,
nextPrevious: 'previous',
searchText: '\n',
selectFor: 'underline'
...
}
// capitalize the next 5 words
{
name: AugnitoCommands.selectWord
isCommand: true,
chooseNumber: 5,
nextPrevious: 'next',
searchText: 'words',
selectFor: 'capitalize'
...
}
// delete previous 5 paragraphs
{
name: AugnitoCommands.selectParagraph
isCommand: true,
chooseNumber: 5,
nextPrevious: 'previous',
searchText: '\n\n',
selectFor: 'delete'
...
}
Identifies the replace operation between two elements.
replace[d]?([A-Z a-z 0-9]+)with([A-Z a-z 0-9]+)
Example voice commands
// replace oranges with apples
{
name: AugnitoCommands.replace
isCommand: true,
searchText: 'oranges',
selectFor: 'apples'
...
}
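The replace pattern can be exercised the same way. The sketch below copies the expression verbatim and trims the captured groups, since the character classes include a space. `parseReplace` is a hypothetical helper, and mapping the groups to searchText and selectFor follows the example above rather than any documented SDK behavior.

```typescript
// Replace pattern copied verbatim from the section above.
const replacePattern = /replace[d]?([A-Z a-z 0-9]+)with([A-Z a-z 0-9]+)/;

interface ReplaceMatch {
  searchText: string;
  selectFor: string;
}

// Hypothetical helper: extracts the two operands of a replace command.
function parseReplace(spoken: string): ReplaceMatch | null {
  const match = replacePattern.exec(spoken);
  if (match === null) {
    return null;
  }
  // The captured groups may carry surrounding spaces, so trim them.
  return { searchText: match[1].trim(), selectFor: match[2].trim() };
}

const simpleReplace = parseReplace('replace oranges with apples');
const ambiguousReplace = parseReplace('replace tea with milk with coffee');
```

Because the first group is greedy, an utterance containing "with" more than once is split at the last occurrence: in the second call, the search text is `tea with milk` and the replacement is `coffee`.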
Identifies a Select/Action command.
^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(\sthe)?\s?(.*?)$
Example voice commands
// delete mistaken
{
name: AugnitoCommands.select
isCommand: true,
searchText: 'mistaken',
selectFor: 'delete'
...
}
The SDK includes a support class used to handle basic operations on the TextInput control.
The methods included for the processor are:
method | notes |
---|---|
processFinalResult | Based on the current selection status, the existing text and the received text, returns the final state of the text and where the cursor should be placed. |
selectLastLines | Based on the current text, the cursor position and the desired number of lines, returns a range indicating where in the string the selection starts and ends. |
selectLastWords | Based on the current text, the cursor position and the desired number of words, returns a range indicating where in the string the selection starts and ends. |
The usage of these methods is shown in the example app:
const processSelectWords = (
text: string,
caretPosition: number,
words: number,
selectFor: string,
updateStateFunction: (value: React.SetStateAction<string>) => void
) => {
const range = TextUtils.selectLastWords(text, words, caretPosition);
if (selectFor === Commands.Delete) {
const finalText =
text.substring(0, range.start) + text.substring(range.end);
setRequiresUpdateNativePropSelection(true);
updateStateFunction(finalText);
setCurrentSelection({
start: range.start,
end: range.start,
});
} else {
setRequiresUpdateNativePropSelection(true);
setCurrentSelection({
start: range.start,
end: range.end,
});
}
};
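To make the range semantics concrete, here is a standalone approximation of what a "select last words" helper could compute. It is not the SDK's implementation: `selectLastWordsSketch` and `TextRange` are hypothetical, the argument order simply mirrors the call above, and words are assumed to be separated by whitespace.

```typescript
interface TextRange {
  start: number;
  end: number;
}

// Hypothetical approximation, not the SDK's TextUtils.selectLastWords.
// Walks left from the caret over `words` whitespace-separated words.
function selectLastWordsSketch(
  text: string,
  words: number,
  caretPosition: number
): TextRange {
  const end = Math.min(Math.max(caretPosition, 0), text.length);
  let start = end;
  let remaining = words;
  while (start > 0 && remaining > 0) {
    // Skip any whitespace directly to the left of the current position.
    while (start > 0 && /\s/.test(text.charAt(start - 1))) {
      start--;
    }
    // Consume one word.
    while (start > 0 && !/\s/.test(text.charAt(start - 1))) {
      start--;
    }
    remaining--;
  }
  return { start, end };
}

const sample = 'Patient presents with fever and chills';
const range = selectLastWordsSketch(sample, 2, sample.length);
// Deleting the selection, as in the Commands.Delete branch above.
const afterDelete = sample.substring(0, range.start) + sample.substring(range.end);
```

With the caret at the end of the sample sentence, the range covers `and chills`, and the delete branch leaves the caret at `range.start`.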
To keep the app working in the background, Info.plist requires the following entries:
<key>UIBackgroundModes</key>
<array>
<string>audio</string>
<string>fetch</string>
</array>
The SDK includes an app that shows how to use the SDK with basic text controls. The app is located in the /example directory.
The app requires the same environment setup as any React Native app. The configuration for each supported OS and platform is described on the official page.
Besides the regular setup, the packages must be installed and a local .env file must be present. In the example directory:
npm install
For the .env file, copy .env.example and rename it. Then set each parameter to the value provided to you.
Once the repository is cloned or downloaded, go to the example directory and, if the environment is properly configured, follow the regular steps:
npx react-native start
npx react-native run-ios
FAQs
The npm package augnito-rn-sdk receives a total of 5 weekly downloads. As such, augnito-rn-sdk popularity was classified as not popular.
We found that augnito-rn-sdk demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 0 open source maintainers collaborating on the project.