@ztiknl/sara
<input> Hi Sara, how are you?
(prompt / response) All systems operational!
<input> How is the weather in (where are we)?
(prompt / response) The current weather in Amsterdam, Netherlands is:
(weatherdetails)
<input> _
This package is currently a work in progress
Do not install via npm install @ztiknl/sara
Clone or download from Sara @ Github
Github documentation will be the Current/Latest testing build
NPM will be pushed occasionally when there shouldn't be any app-breaking bugs
Many changes are to be expected, do not expect backwards compatibility
Current version: 0.4.1
When the core program is more complete I will start semantic version 1.0.0
Sara is a command prompt, that listens for keyboard input or voice commands
Sara has a voice, and is able to respond to commands through text as well as audio
Sara is my (poor) attempt at making my own Jarvis/Alexa/Hey Google/Hi Bixby/Voice Response System
It runs in Node.js on a Raspberry Pi 3B, but should be able to run on earlier versions as well as other Linux distros
It has some internal commands, but can be extended through a self-made plugin system
Hearing works
Voice commands can be sent to the command line for editing, or immediately be processed without user intervention
This option selection is currently hidden away in hearing.js, but will be in the commandline arguments and config.json soon
Voice works
Voice output works, but further testing is required
Different voices (male and female) are now possible; soon there will be an option to select one, as well as a way to display a list of voices for each language
Vision works
All it does is take a picture every 30 minutes using a USB webcam
Pi camera not supported yet, will be supported later
There are object/face detection functions, as well as some other functions (age/expression/gender labeling) but NONE of these functions are connected to the webcam source image yet!
There are NO object/face recognition functions at this moment, but this will be added soon
Sara ignores the following words at sentence start:
sara
can you
will you
would you
could you
tell me
let me know
please
Sara also ignores the word please and the ? character at the end of commands
After stripping these words, the command is compared to the internal commands, and if it doesn't match any of them, it is compared to the regex string contained in every plugin .json file
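The stripping step described above could look roughly like the sketch below. This is not Sara's actual code: the names IGNORED_PREFIXES and stripCommand are made up for illustration, and lowercasing the input is an assumption.

```javascript
// Hypothetical helper; names are illustrative and not taken from Sara's source.
const IGNORED_PREFIXES = [
  'sara', 'can you', 'will you', 'would you', 'could you',
  'tell me', 'let me know', 'please'
];

function stripCommand(input) {
  // Assumption: input is lowercased; a trailing question mark is dropped.
  let text = input.trim().toLowerCase().replace(/\?+$/, '').trim();
  let changed = true;
  // Keep removing ignored phrases from the start until none match.
  while (changed) {
    changed = false;
    for (const prefix of IGNORED_PREFIXES) {
      if (text === prefix || text.startsWith(prefix + ' ')) {
        text = text.slice(prefix.length).trim();
        changed = true;
      }
    }
  }
  // "please" is also ignored elsewhere in the sentence.
  return text.replace(/\bplease\b/g, ' ').replace(/\s+/g, ' ').trim();
}

console.log(stripCommand('Sara can you please tell me what 10 + -9 is?'));
// prints: what 10 + -9 is
```

The remaining string is what would then be compared against the internal commands and the plugin regexes.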
Sara listens to the keyword 'Sara'
The hardware stated below is what I am using to build/test/run this project on
It should run on any linux distro, I didn't test but see no reason why it wouldn't
It might run on Mac OS, but I am unable to verify this as I do not have any Fruit branded devices
It doesn't run on Windows, because Sonus (speech recognition via Google Cloud) doesn't run on Windows
Hardware:
Software:
sudo apt-get install alsa-utils
sudo apt-get install fswebcam
Other:
"@google-cloud/text-to-speech": "^1.1.3",
"@google-cloud/translate": "^4.1.3",
"@google-cloud/vision": "^1.1.4",
"@tensorflow/tfjs-core": "^1.2.7",
"@tensorflow/tfjs-node": "^1.2.7",
"canvas": "^2.5.0",
"chalk": "^2.4.2",
"country-list": "^2.1.1",
"date-and-time": "^0.8.1",
"dav": "^1.8.0",
"decimal.js": "^10.2.0",
"face-api.js": "^0.20.1",
"geoip-lite": "^1.3.7",
"he": "^1.2.0",
"node-webcam": "^0.5.0",
"play-sound": "^1.1.3",
"public-ip": "^3.1.0",
"rollup": "^1.19.4",
"sonus": "^1.0.3",
"vcard-parser": "^1.0.0",
"weather-js2": "^2.0.2",
"weeknumber": "^1.1.1",
"wiki-entity": "^0.4.3"
npm install
node bin.js
node bin.js --help
For more information on the Google Cloud Speech API, see:
NPMJS.com/sonus/usage & NPMJS.com/sonus/how-do-i-set-up-google-cloud-speech-api
The Google API key file is located at ./resources/apikeys/googlecloud.json
For more information on how to set up your own custom hotword, see:
NPMJS.com/sonus/usage & NPMJS.com/sonus/how-do-i-make-my-own-hotword
The custom hotword file is located at ./resources/speechrecognition/Sarah.pmdl
I have tried to keep everything modular, so if something doesn't work on your system, you can disable that function through commandline arguments, config.json options file, or in the app itself
The vision command will be extended with object/face recognition, if and when I get that to work properly
start/stop colors
turns on/off colored responses/prompt
start/stop verbose
turns on/off verbose mode
Verbose mode will turn on display of output with a 'data' or 'warn' type
start/stop bootstrap
turns on/off bootstrap plugins
bootstrap list
displays the currently active bootstrap plugins
help
displays the main 'help' section
list help
displays a list of all help topics
help <topic>
displays help on the topic requested (still needs to be populated)
help <plugin.function>
displays help on the requested plugin function (currently placeholders)
add help
fill in the form and a new help topic is born!
edit help <topic>
if you find an error in a certain help topic, you can fix it.
start/stop listening
turns on/off speech recognition
start/stop hearing
same as above
start/stop voice
turns on/off text-to-speech
start/stop talking
same as above
start/stop speaking
same as above
silence
stop speaking the current sentence/item
quiet
same as above
voice list
display a list of all voices for the current language (config.json)
list voice
same as above
voices list
same as above
list voices
same as above
start/stop vision
turns on/off timer (30 min) for webcam snapshot to ./resources/vision/frame.png
start/stop watching
same as above
Nothing is done with this image at this time, but there are tests being done with detection and recognition...
Sara needs to 'understand' commands, and does this by comparing input to a regular expression found inside each plugin function's .json file
Example:
^(?:what|how\smuch)?\s?(?:is)?\s?(-?[0-9]+\.?(?:[0-9]+)?)\s?(?:\+|plus|\&|and)\s?(-?[0-9]+\.?(?:[0-9]+)?)\s?(?:is)?$
This regular expression matches the following sentences:
what is (-)10(.12) plus/and/+/& (-)10(.12)
what (-)10(.12) plus/and/+/& (-)10(.12) is
how much is (-)10(.12) plus/and/+/& (-)10(.12)
how much (-)10(.12) plus/and/+/& (-)10(.12) is
(-)10(.12) plus/and/+/& (-)10(.12) is
(-)10(.12) plus/and/+/& (-)10(.12)
Because Sara strips the starting words from the input, this allows it to recognize sentences such as:
Sara can you please tell me what 10 + -9 is?
In the above regex line, most groups are non-capturing: (?:xxx)
The capture groups (-?[0-9]+\.?(?:[0-9]+)?) grab these values and push them to math.js, which includes the function for processing these values
In the above example, math.js will receive an array object containing 3(!) items:
[0] the complete input string, in case the plugin still requires this string.
[1] the first captured group
[2] the second captured group
Therefore, the function math.add will receive these 3 array items, and return the result of x[1] + x[2]
x[0] is always the entire matching regex string
Using the input sentence above, then:
x[0] == "what 10 + -9 is"
x[1] == 10
x[2] == -9
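To see this for yourself, the regex can be tried directly in Node.js. Note that capture groups come back as strings, so they need converting before doing arithmetic. The add function below is only a stand-in to show the idea, not the real math.add:

```javascript
// The regex from the example above, as a JavaScript regex literal.
const addRegex = /^(?:what|how\smuch)?\s?(?:is)?\s?(-?[0-9]+\.?(?:[0-9]+)?)\s?(?:\+|plus|\&|and)\s?(-?[0-9]+\.?(?:[0-9]+)?)\s?(?:is)?$/;

// The input after Sara has stripped "Sara can you please tell me" and the "?".
const x = 'what 10 + -9 is'.match(addRegex);

console.log(x.length); // 3
console.log(x[0]);     // what 10 + -9 is
console.log(x[1]);     // 10  (a string)
console.log(x[2]);     // -9  (a string)

// Hypothetical stand-in for math.add: convert the captured strings, then add.
function add(match) {
  return String(Number(match[1]) + Number(match[2]));
}

console.log(add(x));   // 1
```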
(I am not a native English speaker, and I am not certain this is the correct term)
Sara is able to process subcommands through the use of parenthesis encapsulation
Example:
Sara can you tell me how much is 9 + (10 + 16)?
In this example, Sara will calculate 10 + 16 first, then calculate 9 + 26 afterwards
You can layer as many commands as you need; they will be processed starting with the innermost subcommand first:
11 + (7 + (root of 9))
subcmd: root of 9 = 3
subcmd: 7 + 3 = 10
finalcmd: 11 + 10 = 21
Some examples of what is possible:
((10 + ((root of 9) * (5³))) / 77) * (√9)
how is the weather in (where i am)
translate to german (what is gold)
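One way to get this innermost-first behaviour is to keep replacing any parenthesised group that has no parentheses inside it. The sketch below is an illustration under that assumption, not Sara's actual implementation; resolveSubcommands and toyEvaluate are made-up names, and toyEvaluate only knows two commands:

```javascript
// Hypothetical sketch: evaluate() stands in for Sara's real command handling.
function resolveSubcommands(input, evaluate) {
  // Matches a parenthesised group containing no further parentheses,
  // i.e. an innermost subcommand.
  const innermost = /\(([^()]*)\)/;
  let text = input;
  let match;
  while ((match = innermost.exec(text)) !== null) {
    const result = evaluate(match[1].trim());
    // Replace the subcommand (including its parentheses) with its result.
    text = text.slice(0, match.index) + result +
      text.slice(match.index + match[0].length);
  }
  return evaluate(text);
}

// Toy evaluator that only understands "root of x" and "x + y".
function toyEvaluate(command) {
  let m;
  if ((m = /^root of (-?\d+(?:\.\d+)?)$/.exec(command))) {
    return String(Math.sqrt(Number(m[1])));
  }
  if ((m = /^(-?\d+(?:\.\d+)?) \+ (-?\d+(?:\.\d+)?)$/.exec(command))) {
    return String(Number(m[1]) + Number(m[2]));
  }
  throw new Error('Unknown command: ' + command);
}

console.log(resolveSubcommands('11 + (7 + (root of 9))', toyEvaluate)); // 21
```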
These are created using (at least) 2 files:
pluginname_function.json
pluginname.js
The .js file contains all the javascript to deal with request X and push back a result
The result pushed back can be either a string such as '1999' (example question: 2000-1)
Or an array containing the text string, and the same string with SSML markup:
result = ['1999'];
result[1] = '<say-as interpret-as="cardinal">1999</say-as>';
More information on SSML markup can be found here
The .json file contains the name of the plugin, the name of the module (the .js file name), a Regular Expression string, a small description and explanation (used in help documentation)
math_add.json:
{
"name": "add",
"module": "math",
"regex": "^(?:what|how\\smuch)?\\s?(?:is)?\\s?(-?[0-9]+\\.?(?:[0-9]+)?)\\s?(?:\\+|plus|\\&|and)\\s?(-?[0-9]+\\.?(?:[0-9]+)?)\\s?(?:is)?$",
"description": "Add x to y",
"explanation": "This plugin teaches SARA how to calculate the sum of two numbers.\nType '-4 + 4.4' or one of the alternatives and SARA should respond."
}
One .js file can contain multiple module.exports functions, each function requires its own .json file
Example:
math.js
math_add.json
math_subtract.json
math_root.json
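As a rough idea of what such a .js file could look like, here is a small sketch. It is not the real math.js: the function signatures (receiving the match array x and returning the result directly) and the withSsml helper are assumptions made for this example.

```javascript
// math.js (illustrative sketch, not the actual plugin source)
// Each exported function is paired with its own .json file,
// e.g. math_add.json, math_subtract.json and math_root.json.

// Hypothetical helper: returns [plain text, same text with SSML markup].
function withSsml(value) {
  const text = String(value);
  return [text, '<say-as interpret-as="cardinal">' + text + '</say-as>'];
}

const math = {
  // x[0] is the full matched string, x[1] and x[2] are the captured groups.
  add: function (x) {
    return withSsml(Number(x[1]) + Number(x[2]));
  },
  subtract: function (x) {
    return withSsml(Number(x[1]) - Number(x[2]));
  },
  root: function (x) {
    return withSsml(Math.sqrt(Number(x[1])));
  }
};

module.exports = math;

console.log(math.subtract(['2000-1', '2000', '1'])[0]); // 1999
```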
Regular Expressions in these .json files need every backslash to be doubled, because a single backslash is itself an escape character in JSON:
"regex": "^(?:what|how\\smuch)?\\s?(?:is)?\\s?(-?[0-9]+\\.?(?:[0-9]+)?)\\s?(?:\\+|plus|\\&|and)\\s?(-?[0-9]+\\.?(?:[0-9]+)?)\\s?(?:is)?$",
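A quick way to check this is to parse the JSON text and build a RegExp from the result. The snippet below uses String.raw so the backslashes appear exactly as they would in the .json file; the variable names are only for this example:

```javascript
// The JSON text exactly as it would appear in math_add.json (shortened to two fields).
const jsonText = String.raw`{
  "name": "add",
  "regex": "^(?:what|how\\smuch)?\\s?(?:is)?\\s?(-?[0-9]+\\.?(?:[0-9]+)?)\\s?(?:\\+|plus|\\&|and)\\s?(-?[0-9]+\\.?(?:[0-9]+)?)\\s?(?:is)?$"
}`;

// JSON.parse turns every "\\" into a single "\", which is what RegExp expects.
const plugin = JSON.parse(jsonText);
const regex = new RegExp(plugin.regex);
console.log(regex.test('-4 + 4.4')); // true

// With single backslashes the file is not valid JSON at all.
let singleBackslashError = null;
try {
  JSON.parse(String.raw`{ "regex": "^\s?$" }`);
} catch (err) {
  singleBackslashError = err;
}
console.log(singleBackslashError instanceof SyntaxError); // true
```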
Since Sara removes certain words from the start of the sentence, the regex only needs to match the intent and, if variables need to be passed to the function, contain one or more working capture groups
The description and explanation are used by the help <plugin.function> function
All commands listed are functional, although some plugins will require adding more commands (math.power, etc)
More plugins are coming, see Todo list for what I'd like to add (if possible)...
what is 7 + 9
10 - 3.3
9 * 4
4 divided by 3
how much is 12 squared
root of 10
what 10³ is
hi
hello
hey
yo
good morning/afternoon/evening/night
how are you
how are you doing
how are you feeling
how are you doing today
how are you feeling at the moment
when were you made
who made you
how were you made
why were you made
where am I
where are you
what city are we in
what time zone are we in
in which province are we
what are your actual coordinates
Which country is this
weather
how is the weather
how is the weather in/around/near <place>
what is the weather like in/around/near <place>
weather forecast
what is the weather forecast
what is the weather forecast for <place>
Add connection details to file plugins/xbmc-remote/connection.json (see example file connection_example.json)
stop video/movie/film/playback/episode
stop the video/movie/film/playback/episode
stop this video/movie/film/playback/episode
pause/pause video/movie/film/playback/episode
resume the video/movie/film/playback/episode
continue this video/movie/film/playback/episode
media menu select
media menu back
media menu move up/down/left/right
media menu move up/down/left/right 5x
media menu move up/down/left/right 5*
media menu move up/down/left/right 5 times
media menu move up/down/left/right 5 entries
media menu move up/down/left/right 1 entry
media menu home
media menu info/information
media menu context
media menu submenu
what time is it
what is the date
what year is it
what month is it
what day it is
what is the week number
what is <subject>
more about <subject>
Add newsapi.org api key to file plugins/news/newsorg.json (see example file newsorg_example.json)
news headlines
tech news headlines
news headlines from bbc-news
news headlines in US
news headlines on bitcoin
Add google cloud api key to file resources/apikeys/googlecloud.json (see example file googlecloud_example.json)
translate to french <input>
translate to english <input>
translate to dutch <input>
translate to german <input>
rock
paper
scissors
tictactoe
Add connection details to file plugins/addressbook/connection.json (see example file connection_example.json)
list address book
list address books
list addressbook
addressbook list
search contact <term>
Add connection details to file plugins/calendar/connection.json (see example file connection_example.json)
list calendar
list calendars
By adding .js files to the ./bootstrap folder, you can add background tasks (interval stuff such as data syncs) to SARA
These .js files require 3 module.exports functions: start() and stop(), so that these tasks can be started/stopped, and status(), which returns true/false
Bootstrapping can be disabled by command line argument, voice/prompt command and config.json entry
Very basic example of a bootstrap file:
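var calsyncactive = false;
var caldaemon = null; // interval handle, shared by start() and stop()

module.exports = {
  start: function() {
    console.log('started calendar sync');
    // run calsync every 10 minutes (600000 ms)
    caldaemon = setInterval(calsync, 600000);
    calsyncactive = true;
  },
  stop: function() {
    console.log('stopped calendar sync');
    clearInterval(caldaemon);
    caldaemon = null;
    calsyncactive = false;
  },
  status: function() {
    // true while the sync task is running
    return calsyncactive;
  }
};

// placeholder for the actual sync work
function calsync() {
  console.log('test');
}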
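For illustration, this is roughly how a loader could check a bootstrap file and switch it on or off. This is a sketch under my own assumptions, not how Sara actually loads bootstrap plugins; isValidBootstrap, setBootstrap and exampleTask are hypothetical names.

```javascript
// Hypothetical check: a bootstrap module must expose start, stop and status.
function isValidBootstrap(task) {
  return ['start', 'stop', 'status'].every(function (name) {
    return typeof task[name] === 'function';
  });
}

// Hypothetical toggle: only start a stopped task, only stop a running one.
function setBootstrap(task, enable) {
  if (!isValidBootstrap(task)) {
    throw new Error('bootstrap file needs start(), stop() and status()');
  }
  if (enable && !task.status()) task.start();
  if (!enable && task.status()) task.stop();
  return task.status();
}

// Inline stand-in for a file in ./bootstrap
let timer = null;
const exampleTask = {
  start: function () { timer = setInterval(function () {}, 600000); },
  stop: function () { clearInterval(timer); timer = null; },
  status: function () { return timer !== null; }
};

console.log(setBootstrap(exampleTask, true));  // true
console.log(setBootstrap(exampleTask, false)); // false
```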
The only advice I can give is to make sure that alsa has the correct in/output device registered
My Raspberry Pi config:
ztik@sara:~/ $ arecord -l
**** List of CAPTURE Hardware Devices ****
>>> card 0: haobosou [haobosou], device 0: USB Audio [USB Audio] <<<
Subdevices: 1/1
Subdevice #0: subdevice #0
card 1: HD4110 [HP Webcam HD-4110], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
I use card 0, device 0 for my audio in (haobosou microphone, cheap and great quality audio)
ztik@sara:~/ $ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 2: ALSA [bcm2835 ALSA], device 0: bcm2835 ALSA [bcm2835 ALSA]
Subdevices: 7/7
Subdevice #0: subdevice #0
Subdevice #1: subdevice #1
Subdevice #2: subdevice #2
Subdevice #3: subdevice #3
Subdevice #4: subdevice #4
Subdevice #5: subdevice #5
Subdevice #6: subdevice #6
>>> card 2: ALSA [bcm2835 ALSA], device 1: bcm2835 IEC958/HDMI [bcm2835 IEC958/HDMI] <<<
Subdevices: 1/1
Subdevice #0: subdevice #0
I use the HDMI output on my raspi for audio out, so I am using card 2, device 1 here
My config file:
ztik@sara:~/ $ cat ~/.asoundrc
pcm.!default {
    type asym
    playback.pcm {
        type plug
        slave.pcm "hw:2,1"
    }
    capture.pcm {
        type plug
        slave.pcm "hw:0,0"
    }
}
This solved every issue I had with aplay and arecord
Using these settings I am able to record from the proper input device with the following command:
arecord -d 10 test.wav
and play that recording using:
aplay test.wav
Anything on support beyond this should be requested at alsa/linux forums I guess
Feel free to ask, but don't expect an answer...
I understand people can have problems getting through this, so here is a small guide
Set up a Cloud Project
To set that up, you'll need to create a new project in the Cloud Platform Console
1.1 Click on 'Create project' at the top menu
1.2 Enter a name and (optional) organisation
1.3 Click 'Create'
Enable billing for your project
Google FAQ on Billing
2.1 Click on the top left menu (three white dashes), and click the 'Billing' entry
2.2 Click on 'Create account' if you don't have any
Enable APIs you want to connect with
You don't have to use them all, if you don't want/need a certain module (voice/speech recognition/vision/translation), don't activate it
(All these modules can be deactivated, except translate which is a plugin)
Create project key file
4.1 Open the top-left menu again, go to APIs & Services and click on Credentials
(should lead you here)
4.2 Click on 'Create credentials' then on 'Service account key'
4.3 When prompted to create a new service account select 'Project' -> 'Owner'
4.4 As the service account, select your project and use JSON key type
4.5 Close the confirmation and a .json file should be downloaded
4.6 Copy this file, or its contents to ./resources/apikeys/googlecloud.json
or
4.6 Add a checkmark in front of the new Service account key, and click on 'Manage service accounts'
4.7 Click on your service account, and copy all the information to ./resources/apikeys/googlecloud.json
(or use googlecloud_example.json and rename file)
Some details on pricing:
Translate API pricing
At the time of writing, no free requests, $20 per 1 million characters (individual letters)
Speech API pricing
At the time of writing, 60 minutes free requests
Text-to-Speech API pricing
At the time of writing, 1 million characters (individual letters) free requests
Vision API pricing
At the time of writing, 1000 free requests per month
Apart from the Translation API, everything should be testable for free
But don't take my word for it, check the Billing page occasionally!
The microphone I use is a 'C-Media Haobosou G11 Touch Induction' and for a couple of days I have been having problems with it
When connecting the microphone, the blue power indicator would light up, and after 2 seconds it would turn off again
Pressing the touch induction area has no effect, and thus I am left with a disabled mic
It IS recognized by lsusb/hwinfo/arecord -l/dmesg but it is OFF
After three days of wrestling, I found the solution somewhere online (lost the url, no credits, sry)
ztik@sara:~/nodejs/sara $ amixer
Simple mixer control 'Mic',0
Capabilities: cvolume cvolume-joined cswitch cswitch-joined
Capture channels: Mono
Limits: Capture 0 - 62
Mono: Capture 53 [85%] [18.07dB] [off]
ztik@sara:~/nodejs/sara $ amixer set Mic 80% cap
ztik@sara:~/nodejs/sara $ amixer
Simple mixer control 'Mic',0
Capabilities: cvolume cvolume-joined cswitch cswitch-joined
Capture channels: Mono
Limits: Capture 0 - 62
Mono: Capture 50 [81%] [16.59dB] [on]
There is probably a better command for turning the mic on, but this also sets the recording volume at 80%, which is my personal preference
The vision module works, but all it does is take a picture every 30 min, no further processing connected at this moment
list help command, and/or display all topics using help
(--speak-all, --speak-response-only)
I would like to point out that I simply put this hardware and these programs and modules together, but without the people who created those, I would have had nothing at all!
Thank you to those involved making:
Hope I didn't miss anyone here, if so, please let me know and I will update!
I am a complete moron when it comes to asynchronous programming, and I am positive that many functions could have been written better/cleaner/more efficiently.
I made this project to enhance my understanding of Node.js/Javascript, so please remain calm if/when I don't understand your comment/code/bugfix/pull request/advice/issue at first glance.
FAQs
Sentient Artificial Responsive Agent
The npm package @ztiknl/sara receives a total of 12 weekly downloads. As such, @ztiknl/sara popularity was classified as not popular.
We found that @ztiknl/sara demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.