A wrapper around Espeak and Mbrola, to do simple Text-To-Speech (TTS), with the possibility to tweak the phonemic form.
This is a lightweight Python wrapper for Espeak and Mbrola, two co-dependent TTS tools. It lets you render sound by simply feeding it text and voice parameters. Phonemes (the data transmitted by Espeak to Mbrola) can also be manipulated using a minimalistic API.
This is a short introduction; for more detail, see the Read the Docs documentation.
These instructions should work on any Debian/Ubuntu derivative.
Install with pip as:
pip install voxpopuli
You have to have espeak and mbrola installed beforehand:
sudo apt install mbrola espeak
You'll also need some Mbrola voices installed. You can either download them from the Mbrola project page and unpack them in /usr/share/mbrola/<lang><voiceid>/, or, more simply, install them from the Ubuntu repos. All the voice packages are named mbrola-<lang><voiceid>. You can even install all the available voices at once by running:
sudo apt install mbrola-*
In case the voices you need aren't all in the Ubuntu repos, you can use this convenient little script that installs voices directly from Mbrola's voice repo:
# this installs all British English and French voices, for instance
sudo python3 -m voxpopuli.voice_install en fr
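To check which voices ended up on disk, something like the following sketch may help. It assumes voices live in directories named <lang><voiceid> under /usr/share/mbrola/, as described above. The function name installed_voices and the regular expression are illustrative and are not part of voxpopuli.

```python
import re
from pathlib import Path

# Assumption: a voice directory is named <lang><voiceid>, e.g. "fr1" or "en1".
VOICE_DIR_PATTERN = re.compile(r"^([a-z]+)(\d+)$")


def installed_voices(root="/usr/share/mbrola"):
    """Return a dict mapping language codes to sorted lists of voice ids."""
    voices = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return voices  # nothing installed, or a different install location
    for entry in root_path.iterdir():
        match = VOICE_DIR_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            lang, voice_id = match.group(1), int(match.group(2))
            voices.setdefault(lang, []).append(voice_id)
    return {lang: sorted(ids) for lang, ids in voices.items()}


if __name__ == "__main__":
    print(installed_voices())
```

An empty dict means either that no voices are installed or that they were unpacked somewhere other than the assumed root.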
The simplest use of this library is bare TTS, using a voice and a text. The rendered audio is returned as a bytes object containing .wav data:
from voxpopuli import Voice
voice = Voice(lang="fr")
wav = voice.to_audio("salut c'est cool")
Evaluating type(wav) would return bytes. You can then save the audio by opening a file in wb mode:
with open("salut.wav", "wb") as wavfile:
    wavfile.write(wav)
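Because the result is ordinary .wav data held in memory, it can also be inspected without touching the disk. The sketch below uses only the standard library. Since voxpopuli and its system dependencies may not be installed, the bytes are produced by a hypothetical helper, make_silent_wav, standing in for voice.to_audio. The helper describe_wav is likewise illustrative.

```python
import io
import wave


def make_silent_wav(seconds=0.5, rate=16000):
    """Hypothetical stand-in for voice.to_audio(): returns .wav data as bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


def describe_wav(wav_bytes):
    """Read the header of in-memory .wav data and return basic properties."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }


wav = make_silent_wav()
info = describe_wav(wav)
print(info)
```

With the real library, passing the bytes returned by voice.to_audio to describe_wav should work the same way, provided the output is standard PCM .wav data.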
If you wish to hear how it sounds right away, make sure you have installed pyaudio via pip, and then do:
voice.say("Salut c'est cool")
You can also, say, use scipy to get the PCM audio as an ndarray:
from scipy.io.wavfile import read, write
from io import BytesIO
rate, wave_array = read(BytesIO(wav))
reversed_wave = wave_array[::-1]  # reversing the sound file
write("tulas.wav", rate, reversed_wave)
You can set some parameters on the voice, such as language or pitch:
from voxpopuli import Voice
# really slow voice with high pitch
voice = Voice(lang="us", pitch=99, speed=40, voice_id=2)
voice.say("I'm high on helium")
The exhaustive list of parameters is:
listvoices method from a Voice instance.
To render a string of text to audio, the Voice object actually chains Espeak's output to Mbrola, which then renders it to audio. Espeak only renders the text to a list of phonemes (such as the ones in the IPA), which are then processed by Mbrola.
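The chaining described above can be sketched with two subprocess calls. This is an illustration of the idea, not voxpopuli's actual implementation. The command-line flags (espeak's -q, --pho and -v with an "mb-" voice name, mbrola's -e, "-" for stdin and "-.wav" for stdout) and the voice database path are assumptions that should be checked against the local man pages and install layout.

```python
import subprocess


def build_commands(text, lang="en", voice_id=1):
    """Build the two command lines of an espeak -> mbrola chain.

    Every flag and path here is an assumption to verify locally.
    """
    voice = f"{lang}{voice_id}"
    # espeak: quiet mode, emit phoneme data for an mbrola voice on stdout
    espeak_cmd = ["espeak", "-q", "--pho", "-v", f"mb-{voice}", text]
    # mbrola: read phonemes from stdin, write .wav data to stdout
    mbrola_cmd = ["mbrola", "-e", f"/usr/share/mbrola/{voice}/{voice}", "-", "-.wav"]
    return espeak_cmd, mbrola_cmd


def render(text, lang="en", voice_id=1):
    """Pipe espeak's phoneme output into mbrola and return the audio bytes."""
    espeak_cmd, mbrola_cmd = build_commands(text, lang, voice_id)
    phonemes = subprocess.run(espeak_cmd, capture_output=True, check=True).stdout
    audio = subprocess.run(mbrola_cmd, input=phonemes, capture_output=True, check=True).stdout
    return audio


if __name__ == "__main__":
    # Only print the commands; calling render() requires both tools and a voice.
    for command in build_commands("Hello world"):
        print(" ".join(command))
```

Keeping command construction separate from execution makes it easy to inspect what would run before either tool is invoked.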
For those who like pictures, here is a diagram of what happens when you run voice.to_audio("Hello world").
Phonemes are represented sequentially by a code, a duration in milliseconds, and a list of pitch modifiers. The pitch modifiers are a list of pairs, each pair giving the percentage of the sample at which to apply the pitch modification and the pitch itself.
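To make that representation concrete, here is a minimal sketch of a phoneme record and its one-line text form. The class SketchPhoneme is an illustrative stand-in and not voxpopuli's own phoneme class. The whitespace-separated layout "code duration position pitch position pitch ..." and the use of integer values are assumptions based on the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SketchPhoneme:
    """Illustrative stand-in, not voxpopuli's own phoneme class."""
    name: str                      # phoneme code
    duration: int                  # duration in milliseconds
    pitch_mods: List[Tuple[int, int]] = field(default_factory=list)  # (percent, pitch)

    @classmethod
    def from_line(cls, line):
        """Parse 'code duration [percent pitch]...' into a SketchPhoneme."""
        parts = line.split()
        name, duration = parts[0], int(parts[1])
        numbers = [int(part) for part in parts[2:]]
        # Group the remaining numbers two by two: (position percent, pitch)
        pairs = list(zip(numbers[0::2], numbers[1::2]))
        return cls(name, duration, pairs)

    def to_line(self):
        """Serialize back to the same one-line text form."""
        mods = " ".join(f"{pos} {pitch}" for pos, pitch in self.pitch_mods)
        return f"{self.name} {self.duration} {mods}".strip()


phoneme = SketchPhoneme.from_line("@ 90 0 120 80 110")
print(phoneme)
print(phoneme.to_line())
```

Here the sample line describes a 90 ms phoneme whose pitch is set at 0% and at 80% of its length.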
Funny thing is, with voxpopuli, you can "intercept" that phoneme list as a simple object, modify it, and then pass it back to the voice to render it to audio. For instance, let's make a simple alteration that triples the duration of each vowel in an English text.
from voxpopuli import Voice, BritishEnglishPhonemes
voice = Voice(lang="en")
# here's how you get the phonemes list
phoneme_list = voice.to_phonemes("Now go away or I will taunt you a second time.")
for phoneme in phoneme_list:  # the phoneme list object inherits from list
    if phoneme.name in BritishEnglishPhonemes.VOWELS:
        phoneme.duration *= 3
# rendering and saving the sound, then saying it out loud:
voice.to_audio(phoneme_list, "modified.wav")
voice.say(phoneme_list)
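The same alteration can be wrapped in a small reusable helper. The sketch below is hypothetical and runs without voxpopuli: stretch_vowels is not a library function, and the demo data uses SimpleNamespace objects as stand-ins for real phonemes. It only assumes that each phoneme exposes name and duration attributes, as in the example above.

```python
from types import SimpleNamespace


def stretch_vowels(phonemes, vowels, factor=2.0):
    """Multiply the duration of every phoneme whose name is in `vowels`.

    Works on any objects exposing `name` and `duration` attributes; the list
    is modified in place and also returned for convenience.
    """
    for phoneme in phonemes:
        if phoneme.name in vowels:
            phoneme.duration = int(phoneme.duration * factor)
    return phonemes


# Hypothetical stand-in data; with voxpopuli installed, the list would come
# from voice.to_phonemes(...) and the vowel set from BritishEnglishPhonemes.VOWELS.
demo = [
    SimpleNamespace(name="g", duration=60),
    SimpleNamespace(name="@U", duration=120),
]
stretch_vowels(demo, vowels={"@U"}, factor=2.0)
print([(p.name, p.duration) for p in demo])
```

Durations are truncated to integers here on the assumption that they are whole milliseconds.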
Notes:
BritishEnglishPhonemes class as above.