
This project provides a real-time speech-to-text translation system built on a modular server–client architecture.
The program can be used either as a command-line tool or as a Python API in other applications, with full support for non-blocking and asynchronous workflows.
Before running the project, you need to install the following system dependencies:
sudo apt-get install portaudio19-dev ffmpeg
RECOMMENDED: Install this package inside a virtual environment to avoid dependency conflicts:
python -m venv .venv
source .venv/bin/activate
Install the PyPI package:
pip install live-translation
Verify the installation:
python -c "import live_translation; print(f'live-translation installed successfully\n{live_translation.__version__}')"
NOTE: Warnings similar to the following, which may appear on Linux systems when the client opens the microphone, can be safely ignored:
ALSA lib pcm_dsnoop.c:567:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1000:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2722:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_dmix.c:1000:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
The server can be run directly from the command line:
live-translate-server [OPTIONS]
[OPTIONS]
usage: live-translate-server [-h] [--silence_threshold SILENCE_THRESHOLD] [--vad_aggressiveness {0,1,2,3,4,5,6,7,8,9}] [--max_buffer_duration {5,6,7,8,9,10}] [--device {cpu,cuda}]
[--whisper_model {tiny,base,small,medium,large,large-v2,large-v3,large-v3-turbo}] [--trans_model {Helsinki-NLP/opus-mt,Helsinki-NLP/opus-mt-tc-big}]
[--src_lang SRC_LANG] [--tgt_lang TGT_LANG] [--log {print,file}] [--ws_port WS_PORT] [--transcribe_only] [--version]
Live Translation Server - Configure runtime settings.
options:
-h, --help show this help message and exit
--silence_threshold SILENCE_THRESHOLD
Number of consecutive 32ms silent chunks to detect SILENCE.
SILENCE clears the audio buffer for transcription/translation.
NOTE: Minimum value is 16.
Default is 65 (~ 2s).
--vad_aggressiveness {0,1,2,3,4,5,6,7,8,9}
Voice Activity Detection (VAD) aggressiveness level (0-9).
Higher values mean VAD has to be more confident to detect speech vs silence.
Default is 8.
--max_buffer_duration {5,6,7,8,9,10}
Max audio buffer duration in seconds before trimming it.
Default is 7 seconds.
--device {cpu,cuda} Device for processing ('cpu', 'cuda').
Default is 'cpu'.
--whisper_model {tiny,base,small,medium,large,large-v2,large-v3,large-v3-turbo}
Whisper model size ('tiny', 'base', 'small', 'medium', 'large', 'large-v2', 'large-v3', 'large-v3-turbo').
NOTE: Running large models like 'large-v3', or 'large-v3-turbo' might require a decent GPU with CUDA support for reasonable performance.
NOTE: large-v3-turbo has great accuracy while being significantly faster than the original large-v3 model. see: https://github.com/openai/whisper/discussions/2363
Default is 'base'.
--trans_model {Helsinki-NLP/opus-mt,Helsinki-NLP/opus-mt-tc-big}
Translation model ('Helsinki-NLP/opus-mt', 'Helsinki-NLP/opus-mt-tc-big').
NOTE: Don't include source and target languages here.
Default is 'Helsinki-NLP/opus-mt'.
--src_lang SRC_LANG Source/Input language for transcription (e.g., 'en', 'fr').
Default is 'en'.
--tgt_lang TGT_LANG Target language for translation (e.g., 'es', 'de').
Default is 'es'.
--log {print,file} Optional logging mode for saving transcription output.
- 'file': Save each result to a structured .jsonl file in ./transcripts/transcript_{TIMESTAMP}.jsonl.
- 'print': Print each result to stdout.
Default is None (no logging).
--ws_port WS_PORT WebSocket port of the server.
Used to listen for client audio and publish output (e.g., 8765).
--transcribe_only Transcribe only mode. No translations are performed.
--version Print version and exit.
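Note that --silence_threshold is a count of 32 ms chunks, so the default of 65 works out to 65 × 32 ms ≈ 2.08 s. The conversion can be sketched as follows; silence_chunks is a hypothetical helper for picking a flag value, not part of the package, and assumes only the 32 ms chunk size and 16-chunk minimum stated in the help above:

```python
# Convert a desired silence duration into a --silence_threshold value.
# Assumes the 32 ms chunk size and the 16-chunk minimum documented in
# the server's --help output; this helper is illustrative only.
def silence_chunks(seconds, chunk_ms=32, minimum=16):
    return max(minimum, round(seconds * 1000 / chunk_ms))

print(silence_chunks(2.08))  # 65, the documented default (~2 s)
print(silence_chunks(0.1))   # clamped to the documented minimum of 16
```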
The client can be run directly from the command line:
live-translate-client [OPTIONS]
[OPTIONS]
usage: live-translate-client [-h] [--server SERVER] [--version]
Live Translation Client - Stream audio to the server.
options:
-h, --help show this help message and exit
--server SERVER WebSocket URI of the server (e.g., ws://localhost:8765)
--version Print version and exit.
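Putting the two commands together, a typical session might look like the following (all flags are documented above; the WebSocket port must match on both sides):

```shell
# Terminal 1: start the server, translating English to Spanish and
# printing each result to stdout.
live-translate-server --src_lang en --tgt_lang es --log print --ws_port 8765

# Terminal 2: stream microphone audio to the server.
live-translate-client --server ws://localhost:8765
```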
You can also import and use live_translation directly in your Python code. The following are simple examples of running live_translation's server and client in a blocking fashion. For more detailed examples showing non-blocking and asynchronous workflows, see examples/.
NOTE: The examples below assume the live_translation package has been installed as shown in the Installation section.
NOTE: A provided example in examples/ can be run as python -m examples.<example_name>. For example: python -m examples.magic_word. Running the examples this way from inside the repository assumes a development environment has been set up; see the Development & Contribution section below.
Server
from live_translation import LiveTranslationServer, ServerConfig

def main():
    config = ServerConfig(
        device="cpu",
        ws_port=8765,
        log="print",
        transcribe_only=False,
    )
    server = LiveTranslationServer(config)
    server.run(blocking=True)

# The main guard is CRITICAL on systems that use the spawn method to
# create new processes (i.e., Windows and macOS).
if __name__ == "__main__":
    main()
Client
from live_translation import LiveTranslationClient, ClientConfig

def parser_callback(entry, *args, **kwargs):
    """Parse the output received from the server.

    Args:
        entry (dict): The message from the server.
        *args: Optional positional args passed from the client.
        **kwargs: Optional keyword args passed from the client.
    """
    print(f"📝 {entry['transcription']}")
    print(f"🌍 {entry['translation']}")
    # Return True to signal the client to shut down; False keeps it running.
    return False

def main():
    config = ClientConfig(server_uri="ws://localhost:8765")
    client = LiveTranslationClient(config)
    client.run(
        callback=parser_callback,
        callback_args=(),    # Optional: positional args to pass
        callback_kwargs={},  # Optional: keyword args to pass
        blocking=True,
    )

if __name__ == "__main__":
    main()
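The callback's return value controls the client's lifetime, which is the idea behind examples/magic_word. A minimal sketch of such a callback follows; magic_word_callback is hypothetical and assumes only the entry keys shown in the example above:

```python
# Hypothetical callback that shuts the client down once a chosen word
# appears in the transcription. Only the 'transcription' key from the
# documented entry dict is assumed.
def magic_word_callback(entry, magic_word="stop"):
    text = entry.get("transcription", "")
    print(f"📝 {text}")
    # True tells the client to shut down; False keeps it running.
    return magic_word.lower() in text.lower()
```

It would be passed just like parser_callback above, e.g. client.run(callback=magic_word_callback, callback_kwargs={"magic_word": "goodbye"}, blocking=True).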
To contribute to or modify this project, the following steps might be helpful:
NOTE: The workflow below was developed with Linux-based systems in mind, with typical build tools (e.g., Make) installed. On other systems you might need to install Make and possibly other tools. You can also skip Make and run things manually, for example running the tests with python -m pytest -s tests/ instead of make test. See the Makefile for more details.
Fork & Clone the repository:
git clone git@github.com:<your-username>/live-translation.git
cd live-translation
Create a virtual environment:
python -m venv .venv
source .venv/bin/activate
Install Dependencies:
pip install --upgrade pip
pip install -r requirements.txt
Test the package:
make test
Build the package:
make build
NOTE: Building lints and checks formatting using ruff. You can do that separately with make format and make lint. For linting and formatting rules, see the ruff config.
NOTE: Building generates a .whl file that can be pip-installed in a new environment for testing.
If needed, run the server and the client within the virtual environment:
python -m live_translation.server.cli [OPTIONS]
python -m live_translation.client.cli [OPTIONS]
For contribution:
This project was tested and developed on the following system configuration: see requirements.txt and the Prerequisites section above.
Notes for contributors:
- Source and target languages are configured through a single src_lang and tgt_lang pair, as it's currently done.
- Feedback on running heavier models (e.g., the large-v3-turbo Whisper model) is welcome; this will help with hardware requirements and deployment decisions.
@article{Whisper,
title = {Robust Speech Recognition via Large-Scale Weak Supervision},
url = {https://arxiv.org/abs/2212.04356},
author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
publisher = {arXiv},
year = {2022}
}
@misc{SileroVAD,
author = {Silero Team},
title = {Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/snakers4/silero-vad}},
email = {hello@silero.ai}
}
@article{tiedemann2023democratizing,
title={Democratizing neural machine translation with {OPUS-MT}},
author={Tiedemann, J{\"o}rg and Aulamo, Mikko and Bakshandaeva, Daria and Boggia, Michele and Gr{\"o}nroos, Stig-Arne and Nieminen, Tommi and Raganato, Alessandro and Scherrer, Yves and Vazquez, Raul and Virpioja, Sami},
journal={Language Resources and Evaluation},
number={58},
pages={713--755},
year={2023},
publisher={Springer Nature},
issn={1574-0218},
doi={10.1007/s10579-023-09704-w}
}
@InProceedings{TiedemannThottingal:EAMT2020,
author = {J{\"o}rg Tiedemann and Santhosh Thottingal},
title = {{OPUS-MT} — {B}uilding open translation services for the {W}orld},
booktitle = {Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT)},
year = {2020},
address = {Lisbon, Portugal}
}
CUDA as the device is probably needed for heavier Whisper models like large-v3-turbo. Nvidia drivers, the CUDA Toolkit, and cuDNN must be installed if the "cuda" option is to be used. ↩
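As a rough preflight check before passing --device cuda (or ServerConfig(device="cuda")), one can look for the NVIDIA driver tools on PATH. This is only a heuristic sketch, not a substitute for verifying the CUDA Toolkit and cuDNN installations:

```python
import shutil

# Heuristic: if nvidia-smi isn't on PATH, a working CUDA setup is
# unlikely, so fall back to the CPU. A positive hit still doesn't
# guarantee the CUDA Toolkit and cuDNN are installed.
device = "cuda" if shutil.which("nvidia-smi") else "cpu"
print(device)  # pass this value as --device (or ServerConfig(device=...))
```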