tts-uk

High-fidelity speech synthesis for Ukrainian using modern neural networks.

Version: 1.3.7 (PyPI)

Maintainers: 1

Text-to-Speech for Ukrainian

Demo

Check out our demo on a Hugging Face Space, or just listen to the samples here.

Features

Multi-speaker model: 2 female (Tetiana, Lada) + 1 male (Mykyta) voices;
Fine-grained control over speech parameters, including duration, fundamental frequency (F0), and energy;
High-fidelity speech generation using the RAD-TTS++ acoustic model;
Fast vocoding using Vocos;
Synthesizes long sentences effectively;
Supports a sampling rate of 44.1 kHz;
Tested on Linux environments and Windows/WSL;
Python API (requires Python 3.9 or later);
CUDA-enabled for GPU acceleration.
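The last two items (Python 3.9 or later, optional CUDA acceleration) can be checked before installing anything heavy. The following is a minimal sketch, not part of tts-uk: `check_environment` is a hypothetical helper name, and it only relies on the standard library plus `torch.cuda.is_available()` when PyTorch happens to be installed.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 9)):
    """Report whether the interpreter is new enough and whether CUDA is visible.

    Hypothetical helper for illustration; tts-uk does not ship this function.
    """
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Imported lazily so the check also works on machines without PyTorch.
        import torch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    print(check_environment())
```

If `cuda_available` is false, synthesis should presumably still be possible on CPU, since the project links a CPU inference notebook, only more slowly.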

Installation

# Install from PyPI
pip install tts-uk

# OR, for the latest development version:
pip install git+https://github.com/egorsmkv/tts_uk

# OR, use git and local setup
git clone https://github.com/egorsmkv/tts_uk
cd tts_uk
uv sync # uv will handle the virtual environment

Read uv's installation section.

Also, you can download the repository as a ZIP archive.
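Whichever installation route you choose, it can help to confirm which version ended up in the active environment. A small sketch using only the standard library follows; `installed_version` is a hypothetical helper name, and the distribution name `tts-uk` is taken from the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="tts-uk"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "tts-uk is not installed in this environment")
```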

Getting started

Code example:

import torchaudio

from tts_uk.inference import synthesis

sampling_rate = 44_100

# Run the synthesis. The `synthesis` function returns:
# - mels: Mel spectrograms of the generated audio.
# - wave: The waveform produced by the vocoder, as a PyTorch tensor.
# - stats: A dictionary of synthesis statistics (processing time, duration, speech rate, etc.).
mels, wave, stats = synthesis(
    text="Ви можете протестувати синтез мовлення українською мовою. Просто введіть текст, який ви хочете прослухати.",
    voice="tetiana",  # tetiana, mykyta, lada
    n_takes=1,
    use_latest_take=False,
    token_dur_scaling=1,
    f0_mean=0,
    f0_std=0,
    energy_mean=0,
    energy_std=0,
    sigma_decoder=0.8,
    sigma_token_duration=0.666,
    sigma_f0=1,
    sigma_energy=1,
)

print(stats)

# Save the generated audio to a WAV file.
torchaudio.save("audio.wav", wave.cpu(), sampling_rate, encoding="PCM_S")
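To sanity-check the saved file, the clip length can be derived from the number of samples and the sampling rate. The sketch below is plain arithmetic and does not call tts-uk. `audio_duration_seconds` is a hypothetical helper, and passing `wave.shape[-1]` assumes the last tensor dimension is the sample axis, which the example above does not state explicitly.

```python
def audio_duration_seconds(num_samples, sampling_rate=44_100):
    """Clip length in seconds for a given sample count and sampling rate."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return num_samples / sampling_rate


if __name__ == "__main__":
    # With the tensor from the example above, one would presumably call:
    #   audio_duration_seconds(wave.shape[-1], sampling_rate)
    # (assumption: the last dimension of `wave` holds the samples).
    print(audio_duration_seconds(88_200))
```

The result can be compared against the duration reported in `stats`, if that dictionary exposes one.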

You can also use these Google Colab notebooks:

CPU inference
GPU inference on a T4 card (synthesizing a long document)

Or run synthesis in a terminal:

uv run example.py

If you need to synthesize whole articles, we recommend using wtpsplit to split the text into sentences first.
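One way to prepare an article is to split it into sentences and then group the sentences into chunks that are synthesized one at a time. The sketch below illustrates only the grouping step. `naive_sentences` is a crude regex stand-in (not wtpsplit, which would replace it in practice), both function names are hypothetical, and the `max_chars=300` limit is an arbitrary assumption rather than a documented constraint of tts-uk.

```python
import re


def naive_sentences(text):
    """Crude stand-in for a real sentence splitter such as wtpsplit.

    Splits after '.', '!', '?' or '…' when followed by whitespace.
    """
    parts = re.split(r"(?<=[.!?…])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(sentences, max_chars=300):
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    article = "Перше речення. Друге речення! Третє?"
    for chunk in chunk_sentences(naive_sentences(article), max_chars=20):
        print(chunk)
```

Each chunk could then be passed as `text` to `synthesis`, and the resulting waveforms concatenated. That last step is left out here because it depends on the tensor layout returned by the library.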

Get help and support

Please feel free to connect with us using the Issues section.

License

The code is released under the MIT license.

Model authors

Acoustic

Yehor Smoliakov, HF profile

Vocoder

Serhiy Stetskovych, HF profile

Community

Discord: https://bit.ly/discord-uds
Speech Recognition: https://t.me/speech_recognition_uk
Speech Synthesis: https://t.me/speech_synthesis_uk

Also, follow our Speech-UK initiative on Hugging Face!
