en-tts

Web app, command-line interface and Python library for synthesizing English texts into speech.

  • 0.0.2
  • PyPI

Installation

pip install en-tts --user

Usage as web app

Visit 🤗 Hugging Face for a live demo.

You can also run it locally by executing en-tts-web in the CLI and opening http://127.0.0.1:7860 in your browser.

Usage as CLI

en-tts-cli synthesize "When the sunlight strikes raindrops in the air, they act as a prism and form a rainbow."

The output can be listened to here.

Usage as library

from pathlib import Path
from tempfile import gettempdir

from en_tts import Synthesizer, Transcriber, normalize_audio, save_audio

text = "When the sunlight strikes raindrops in the air, they act as a prism and form a rainbow."

# Create the transcriber (text to IPA) and the synthesizer (IPA to audio)
transcriber = Transcriber()
synthesizer = Synthesizer()

# Transcribe the English text to IPA, then synthesize speech from the IPA
text_ipa = transcriber.transcribe_to_ipa(text)
audio = synthesizer.synthesize(text_ipa)

# Save the result to the system's temporary directory
tmp_dir = Path(gettempdir())
save_audio(audio, tmp_dir / "output.wav")

# Optional: normalize output
normalize_audio(tmp_dir / "output.wav", tmp_dir / "output_norm.wav")

Model info

The TTS model used is published here.

Evaluation results (GT = ground truth):

  • MOS naturalness: 3.55 ± 0.28 (GT: 4.17 ± 0.23)
  • MOS intelligibility: 4.44 ± 0.24 (GT: 4.63 ± 0.19)
  • Mean MCD-DTW: 29.15
  • Mean penalty: 0.1018

Phoneme set

  • Vowels: i, u, æ, ɑ, ɔ, ə, ɛ, ɪ, ʊ, ʌ
  • Diphthongs: aɪ, aʊ, eɪ, oʊ, ɔɪ
  • R-colored vowels: ɔr, ər, ɛr, ɪr, ʊr, ʌr
  • Consonants: b, d, dʒ, f, h, j, k, l, m, n, p, r, s, t, tʃ, v, w, z, ð, ŋ, ɡ, ʃ, θ
  • Breaks:
    • SIL0 (no break)
    • SIL1 (short break)
    • SIL2 (break)
    • SIL3 (long break)
  • Special characters: . ? ! , : ; - — " ' ( ) [ ]
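The inventory above can be written down as plain Python sets, which makes it easy to check whether a string uses only known phonemes. The sketch below is not part of en-tts: `tokenize` is a hypothetical helper, and the greedy longest-match strategy is an assumption, since the README does not say how the library itself splits IPA input. It handles marker-free phoneme strings only (no breaks, special characters, stress or duration markers).

```python
# Phoneme inventory copied from the list above.
# tokenize() is a hypothetical helper, not an en-tts function.
VOWELS = {"i", "u", "æ", "ɑ", "ɔ", "ə", "ɛ", "ɪ", "ʊ", "ʌ"}
DIPHTHONGS = {"aɪ", "aʊ", "eɪ", "oʊ", "ɔɪ"}
R_COLORED = {"ɔr", "ər", "ɛr", "ɪr", "ʊr", "ʌr"}
CONSONANTS = {
    "b", "d", "dʒ", "f", "h", "j", "k", "l", "m", "n", "p", "r",
    "s", "t", "tʃ", "v", "w", "z", "ð", "ŋ", "ɡ", "ʃ", "θ",
}
PHONEMES = VOWELS | DIPHTHONGS | R_COLORED | CONSONANTS
MAX_LEN = max(len(p) for p in PHONEMES)


def tokenize(ipa: str) -> list[str]:
    """Split a marker-free IPA string into phonemes, longest match first."""
    tokens = []
    pos = 0
    while pos < len(ipa):
        # Try the longest candidate first so "oʊ" wins over a shorter match.
        for size in range(MAX_LEN, 0, -1):
            candidate = ipa[pos:pos + size]
            if candidate in PHONEMES:
                tokens.append(candidate)
                pos += size
                break
        else:
            raise ValueError(f"unknown symbol at position {pos}: {ipa[pos]!r}")
    return tokens


print(tokenize("reɪnboʊ"))
```

Because the match is greedy, a sequence such as ɔ followed by r is always read as the r-colored vowel ɔr, even where a vowel plus a separate consonant was intended; a real tokenizer would need a separator or extra context to tell the two apart.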

Each vowel, diphthong, r-colored vowel and consonant can have one of these duration markers:

  • ˘ -> very short, e.g., oʊ˘
  • nothing -> normal, e.g., oʊ
  • ˑ -> half long, e.g., oʊˑ
  • ː -> long, e.g., oʊː

Furthermore, each vowel, diphthong and r-colored vowel can have a leading stress symbol attached:

  • ˈ -> primary stress, e.g., ˈoʊ
  • ˌ -> secondary stress, e.g., ˌoʊ
  • nothing -> no stress, e.g., oʊ

Stress and duration markers can be combined, e.g., ˌoʊː
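The marker rules above can be sketched as a small parser that splits one marked symbol into its stress, base phoneme and duration. This is only an illustration under stated assumptions: `Symbol` and `parse_symbol` are hypothetical names, not part of the en-tts API, and the function does not check that the base phoneme belongs to the full inventory. It only enforces the rule that stress may be attached to vowels, diphthongs and r-colored vowels.

```python
from typing import NamedTuple

# Marker tables copied from the lists above.
STRESS = {"ˈ": "primary", "ˌ": "secondary"}
DURATION = {"˘": "very short", "ˑ": "half long", "ː": "long"}

# Only vowels, diphthongs and r-colored vowels may carry a stress marker.
STRESSABLE = {
    "i", "u", "æ", "ɑ", "ɔ", "ə", "ɛ", "ɪ", "ʊ", "ʌ",
    "aɪ", "aʊ", "eɪ", "oʊ", "ɔɪ",
    "ɔr", "ər", "ɛr", "ɪr", "ʊr", "ʌr",
}


class Symbol(NamedTuple):
    """Hypothetical container for one parsed symbol."""
    stress: str
    phoneme: str
    duration: str


def parse_symbol(symbol: str) -> Symbol:
    """Split one marked symbol into stress, base phoneme and duration."""
    stress = "none"
    duration = "normal"
    # A stress marker, if present, is the leading character.
    if symbol[:1] in STRESS:
        stress = STRESS[symbol[0]]
        symbol = symbol[1:]
    # A duration marker, if present, is the trailing character.
    if symbol[-1:] in DURATION:
        duration = DURATION[symbol[-1]]
        symbol = symbol[:-1]
    if not symbol:
        raise ValueError("no base phoneme")
    if stress != "none" and symbol not in STRESSABLE:
        raise ValueError(f"{symbol!r} cannot carry a stress marker")
    return Symbol(stress, symbol, duration)


print(parse_symbol("ˌoʊː"))
```

Returning a named tuple keeps the three parts separate, so callers can, for example, strip all duration markers or count stressed vowels without re-parsing the string.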

Citation

If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).

Acknowledgments

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410

The authors gratefully acknowledge the GWK support for funding this project by providing computing time through the Center for Information Services and HPC (ZIH) at TU Dresden.

The authors are grateful to the Center for Information Services and High Performance Computing [Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)] at TU Dresden for providing its facilities for high throughput calculations.
