The phonemizer allows simple phonemization of words and texts in many languages.
It provides both the `phonemize` command-line tool and the Python function `phonemizer.phonemize`. See the package's documentation.
It is based on four backends: espeak, espeak-mbrola, festival and segments. The backends have different properties and capabilities, summarized in the table below. The choice of backend is left to the user.
espeak-ng is text-to-speech software that supports many languages and IPA (International Phonetic Alphabet) output.
espeak-ng-mbrola uses the SAMPA phonetic alphabet instead of IPA but does not preserve word boundaries.
festival is another text-to-speech engine. Its phonemizer backend currently supports only American English. It uses a custom phoneset, but it allows tokenization at the syllable level.
segments is a Unicode tokenizer that builds a phonemization from a grapheme-to-phoneme mapping provided as a file by the user.
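
To make the idea behind a grapheme-to-phoneme mapping concrete, here is a minimal pure-Python sketch. It is not the segments backend itself: the mapping format (one whitespace-separated `grapheme phoneme` pair per line, `#` for comments), the greedy longest-match strategy, and the names `parse_mapping`, `phonemize_word` and `phonemize_text` are all assumptions made for illustration.

```python
def parse_mapping(text):
    """Parse 'grapheme phoneme' pairs, one per line (assumed format)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        grapheme, phoneme = line.split(maxsplit=1)
        mapping[grapheme] = phoneme
    return mapping


def phonemize_word(word, mapping):
    """Split a word into phonemes by greedy longest-match lookup."""
    longest = max(map(len, mapping), default=0)
    phones = []
    i = 0
    while i < len(word):
        # Try the longest candidate grapheme first, then shorter ones.
        for size in range(min(longest, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in mapping:
                phones.append(mapping[chunk])
                i += size
                break
        else:
            raise ValueError(f"no mapping for {word[i]!r} in {word!r}")
    return phones


def phonemize_text(text, mapping, phone_sep=" ", word_sep=" | "):
    """Phonemize whitespace-separated words and join them with separators."""
    return word_sep.join(
        phone_sep.join(phonemize_word(word, mapping)) for word in text.split()
    )


# A toy mapping, invented for this example only.
TOY_MAPPING = """
# grapheme phoneme
a a
b b
ch tʃ
c k
h h
i i
t t
"""

mapping = parse_mapping(TOY_MAPPING)
print(phonemize_text("chat bic", mapping))
```

Trying the longest grapheme first lets a digraph such as `ch` take precedence over its single-letter parts, which is why the toy mapping can contain `ch`, `c` and `h` side by side.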
| | espeak | espeak-mbrola | festival | segments |
|---|---|---|---|---|
| phone set | IPA | SAMPA | custom | user defined |
| supported languages | 100+ | 35 | US English | user defined |
| processing speed | fast | slow | very slow | fast |
| phone tokens | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| syllable tokens | :x: | :x: | :heavy_check_mark: | :x: |
| word tokens | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: |
| punctuation preservation | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: |
| stressed phones | :heavy_check_mark: | :x: | :x: | :x: |
| tie | :heavy_check_mark: | :x: | :x: | :x: |
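
Since the choice of backend depends on which capabilities you need, the table above can be encoded as data and queried. The following is a small sketch, not part of the phonemizer API: the `BackendFeatures` class and the `backends_with` helper are hypothetical names, and the values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendFeatures:
    """One column of the capability table (hypothetical helper type)."""
    phone_set: str
    languages: str
    speed: str
    syllable_tokens: bool
    word_tokens: bool
    punctuation: bool
    stress: bool
    tie: bool


# Values transcribed from the table; every backend supports phone tokens.
BACKENDS = {
    "espeak": BackendFeatures(
        "IPA", "100+", "fast", False, True, True, True, True),
    "espeak-mbrola": BackendFeatures(
        "SAMPA", "35", "slow", False, False, False, False, False),
    "festival": BackendFeatures(
        "custom", "US English", "very slow", True, True, True, False, False),
    "segments": BackendFeatures(
        "user defined", "user defined", "fast", False, True, True, False, False),
}


def backends_with(**required):
    """Return the backends whose features match every keyword given."""
    return [
        name
        for name, features in BACKENDS.items()
        if all(getattr(features, key) == value for key, value in required.items())
    ]


print(backends_with(word_tokens=True, punctuation=True))
print(backends_with(syllable_tokens=True))
```

For example, requiring syllable tokens leaves only festival, while requiring stressed phones leaves only espeak.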
To reference the phonemizer in your own work, please cite the following JOSS paper.
@article{Bernard2021,
doi = {10.21105/joss.03958},
url = {https://doi.org/10.21105/joss.03958},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {68},
pages = {3958},
author = {Mathieu Bernard and Hadrien Titeux},
title = {Phonemizer: Text to Phones Transcription for Multiple Languages in Python},
journal = {Journal of Open Source Software}
}
Copyright 2015-2021 Mathieu Bernard
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
FAQs
Simple text to phones converter for multiple languages
We found that phonemizer demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 2 open source maintainers collaborating on the project.