
pronunciation-dictionary-utils
CLI and library to modify pronunciation dictionaries (any language).
export-vocabulary: export vocabulary from dictionaries
export-phonemes: export phoneme set from dictionaries
merge: merge dictionaries together
extract: extract subset of dictionary vocabulary
map-symbols-in-pronunciations: map phonemes/symbols in pronunciations to another phoneme/symbol, e.g., mapping ARPAbet to IPA
map-symbols-in-pronunciations-json: map phonemes/symbols in pronunciations to phoneme/symbol specified in file
remove-symbols-from-vocabulary: remove phonemes/symbols from vocabulary
remove-symbols-from-pronunciations: remove phonemes/symbols from pronunciations
remove-symbols-from-words: remove characters/symbols from words
change-formatting: change formatting of dictionaries
select-single-pronunciation: select single pronunciation
change-word-casing: transform all words to upper- or lower-case
sort-words: sort dictionary after words
sort-pronunciations: sort dictionary pronunciations
normalize-weights: normalize pronunciation weights for each word

pip install pronunciation-dictionary-utils --user
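As an illustration of the normalize-weights entry in the command list, the following sketch shows one plausible reading of "normalize pronunciation weights for each word": scaling the weights of a word's pronunciations so that they sum to 1. The in-memory layout (word mapped to pronunciations mapped to weights) and the function name `normalize_weights` are assumptions made for this example; they are not taken from the library's API.

```python
from typing import Dict, Tuple

# Hypothetical in-memory layout (an assumption, not the library's API):
# word -> {pronunciation (tuple of symbols) -> weight}
Pronunciations = Dict[Tuple[str, ...], float]


def normalize_weights(dictionary: Dict[str, Pronunciations]) -> Dict[str, Pronunciations]:
    """Return a copy in which the weights of each word sum to 1."""
    result = {}
    for word, pronunciations in dictionary.items():
        total = sum(pronunciations.values())
        if total == 0:
            # Nothing to scale by; keep the entry unchanged.
            result[word] = dict(pronunciations)
            continue
        result[word] = {p: w / total for p, w in pronunciations.items()}
    return result


# Illustrative entry with two weighted pronunciations.
example = {
    "tomato": {
        ("T", "AH0", "M", "EY1", "T", "OW2"): 3.0,
        ("T", "AH0", "M", "AA1", "T", "OW2"): 1.0,
    },
}
normalized = normalize_weights(example)
print(normalized["tomato"])
```

With the weights 3.0 and 1.0 above, the two pronunciations end up with 0.75 and 0.25. Whether the CLI handles zero or missing weights the same way is not stated in this README.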
usage: dict-cli [-h] [-v]
                {export-vocabulary,export-phonemes,merge,extract,map-symbols-in-pronunciations,map-symbols-in-pronunciations-json,remove-symbols-from-vocabulary,remove-symbols-from-pronunciations,remove-symbols-from-words,change-formatting,select-single-pronunciation,change-word-casing,sort-words,sort-pronunciations,normalize-weights}
                ...

This program provides methods to modify pronunciation dictionaries.

positional arguments:
  {export-vocabulary,export-phonemes,merge,extract,map-symbols-in-pronunciations,map-symbols-in-pronunciations-json,remove-symbols-from-vocabulary,remove-symbols-from-pronunciations,remove-symbols-from-words,change-formatting,select-single-pronunciation,change-word-casing,sort-words,sort-pronunciations,normalize-weights}
                                        description
    export-vocabulary                   export vocabulary from dictionaries
    export-phonemes                     export phoneme set from dictionaries
    merge                               merge dictionaries together
    extract                             extract subset of dictionary vocabulary
    map-symbols-in-pronunciations       map phonemes/symbols in pronunciations to another phoneme/symbol, e.g., mapping ARPAbet to IPA
    map-symbols-in-pronunciations-json  map phonemes/symbols in pronunciations to phoneme/symbol specified in file
    remove-symbols-from-vocabulary      remove phonemes/symbols from vocabulary
    remove-symbols-from-pronunciations  remove phonemes/symbols from pronunciations
    remove-symbols-from-words           remove characters/symbols from words
    change-formatting                   change formatting of dictionaries
    select-single-pronunciation         select single pronunciation
    change-word-casing                  transform all words to upper- or lower-case
    sort-words                          sort dictionary after words
    sort-pronunciations                 sort dictionary pronunciations
    normalize-weights                   normalize pronunciation weights for each word

optional arguments:
  -h, --help                            show this help message and exit
  -v, --version                         show program's version number and exit
# Download CMU dictionary
wget https://raw.githubusercontent.com/cmusphinx/cmudict/master/cmudict.dict \
  -O "/tmp/example.dict"

# Change formatting to remove numbers from words, comments and save as UTF-8
dict-cli change-formatting \
  "/tmp/example.dict" \
  --deserialization-encoding "ISO-8859-1" \
  --consider-numbers \
  --consider-pronunciation-comments \
  --serialization-encoding "UTF-8"

# Export phoneme set
dict-cli export-phonemes \
  "/tmp/example.dict" \
  "/tmp/example-phoneme-set.txt"

# Export vocabulary
dict-cli export-vocabulary \
  "/tmp/example.dict" \
  "/tmp/example-vocabulary.txt"

# Keep first pronunciation for each word and discard the rest
dict-cli select-single-pronunciation \
  "/tmp/example.dict" \
  --mode "first"

# Replace all "ER0" phonemes with "ER"
dict-cli map-symbols-in-pronunciations \
  "/tmp/example.dict" \
  "ER0" "ER"
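To make the effect of the commands above more concrete, here is a minimal Python sketch that imitates three of the steps on a few in-memory lines: dropping number suffixes and trailing comments, keeping the first pronunciation per word, and replacing "ER0" with "ER". It does not call the library or the CLI. The helper names (`parse`, `select_first`, `map_symbol`), the sample lines, and the assumption that a comment starts at the first "#" are all illustrative and may differ from what dict-cli actually does.

```python
import re
from collections import OrderedDict

# Illustrative lines in the style of cmudict.dict (not fetched from the real file).
LINES = [
    "either IY1 DH ER0",
    "either(2) AY1 DH ER0",
    "butter B AH1 T ER0 # illustrative comment",
]

# "word(2)" marks an alternative pronunciation; the suffix is not part of the word.
NUMBER_SUFFIX = re.compile(r"\(\d+\)$")


def parse(lines):
    """Build word -> list of pronunciations, dropping number suffixes and comments."""
    entries = OrderedDict()
    for line in lines:
        content = line.split("#", 1)[0].strip()  # discard trailing comment
        if not content:
            continue
        word, *symbols = content.split()
        word = NUMBER_SUFFIX.sub("", word)
        entries.setdefault(word, []).append(tuple(symbols))
    return entries


def select_first(entries):
    """Keep only the first pronunciation of each word."""
    return OrderedDict((word, prons[:1]) for word, prons in entries.items())


def map_symbol(entries, old, new):
    """Replace every occurrence of symbol `old` with `new`."""
    return OrderedDict(
        (word, [tuple(new if s == old else s for s in pron) for pron in prons])
        for word, prons in entries.items()
    )


parsed = parse(LINES)
result = map_symbol(select_first(parsed), "ER0", "ER")
for word, prons in result.items():
    for pron in prons:
        print(word, " ".join(pron))
```

For the three sample lines this prints one pronunciation each for "either" and "butter", with the final symbol changed from "ER0" to "ER". The real commands modify the dictionary file in place, as the shell example suggests, whereas this sketch only works on a list of strings.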
# update
sudo apt update
# install Python 3.8-3.12 so that the tests can be run
sudo apt install python3-pip \
  python3.8 python3.8-dev python3.8-distutils python3.8-venv \
  python3.9 python3.9-dev python3.9-distutils python3.9-venv \
  python3.10 python3.10-dev python3.10-distutils python3.10-venv \
  python3.11 python3.11-dev python3.11-distutils python3.11-venv \
  python3.12 python3.12-dev python3.12-distutils python3.12-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/pronunciation-dictionary-utils.git
cd pronunciation-dictionary-utils
# create virtual environment
python3.8 -m pipenv install --dev
# first set up the tool as described in "Development setup"
# then navigate into the directory of the repo (if not already done)
cd pronunciation-dictionary-utils
# activate environment
python3.8 -m pipenv shell
# run tests
tox
Final lines of test result output:
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
py312: commands succeeded
congratulations :)
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).
Taubert, S., and Przybysz, N. (2024). pronunciation-dictionary-utils (Version 0.0.5) [Computer software]. https://doi.org/10.5281/zenodo.10560153