speech-dataset-parser
Library to parse speech datasets stored in a generic format based on TextGrids. A tool (CLI) for converting common datasets like LJ Speech into a generic format is included.
Speech datasets consist of pairs of .TextGrid and .wav files. The TextGrids need to contain a tier in which each symbol occupies its own interval, e.g., T|h|i|s| |i|s| |a| |t|e|x|t|.
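Before parsing, it can help to confirm that every recording really has both files. The following is a minimal sketch of such a check, assuming the two files of a pair share the same path and base name and differ only in extension; `find_recording_pairs` is a hypothetical helper and not part of speech-dataset-parser.

```python
from pathlib import Path
from typing import List, Tuple


def find_recording_pairs(folder: Path) -> Tuple[List[Tuple[Path, Path]], List[Path]]:
    """Return (pairs, unpaired) for all recordings below `folder`.

    Hypothetical helper: assumes a pair shares its path and base name,
    e.g. wavs/LJ001-0001.TextGrid and wavs/LJ001-0001.wav.
    """
    # Index both file types by their path without the extension.
    grids = {p.with_suffix(""): p for p in folder.rglob("*.TextGrid")}
    wavs = {p.with_suffix(""): p for p in folder.rglob("*.wav")}

    # A recording is complete when both indexes contain the same key.
    pairs = [(grids[key], wavs[key]) for key in sorted(grids.keys() & wavs.keys())]

    # Everything else is missing its counterpart.
    unpaired = sorted(grids[key] for key in grids.keys() - wavs.keys())
    unpaired += sorted(wavs[key] for key in wavs.keys() - grids.keys())
    return pairs, unpaired
```

Calling `find_recording_pairs(Path("/tmp/ljs"))` would return the complete pairs first and any orphaned files second, so a non-empty second list points at recordings that still need a TextGrid or an audio file.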
The format is as follows: {Dataset name}/{Speaker name};{Speaker gender};{Speaker language}[;{Speaker accent}]/[Subfolder(s)]/{Recordings as .wav- and .TextGrid-pairs}
Example: LJ Speech/Linda Johnson;2;eng;North American/wavs/...
Speaker names can be any string (excluding ; symbols).
Genders are defined via their ISO/IEC 5218 Code.
Languages are defined via their ISO 639-2 Code (bibliographic).
Accents are optional and can be any string (excluding ; symbols).
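The speaker folder name described above can be split back into its fields with a few lines of Python. This is only a sketch of the naming rules as stated in this README, not the library's own parsing code; `SpeakerInfo` and `parse_speaker_folder` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# ISO/IEC 5218 codes: 0 = not known, 1 = male, 2 = female, 9 = not applicable
ISO_5218_CODES = {0, 1, 2, 9}


@dataclass(frozen=True)
class SpeakerInfo:
    """Hypothetical container for the fields encoded in a speaker folder name."""
    name: str
    gender: int
    language: str
    accent: Optional[str] = None


def parse_speaker_folder(folder_name: str) -> SpeakerInfo:
    """Split '{name};{gender};{language}[;{accent}]' into its parts."""
    parts = folder_name.split(";")
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 ';'-separated fields, got {len(parts)}")

    name, gender_str, language = parts[:3]
    gender = int(gender_str)  # raises ValueError if the field is not a number
    if gender not in ISO_5218_CODES:
        raise ValueError(f"not an ISO/IEC 5218 code: {gender}")
    if len(language) != 3:
        # ISO 639-2 codes are three letters long
        raise ValueError(f"not a three-letter language code: {language!r}")

    # The accent is optional, so it is only present as a fourth field.
    accent = parts[3] if len(parts) == 4 else None
    return SpeakerInfo(name, gender, language, accent)


print(parse_speaker_folder("Linda Johnson;2;eng;North American"))
```

Because names and accents may not contain ; symbols, a plain split on that character is enough here and no escaping is needed.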
pip install speech-dataset-parser --user
from speech_dataset_parser import parse_dataset
entries = list(parse_dataset({folder}, {grid-tier-name}))
The resulting entries list contains dataclass-instances with these properties:
symbols: Tuple[str, ...]: contains the mark of each interval
intervals: Tuple[float, ...]: contains the max-time of each interval
symbols_language: str: contains the language
speaker_name: str: contains the name of the speaker
speaker_accent: str: contains the accent of the speaker
speaker_gender: int: contains the gender of the speaker
audio_file_abs: Path: contains the absolute path to the speech audio
min_time: float: the min-time of the grid
max_time: float: the max-time of the grid (equal to intervals[-1])
usage: dataset-converter-cli [-h] [-v] {convert-ljs,convert-l2arctic,convert-thchs,convert-thchs-cslt,restore-structure} ...
This program converts common speech datasets into a generic representation.
positional arguments:
{convert-ljs,convert-l2arctic,convert-thchs,convert-thchs-cslt,restore-structure}
description
convert-ljs convert LJ Speech dataset to a generic dataset
convert-l2arctic convert L2-ARCTIC dataset to a generic dataset
convert-thchs convert THCHS-30 (OpenSLR Version) dataset to a generic dataset
convert-thchs-cslt convert THCHS-30 (CSLT Version) dataset to a generic dataset
restore-structure restore original dataset structure of generic datasets
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
# Convert LJ Speech dataset with symbolic links to the audio files
dataset-converter-cli convert-ljs \
"/data/datasets/LJSpeech-1.1" \
"/tmp/ljs" \
--tier "Symbols" \
--symlink
tqdm
TextGrid>=1.5
ordered_set>=4.1.0
importlib_resources; python_version < '3.8'
If you notice an error, please don't hesitate to open an issue.
# update
sudo apt update
# install Python 3.7, 3.8, 3.9, 3.10 & 3.11 for ensuring that tests can be run
sudo apt install python3-pip \
python3.7 python3.7-dev python3.7-distutils python3.7-venv \
python3.8 python3.8-dev python3.8-distutils python3.8-venv \
python3.9 python3.9-dev python3.9-distutils python3.9-venv \
python3.10 python3.10-dev python3.10-distutils python3.10-venv \
python3.11 python3.11-dev python3.11-distutils python3.11-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/speech-dataset-parser.git
cd speech-dataset-parser
# create virtual environment
python3.8 -m pipenv install --dev
# first install the tool like in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd speech-dataset-parser
# activate environment
python3.8 -m pipenv shell
# run tests
tox
Final lines of test result output:
py37: commands succeeded
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
congratulations :)
MIT License
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
If you want to cite this repo, you can use this BibTeX-entry generated by GitHub (see About => Cite this repository).