
Evaluating ASR (automatic speech recognition) hypotheses, i.e. computing word error rate.
Python module for evaluating ASR hypotheses (i.e. word error rate and word recognition rate).
This module depends on the editdistance project, for computing edit distances between arbitrary sequences.
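As a rough illustration of how an edit distance over word sequences turns into a word error rate, here is a minimal sketch. It is not this module's implementation: the function names `edit_distance` and `word_error_rate` are hypothetical, and a small pure-Python Levenshtein routine stands in for the editdistance dependency so the example runs without extra installs.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here: lists of words)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(ref_line, hyp_line):
    """Word-level edit distance divided by the number of reference words."""
    ref = ref_line.split()
    hyp = hyp_line.split()
    if not ref:
        raise ValueError("reference is empty; WER is undefined")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the distance is computed over lists of words rather than characters, each substitution, insertion, or deletion counts as one word error.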
The output format is loosely modeled on the align.c program commonly used within the Sphinx ASR community. The program may run a bit faster if neither instances nor confusions are printed.
Please let me know if you have any comments, questions, or problems.
The program outputs three standard measurements: word error rate (WER), word recognition rate (WRR), and sentence error rate (SER).
The easiest way to install is using pip:
pip install asr-evaluation
Alternatively you can clone this git repo and install using distutils:
git clone git@github.com:belambert/asr-evaluation.git
cd asr-evaluation
python setup.py install
To uninstall with pip:
pip uninstall asr-evaluation
For command line usage, see:
wer --help
It should display something like this:
    usage: wer [-h] [-i | -r] [--head-ids] [-id] [-c] [-p] [-m count] [-a] [-e]
               ref hyp

    Evaluate an ASR transcript against a reference transcript.

    positional arguments:
      ref                   Reference transcript filename
      hyp                   ASR hypothesis filename

    optional arguments:
      -h, --help            show this help message and exit
      -i, --print-instances
                            Print all individual sentences and their errors.
      -r, --print-errors    Print all individual sentences that contain errors.
      --head-ids            Hypothesis and reference files have ids in the first
                            token? (Kaldi format)
      -id, --tail-ids, --has-ids
                            Hypothesis and reference files have ids in the last
                            token? (Sphinx format)
      -c, --confusions      Print tables of which words were confused.
      -p, --print-wer-vs-length
                            Print table of average WER grouped by reference
                            sentence length.
      -m count, --min-word-count count
                            Minimum word count to show a word in confusions.
      -a, --case-insensitive
                            Down-case the text before running the evaluation.
      -e, --remove-empty-refs
                            Skip over any examples where the reference is empty.
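To make the id, case, and empty-reference options more concrete, the following sketch shows one way such preprocessing could work. It is an illustration under assumptions, not the package's code: the helpers `split_id` and `prepare_pairs` are hypothetical, and it assumes one utterance per line with reference and hypothesis paired by line order.

```python
def split_id(line, head_ids=False, tail_ids=False):
    """Separate an utterance id from the words of one transcript line."""
    tokens = line.split()
    utt_id = None
    if head_ids and tokens:
        # Id is the first token (the layout --head-ids describes).
        utt_id, tokens = tokens[0], tokens[1:]
    elif tail_ids and tokens:
        # Id is the last token (the layout --tail-ids describes).
        utt_id, tokens = tokens[-1], tokens[:-1]
    return utt_id, tokens


def prepare_pairs(ref_lines, hyp_lines, head_ids=False, tail_ids=False,
                  case_insensitive=False, remove_empty_refs=False):
    """Return (ref_words, hyp_words) pairs ready for scoring."""
    pairs = []
    for ref_line, hyp_line in zip(ref_lines, hyp_lines):
        if case_insensitive:
            # Down-case before comparing, as -a describes.
            ref_line, hyp_line = ref_line.lower(), hyp_line.lower()
        _, ref = split_id(ref_line, head_ids, tail_ids)
        _, hyp = split_id(hyp_line, head_ids, tail_ids)
        if remove_empty_refs and not ref:
            # Skip examples whose reference has no words, as -e describes.
            continue
        pairs.append((ref, hyp))
    return pairs


if __name__ == "__main__":
    refs = ["utt1 Hello World", "utt2"]
    hyps = ["utt1 hello word", "utt2 noise"]
    print(prepare_pairs(refs, hyps, head_ids=True,
                        case_insensitive=True, remove_empty_refs=True))
```

In this sketch the ids are dropped after being split off; a fuller version might also check that the reference and hypothesis ids on each line agree.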
For contributions, it's best to use GitHub issues and pull requests. Proper testing and documentation are suggested.
Conduct is expected to be reasonable, especially as specified by the Contributor Covenant.