deepspeech-gpu
==============
.. image:: https://readthedocs.org/projects/deepspeech/badge/?version=latest
   :target: http://deepspeech.readthedocs.io/?badge=latest
   :alt: Documentation

.. image:: https://community-tc.services.mozilla.com/api/github/v1/repository/mozilla/DeepSpeech/master/badge.svg
   :target: https://community-tc.services.mozilla.com/api/github/v1/repository/mozilla/DeepSpeech/master/latest
   :alt: Task Status
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
To install and use ``deepspeech``, all you have to do is:
.. code-block:: bash

   # Create and activate a virtualenv
   virtualenv -p python3 $HOME/tmp/deepspeech-venv/
   source $HOME/tmp/deepspeech-venv/bin/activate

   # Install DeepSpeech
   pip3 install deepspeech

   # Download the pre-trained English model files
   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz
   tar xvf deepspeech-0.5.1-models.tar.gz

   # Download example audio files
   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/audio-0.5.1.tar.gz
   tar xvf audio-0.5.1.tar.gz

   # Transcribe an audio file
   deepspeech --model deepspeech-0.5.1-models/output_graph.pbmm --lm deepspeech-0.5.1-models/lm.binary --trie deepspeech-0.5.1-models/trie --audio audio/2830-3980-0043.wav
A pre-trained English model is available for use and can be downloaded using `the instructions below <USING.rst#using-a-pre-trained-model>`_. Currently, only 16-bit, 16 kHz, mono-channel WAVE audio files are supported in the Python client. A package with some example audio files is available for download in our `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
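For a quick check from Python, the sketch below loads the 0.5.1 model files and transcribes one of the example WAVE files. It assumes the 0.5.x Python API (``Model``, ``enableDecoderWithLM``, ``stt``) and the decoder constants used by the bundled example client; later releases changed these signatures, so treat it as a sketch rather than a drop-in script.

.. code-block:: python

   import wave

   import numpy as np
   from deepspeech import Model

   # Decoder constants as used by the 0.5.x example client (assumed defaults).
   BEAM_WIDTH = 500
   LM_ALPHA = 0.75
   LM_BETA = 1.85
   N_FEATURES = 26
   N_CONTEXT = 9

   # Load the acoustic model and enable the language model decoder.
   ds = Model('deepspeech-0.5.1-models/output_graph.pbmm', N_FEATURES, N_CONTEXT,
              'deepspeech-0.5.1-models/alphabet.txt', BEAM_WIDTH)
   ds.enableDecoderWithLM('deepspeech-0.5.1-models/alphabet.txt',
                          'deepspeech-0.5.1-models/lm.binary',
                          'deepspeech-0.5.1-models/trie',
                          LM_ALPHA, LM_BETA)

   # Read a 16-bit, 16 kHz, mono WAVE file into a 16-bit integer buffer.
   with wave.open('audio/2830-3980-0043.wav', 'rb') as fin:
       sample_rate = fin.getframerate()
       audio = np.frombuffer(fin.readframes(fin.getnframes()), dtype=np.int16)

   print(ds.stt(audio, sample_rate))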
Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
.. code-block:: bash

   # Create and activate a virtualenv
   virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
   source $HOME/tmp/deepspeech-gpu-venv/bin/activate

   # Install the GPU-enabled DeepSpeech package
   pip3 install deepspeech-gpu

   # Transcribe an audio file
   deepspeech --model deepspeech-0.5.1-models/output_graph.pbmm --lm deepspeech-0.5.1-models/lm.binary --trie deepspeech-0.5.1-models/trie --audio audio/2830-3980-0043.wav
Please ensure you have the required `CUDA dependencies <USING.rst#cuda-dependency>`_.
See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``, please check the `required runtime dependencies <native_client/README.rst#required-dependencies>`_.)
Table of Contents
-----------------
* `Using a Pre-trained Model <USING.rst#using-a-pre-trained-model>`_

  * `CUDA dependency <USING.rst#cuda-dependency>`_
  * `Getting the pre-trained model <USING.rst#getting-the-pre-trained-model>`_
  * `Model compatibility <USING.rst#model-compatibility>`_
  * `Using the Python package <USING.rst#using-the-python-package>`_
  * `Using the Node.JS package <USING.rst#using-the-nodejs-package>`_
  * `Using the Command Line client <USING.rst#using-the-command-line-client>`_
  * `Installing bindings from source <USING.rst#installing-bindings-from-source>`_
  * `Third party bindings <USING.rst#third-party-bindings>`_

* `Trying out DeepSpeech with examples <examples/EXAMPLES.rst>`_

  * `Microphone VAD streaming <examples/mic_vad_streaming/README.rst>`_
  * `FFMPEG VAD streaming <examples/ffmpeg_vad_streaming/README.rst>`_
  * `Net framework <examples/net_framework/README.rst>`_
  * `Nodejs wav <examples/nodejs_wav/README.rst>`_
  * `VAD transcriber <examples/vad_transcriber/README.rst>`_

* `Training your own Model <TRAINING.rst#training-your-own-model>`_

  * `Prerequisites for training a model <TRAINING.rst#prerequisites-for-training-a-model>`_
  * `Getting the training code <TRAINING.rst#getting-the-training-code>`_
  * `Installing Python dependencies <TRAINING.rst#installing-python-dependencies>`_
  * `Recommendations <TRAINING.rst#recommendations>`_
  * `Common Voice training data <TRAINING.rst#common-voice-training-data>`_
  * `Training a model <TRAINING.rst#training-a-model>`_
  * `Checkpointing <TRAINING.rst#checkpointing>`_
  * `Exporting a model for inference <TRAINING.rst#exporting-a-model-for-inference>`_
  * `Exporting a model for TFLite <TRAINING.rst#exporting-a-model-for-tflite>`_
  * `Making a mmap-able model for inference <TRAINING.rst#making-a-mmap-able-model-for-inference>`_
  * `Continuing training from a release model <TRAINING.rst#continuing-training-from-a-release-model>`_
  * `Training with Augmentation <TRAINING.rst#training-with-augmentation>`_

* `Contribution guidelines <CONTRIBUTING.rst>`_
* `Contact/Getting Help <SUPPORT.rst>`_
* FAQs