Cheetah Binding for Python
Cheetah Speech-to-Text Engine
Made in Vancouver, Canada by Picovoice
Cheetah is an on-device streaming speech-to-text engine. Cheetah is:
- Private; All voice processing runs locally.
- Accurate
- Compact and Computationally-Efficient
- Cross-Platform:
- Linux (x86_64), macOS (x86_64, arm64), and Windows (x86_64)
- Android and iOS
- Chrome, Safari, Firefox, and Edge
- Raspberry Pi (3, 4, 5)
Compatibility
- Python 3.8+
- Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), and Raspberry Pi (3, 4, 5).
Installation
pip3 install pvcheetah
AccessKey
Cheetah requires a valid Picovoice AccessKey
at initialization. AccessKey
acts as your credentials when using Cheetah SDKs.
You can get your AccessKey
for free. Make sure to keep your AccessKey
secret.
Signup or Login to Picovoice Console to get your AccessKey
.
Usage
Create an instance of the engine and transcribe audio:
import pvcheetah
handle = pvcheetah.create(access_key='${ACCESS_KEY}')
def get_next_audio_frame():
pass
while True:
partial_transcript, is_endpoint = handle.process(get_next_audio_frame())
if is_endpoint:
final_transcript = handle.flush()
Replace ${ACCESS_KEY}
with yours obtained from Picovoice Console. When done be sure
to explicitly release the resources using handle.delete()
.
Language Model
The Cheetah Python SDK comes preloaded with a default English language model (.pv
file).
Default models for other supported languages can be found in lib/common.
Create custom language models using the Picovoice Console. Here you can train
language models with custom vocabulary and boost words in the existing vocabulary.
Pass in the .pv
file via the model_path
argument:
cheetah = pvcheetah.create(
access_key='${ACCESS_KEY}',
model_path='${MODEL_FILE_PATH}')
Demos
pvcheetahdemo provides command-line utilities for processing audio using
Cheetah.