framework for synchronous batch speech-to-text transcription using backends like AWS, Watson, etc.
Implementation-agnostic framework for synchronous batch speech-to-text transcription with backend services such as AWS, Watson, etc.
This module does NOT include a full implementation or an integration with any transcription service. Instead, you include a specific implementation in your project. For example, for AWS Transcribe, use [py-transcribe-aws](https://github.com/ICTLearningSciences/py-transcribe-aws).
pip install py-transcribe
You first need to install a concrete implementation of py-transcribe. If you are using AWS, you can install py-transcribe-aws like this:
pip install py-transcribe-aws
Once the implementation is installed, you can configure it in one of two ways:
Set the environment variable TRANSCRIBE_MODULE_PATH, e.g.
export TRANSCRIBE_MODULE_PATH=transcribe_aws
or pass the module path at service-creation time, e.g.
from transcribe import init_transcription_service

service = init_transcription_service(
    module_path="transcribe_aws"
)
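Both routes above come down to resolving a module path and importing it. A minimal sketch of that lookup follows. It assumes an explicit argument takes precedence over the environment variable; that precedence and the helper names `resolve_module_path` and `load_implementation` are illustrative assumptions, not py-transcribe's actual code.

```python
import importlib
import os
from types import ModuleType
from typing import Optional


def resolve_module_path(module_path: Optional[str] = None) -> str:
    """Hypothetical helper: choose the implementation module path.

    Assumption: an explicit argument wins over TRANSCRIBE_MODULE_PATH.
    """
    resolved = module_path or os.environ.get("TRANSCRIBE_MODULE_PATH", "")
    if not resolved:
        raise ValueError(
            "no implementation configured: pass module_path "
            "or set TRANSCRIBE_MODULE_PATH"
        )
    return resolved


def load_implementation(module_path: Optional[str] = None) -> ModuleType:
    """Import whichever module the resolved path names."""
    return importlib.import_module(resolve_module_path(module_path))
```

Failing early with a clear error when neither source is set tends to be easier to debug than an import error on an empty string.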
Once you're set up, basic usage looks like this:
from transcribe import (
    init_transcription_service,
    TranscribeJobRequest,
    TranscribeJobStatus
)

# Uses the implementation named by TRANSCRIBE_MODULE_PATH
service = init_transcription_service()
# Blocks until every job in the batch has finished
result = service.transcribe([
    TranscribeJobRequest(
        sourceFile="/some/path/j1.wav"
    ),
    TranscribeJobRequest(
        sourceFile="/some/other/path/j2.wav"
    )
])
for j in result.jobs():
    if j.status == TranscribeJobStatus.SUCCEEDED:
        print(j.transcript)
    else:
        print(j.error)
The main transcribe method is synchronous to hide the async/polling-based complexity of most transcription services. For any non-trivial batch of transcriptions, though, you will probably want periodic updates, for example to save completed transcriptions as they finish. You can get them by passing an on_update callback as follows:
from transcribe import (
    init_transcription_service,
    TranscribeJobRequest,
    TranscribeJobStatus,
    TranscribeJobsUpdate
)

service = init_transcription_service()

# Called with the current state of the batch while transcribe() is running
def _on_update(u: TranscribeJobsUpdate) -> None:
    for j in u.jobs():
        if j.status == TranscribeJobStatus.SUCCEEDED:
            print(f"save result: {j.transcript}")
        else:
            print(j.error)

result = service.transcribe(
    [
        TranscribeJobRequest(
            sourceFile="/some/path/j1.wav"
        ),
        TranscribeJobRequest(
            sourceFile="/some/other/path/j2.wav"
        )
    ],
    on_update=_on_update
)
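Because the callback may fire more than once, a handler that saves results probably should not write the same transcript twice. The sketch below shows one way to do that with a closure that remembers what it has already saved. `FakeJob` and `FakeUpdate` are hypothetical stand-ins that mirror only the attributes used above (`status`, `transcript`, `error`, `jobs()`). The `job_id` field and the plain-string status are invented for illustration and may not match the real classes.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Set


@dataclass
class FakeJob:
    """Hypothetical stand-in for a job entry; not the library's class."""
    job_id: str  # assumed identifier, invented for this sketch
    status: str  # assumed to be comparable to a "SUCCEEDED" marker
    transcript: str = ""
    error: str = ""


@dataclass
class FakeUpdate:
    """Hypothetical stand-in for TranscribeJobsUpdate."""
    _jobs: List[FakeJob]

    def jobs(self) -> List[FakeJob]:
        return self._jobs


def make_saving_callback(
    out_dir: Path, succeeded: str = "SUCCEEDED"
) -> Callable[[FakeUpdate], None]:
    """Build an on_update-style callback that saves each transcript once."""
    saved: Set[str] = set()  # ids of jobs already written to disk

    def _on_update(u: FakeUpdate) -> None:
        for j in u.jobs():
            if j.status == succeeded and j.job_id not in saved:
                path = out_dir / f"{j.job_id}.txt"
                path.write_text(j.transcript, encoding="utf-8")
                saved.add(j.job_id)

    return _on_update
```

With the real library, the same closure pattern would presumably be passed as `on_update=make_saving_callback(some_dir)`, adapted to whatever identifier the job objects actually expose.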
Most implementations will also require other configuration, which you can either set in your environment or pass to init_transcription_service as config={}. See your implementation docs for details.
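If you want a single place that decides between environment values and explicit ones, a small helper can assemble the dict first. This is only a sketch: `build_config` is a hypothetical helper, the precedence (explicit overrides win over the environment) is an assumption, and the real key names depend entirely on your implementation.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def build_config(
    keys: Iterable[str],
    overrides: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: gather settings for a config dict.

    Assumption: an explicit override wins over the environment, and
    keys found in neither place are left out.
    """
    env = os.environ if env is None else env
    overrides = overrides or {}
    config: Dict[str, str] = {}
    for key in keys:
        if key in overrides:
            config[key] = overrides[key]
        elif key in env:
            config[key] = env[key]
    return config
```

The resulting dict could then be passed as `config=` to `init_transcription_service`.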
Run tests during development with
make test-all
Once ready to release, create a release tag, currently using semver-ish numbering, e.g. 1.0.0(-alpha.1)
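The project's exact tag rules are not spelled out, so the check below is just one reading of "semver-ish": MAJOR.MINOR.PATCH with an optional dot-separated pre-release suffix such as -alpha.1. The pattern and the name `is_release_tag` are assumptions for illustration.

```python
import re

# Assumed shape: 1.0.0 or 1.0.0-alpha.1 (no leading "v", no build metadata)
TAG_PATTERN = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z]+(\.[0-9A-Za-z]+)*)?$")


def is_release_tag(tag: str) -> bool:
    """Return True if the tag matches the assumed semver-ish shape."""
    return TAG_PATTERN.match(tag) is not None
```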