AssemblyAI's Python SDK
Build with AI models that can transcribe and understand audio
With a single API call, get access to AI models built on the latest AI breakthroughs to transcribe and understand audio and speech data securely at large scale.
Overview
Documentation
Visit our AssemblyAI API Documentation to get an overview of our models!
Quick Start
Installation
pip install -U assemblyai
Examples
Before starting, you need to set the API key. If you don't have one yet, sign up for one!
import assemblyai as aai
aai.settings.api_key = f"{ASSEMBLYAI_API_KEY}"
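To avoid hard-coding the key in source files, it can be read from the environment instead. The following is a minimal sketch, assuming the key is stored in an environment variable named ASSEMBLYAI_API_KEY; the helper load_api_key and the variable name are illustrative choices, not part of the SDK.

```python
import os


def load_api_key(env_var: str = "ASSEMBLYAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper: neither the function nor the variable name
    is provided by the SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage (assumes the assemblyai package is installed):
#   import assemblyai as aai
#   aai.settings.api_key = load_api_key()
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the first API call.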
Core Examples
Transcribe a Local Audio File
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("./my-local-audio-file.wav")
print(transcript.text)
Transcribe a URL
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/audio.mp3")
print(transcript.text)
Transcribe binary data
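import assemblyai as aai
transcriber = aai.Transcriber()
# `data` holds binary audio data; it can be passed directly:
transcript = transcriber.transcribe(data)
# Alternatively, upload the data first and transcribe the returned URL:
upload_url = transcriber.upload_file(data)
transcript = transcriber.transcribe(upload_url)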
Export Subtitles of an Audio File
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/audio.mp3")
print(transcript.export_subtitles_srt())
print(transcript.export_subtitles_vtt())
List all Sentences and Paragraphs
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/audio.mp3")
sentences = transcript.get_sentences()
for sentence in sentences:
    print(sentence.text)
paragraphs = transcript.get_paragraphs()
for paragraph in paragraphs:
    print(paragraph.text)
Search for Words in a Transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/audio.mp3")
matches = transcript.word_search(["price", "product"])
for match in matches:
    print(f"Found '{match.text}' {match.count} times in the transcript")
Add Custom Spellings on a Transcript
import assemblyai as aai
config = aai.TranscriptionConfig()
config.set_custom_spelling(
{
"Kubernetes": ["k8s"],
"SQL": ["Sequel"],
}
)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/audio.mp3", config)
print(transcript.text)
Upload a file
import assemblyai as aai
transcriber = aai.Transcriber()
upload_url = transcriber.upload_file(data)
Delete a transcript
import assemblyai as aai
transcript = aai.Transcriber().transcribe(audio_url)
aai.Transcript.delete_by_id(transcript.id)
List transcripts
This returns a page of transcripts you created.
import assemblyai as aai
transcriber = aai.Transcriber()
page = transcriber.list_transcripts()
print(page.page_details)
print(page.transcripts)
You can apply filter parameters:
params = aai.ListTranscriptParameters(
limit=3,
status=aai.TranscriptStatus.completed,
)
page = transcriber.list_transcripts(params)
You can also paginate over all pages by using the helper property before_id_of_prev_url. The prev_url always points to a page with older transcripts. If you extract the before_id from the prev_url query parameters, you can paginate over all pages from newest to oldest.
transcriber = aai.Transcriber()
params = aai.ListTranscriptParameters()
page = transcriber.list_transcripts(params)
while page.page_details.before_id_of_prev_url is not None:
    params.before_id = page.page_details.before_id_of_prev_url
    page = transcriber.list_transcripts(params)
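The cursor-following pattern can be wrapped in a generator so callers simply iterate over results. The sketch below runs without network access by using a stand-in backend; Page, iter_transcript_ids, and fake_fetch_page are hypothetical names for illustration, not SDK types or functions.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class Page:
    """Stand-in for one page of results (not the SDK's own type)."""
    transcript_ids: List[str]
    before_id_of_prev_url: Optional[str]


def iter_transcript_ids(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield IDs from newest to oldest by following the before_id cursor."""
    before_id: Optional[str] = None
    while True:
        page = fetch_page(before_id)
        yield from page.transcript_ids
        before_id = page.before_id_of_prev_url
        if before_id is None:  # no older page left
            return


# Fake backend: five transcripts, newest first, served two per page.
_ALL_IDS = ["t5", "t4", "t3", "t2", "t1"]


def fake_fetch_page(before_id: Optional[str], limit: int = 2) -> Page:
    start = 0 if before_id is None else _ALL_IDS.index(before_id) + 1
    chunk = _ALL_IDS[start:start + limit]
    has_older = start + limit < len(_ALL_IDS)
    return Page(chunk, chunk[-1] if has_older else None)


collected = list(iter_transcript_ids(fake_fetch_page))
```

With the real SDK, fetch_page would presumably set params.before_id and call transcriber.list_transcripts(params), as in the loop above.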
LeMUR Examples
Use LeMUR to Summarize Multiple Transcripts
import assemblyai as aai
transcriber = aai.Transcriber()
transcript_group = transcriber.transcribe_group(
[
"https://example.org/customer1.mp3",
"https://example.org/customer2.mp3",
],
)
result = transcript_group.lemur.summarize(
context="Customers asking for cars",
answer_format="TLDR"
)
print(result.response)
Use LeMUR to Ask Questions on a Single Transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/customer.mp3")
questions = [
aai.LemurQuestion(question="What car was the customer interested in?"),
aai.LemurQuestion(question="What price range is the customer looking for?"),
]
result = transcript.lemur.question(questions)
for q in result.response:
    print(f"Question: {q.question}")
    print(f"Answer: {q.answer}")
Use LeMUR to Ask for Action Items from a Single Transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/customer.mp3")
result = transcript.lemur.action_items(
context="Customers asking for help with resolving their problem",
answer_format="Three bullet points",
)
print(result.response)
Use LeMUR to Ask Anything with a Custom Prompt
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/customer.mp3")
result = transcript.lemur.task(
"You are a helpful coach. Provide an analysis of the transcript "
"and offer areas to improve with exact quotes. Include no preamble. "
"Start with an overall summary then get into the examples with feedback.",
)
print(result.response)
Use LeMUR with Input Text
import assemblyai as aai
transcriber = aai.Transcriber()
config = aai.TranscriptionConfig(
speaker_labels=True,
)
transcript = transcriber.transcribe("https://example.org/customer.mp3", config=config)
text = ""
for utt in transcript.utterances:
    text += f"Speaker {utt.speaker}:\n{utt.text}\n"
result = aai.Lemur().task(
"You are a helpful coach. Provide an analysis of the transcript "
"and offer areas to improve with exact quotes. Include no preamble. "
"Start with an overall summary then get into the examples with feedback.",
input_text=text
)
print(result.response)
Delete data previously sent to LeMUR
import assemblyai as aai
transcriber = aai.Transcriber()
transcript_group = transcriber.transcribe_group(
[
"https://example.org/customer1.mp3",
],
)
result = transcript_group.lemur.summarize(
context="Customers providing sensitive, personally identifiable information",
answer_format="TLDR"
)
request_id = result.request_id
deletion_result = aai.Lemur.purge_request_data(request_id)
print(deletion_result)
Audio Intelligence Examples
PII Redact a Transcript
import assemblyai as aai
config = aai.TranscriptionConfig()
config.set_redact_pii(
policies=[
aai.PIIRedactionPolicy.credit_card_number,
aai.PIIRedactionPolicy.email_address,
aai.PIIRedactionPolicy.location,
aai.PIIRedactionPolicy.person_name,
aai.PIIRedactionPolicy.phone_number,
],
substitution=aai.PIISubstitutionPolicy.hash,
)
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://example.org/audio.mp3", config)
To request a copy of the original audio file with the redacted information "beeped" out, set redact_pii_audio=True in the config.
Once the Transcript object is returned, you can access the URL of the redacted audio file with get_redacted_audio_url, or save the redacted audio directly to disk with save_redacted_audio.
import assemblyai as aai
transcript = aai.Transcriber().transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(
redact_pii=True,
redact_pii_policies=[aai.PIIRedactionPolicy.person_name],
redact_pii_audio=True
)
)
redacted_audio_url = transcript.get_redacted_audio_url()
transcript.save_redacted_audio("redacted_audio.mp3")
Read more about PII redaction here.
Summarize the content of a transcript over time
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(auto_chapters=True)
)
for chapter in transcript.chapters:
    print(f"Summary: {chapter.summary}")
    print(f"Start: {chapter.start}, End: {chapter.end}")
    print(f"Headline: {chapter.headline}")
    print(f"Gist: {chapter.gist}")
Read more about auto chapters here.
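The raw start and end values are easier to read as clock times. The sketch below assumes they are integer millisecond offsets from the beginning of the audio; format_ms is a hypothetical helper, not part of the SDK.

```python
def format_ms(milliseconds: int) -> str:
    """Format a millisecond offset as HH:MM:SS.mmm."""
    seconds, ms = divmod(milliseconds, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


# Usage with a chapter object (assumes millisecond timestamps):
#   print(f"{format_ms(chapter.start)} - {format_ms(chapter.end)}")
```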
Summarize the content of a transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(summarization=True)
)
print(transcript.summary)
By default, the summarization model will be informative and the summarization type will be bullets. Read more about summarization models and types here.
To change the model and/or type, pass additional parameters to the TranscriptionConfig:
config=aai.TranscriptionConfig(
summarization=True,
summary_model=aai.SummarizationModel.catchy,
summary_type=aai.SummarizationType.headline
)
Detect Sensitive Content in a Transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(content_safety=True)
)
for result in transcript.content_safety.results:
    print(result.text)
    print(result.timestamp.start)
    print(result.timestamp.end)
    for label in result.labels:
        print(label.label)
        print(label.confidence)
        print(label.severity)
for label, confidence in transcript.content_safety.summary.items():
    print(f"{confidence * 100}% confident that the audio contains {label}")
for label, severity_confidence in transcript.content_safety.severity_score_summary.items():
    print(f"{severity_confidence.low * 100}% confident that the audio contains low-severity {label}")
    print(f"{severity_confidence.medium * 100}% confident that the audio contains mid-severity {label}")
    print(f"{severity_confidence.high * 100}% confident that the audio contains high-severity {label}")
Read more about the content safety categories.
By default, the content safety model will only include labels with a confidence greater than 0.5 (50%). To change this, pass content_safety_confidence (as an integer percentage between 25 and 100, inclusive) to the TranscriptionConfig:
config=aai.TranscriptionConfig(
content_safety=True,
content_safety_confidence=80,
)
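Because the default is quoted as a fraction (0.5) while the parameter expects an integer percentage, it is easy to pass the wrong scale. The following sketch converts and validates a fractional threshold before it is handed to the config; to_confidence_percent is a hypothetical helper, and the 25 to 100 range is taken from the text above.

```python
def to_confidence_percent(probability: float) -> int:
    """Convert a fractional confidence (e.g. 0.8) to an integer percentage.

    Raises ValueError when the result falls outside the 25-100 range
    described above.
    """
    percent = round(probability * 100)
    if not 25 <= percent <= 100:
        raise ValueError(
            f"confidence must map to 25-100 percent, got {percent}"
        )
    return percent


# Usage (assumes the assemblyai package is installed):
#   config = aai.TranscriptionConfig(
#       content_safety=True,
#       content_safety_confidence=to_confidence_percent(0.8),
#   )
```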
Analyze the Sentiment of Sentences in a Transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(sentiment_analysis=True)
)
for sentiment_result in transcript.sentiment_analysis:
    print(sentiment_result.text)
    print(sentiment_result.sentiment)
    print(sentiment_result.confidence)
    print(f"Timestamp: {sentiment_result.start} - {sentiment_result.end}")
If speaker_labels is also enabled, then each sentiment analysis result will also include a speaker field.
config = aai.TranscriptionConfig(sentiment_analysis=True, speaker_labels=True)
# ... transcribe with this config as shown above ...
for sentiment_result in transcript.sentiment_analysis:
    print(sentiment_result.speaker)
Read more about sentiment analysis here.
Identify Entities in a Transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(entity_detection=True)
)
for entity in transcript.entities:
    print(entity.text)
    print(entity.entity_type)
    print(f"Timestamp: {entity.start} - {entity.end}")
Read more about entity detection here.
Detect Topics in a Transcript (IAB Classification)
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(iab_categories=True)
)
for result in transcript.iab_categories.results:
    print(result.text)
    print(f"Timestamp: {result.timestamp.start} - {result.timestamp.end}")
    for label in result.labels:
        print(label.label)
        print(label.relevance)
for label, relevance in transcript.iab_categories.summary.items():
    print(f"Audio is {relevance * 100}% relevant to {label}")
Read more about IAB classification here.
Identify Important Words and Phrases in a Transcript
import assemblyai as aai
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(
"https://example.org/audio.mp3",
config=aai.TranscriptionConfig(auto_highlights=True)
)
for result in transcript.auto_highlights.results:
    print(result.text)
    print(result.rank)
    print(result.count)
    for timestamp in result.timestamps:
        print(f"Timestamp: {timestamp.start} - {timestamp.end}")
Read more about auto highlights here.
Real-Time Examples
Read more about our Real-Time service.
Stream your Microphone in Real-Time
import assemblyai as aai
def on_open(session_opened: aai.RealtimeSessionOpened):
    "This function is called when the connection has been established."
    print("Session ID:", session_opened.session_id)
def on_data(transcript: aai.RealtimeTranscript):
    "This function is called when a new transcript has been received."
    if not transcript.text:
        return
    if isinstance(transcript, aai.RealtimeFinalTranscript):
        print(transcript.text, end="\r\n")
    else:
        print(transcript.text, end="\r")
def on_error(error: aai.RealtimeError):
    "This function is called when an error occurs."
    print("An error occurred:", error)
def on_close():
    "This function is called when the connection has been closed."
    print("Closing Session")
transcriber = aai.RealtimeTranscriber(
on_data=on_data,
on_error=on_error,
sample_rate=44_100,
on_open=on_open,
on_close=on_close,
)
transcriber.connect()
microphone_stream = aai.extras.MicrophoneStream()
transcriber.stream(microphone_stream)
transcriber.close()
Transcribe a Local Audio File in Real-Time
import assemblyai as aai
def on_data(transcript: aai.RealtimeTranscript):
    "This function is called when a new transcript has been received."
    if not transcript.text:
        return
    if isinstance(transcript, aai.RealtimeFinalTranscript):
        print(transcript.text, end="\r\n")
    else:
        print(transcript.text, end="\r")
def on_error(error: aai.RealtimeError):
    "This function is called when an error occurs."
    print("An error occurred:", error)
transcriber = aai.RealtimeTranscriber(
on_data=on_data,
on_error=on_error,
sample_rate=44_100,
)
transcriber.connect()
file_stream = aai.extras.stream_file(
filepath="audio.wav",
sample_rate=44_100,
)
transcriber.stream(file_stream)
transcriber.close()
End-of-utterance controls
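transcriber = aai.RealtimeTranscriber(...)
# Manually end the current utterance.
transcriber.force_end_utterance()
# Configure the silence threshold for automatic end-of-utterance detection.
transcriber = aai.RealtimeTranscriber(
    ...,
    end_utterance_silence_threshold=500
)
# The threshold can also be changed during a session.
transcriber.configure_end_utterance_silence_threshold(300)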
Disable partial transcripts
transcriber = aai.RealtimeTranscriber(
...,
disable_partial_transcripts=True
)
Enable extra session information
def on_extra_session_information(data: aai.RealtimeSessionInformation):
    "This function is called when a session information message has been received."
    print(data.audio_duration_seconds)
transcriber = aai.RealtimeTranscriber(
...,
on_extra_session_information=on_extra_session_information,
)
Playgrounds
Visit one of our Playgrounds:
Advanced
How the SDK handles Default Configurations
Defining Defaults
When no TranscriptionConfig is passed to the Transcriber or its methods, it uses a default instance of TranscriptionConfig.
If you would like to reuse the same TranscriptionConfig for all your transcriptions, you can set it on the Transcriber directly:
config = aai.TranscriptionConfig(punctuate=False, format_text=False)
transcriber = aai.Transcriber(config=config)
transcriber.transcribe("https://example.org/audio.wav")
Overriding Defaults
You can override the default configuration later via the .config property of the Transcriber:
transcriber = aai.Transcriber()
transcriber.config = aai.TranscriptionConfig(punctuate=False, format_text=False)
If you want to override the Transcriber's configuration for a specific operation, pass a different one via the config parameter of a .transcribe*(...) method:
config = aai.TranscriptionConfig(punctuate=False, format_text=False)
transcriber = aai.Transcriber(config=config)
transcriber.transcribe(
"https://example.com/audio.mp3",
config=aai.TranscriptionConfig(dual_channel=True, disfluencies=True)
)
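The precedence described in this section (per-call config, then the Transcriber's config, then a default instance) can be illustrated with stand-in classes. This is a sketch under the assumption that a per-call config replaces the Transcriber's config entirely rather than being merged with it; Config and SketchTranscriber are hypothetical and their default values are illustrative, not the SDK's.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Config:
    """Stand-in for a transcription config (not the SDK class)."""
    punctuate: bool = True
    format_text: bool = True
    dual_channel: bool = False


class SketchTranscriber:
    """Stand-in showing only how the effective config is chosen."""

    def __init__(self, config: Optional[Config] = None):
        # Fall back to a default instance when none is given.
        self.config = config if config is not None else Config()

    def effective_config(self, config: Optional[Config] = None) -> Config:
        # A per-call config wins over the instance-level one.
        return config if config is not None else self.config


transcriber = SketchTranscriber(Config(punctuate=False, format_text=False))
instance_level = transcriber.effective_config()
per_call = transcriber.effective_config(Config(dual_channel=True))
```

Under this assumption, per_call has punctuate back at its default, because the instance-level settings are not carried over into the per-call config.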
Synchronous vs Asynchronous
Currently, the SDK provides two ways to transcribe audio files.
The synchronous approach halts the application's flow until the transcription has been completed.
The asynchronous approach allows the application to continue running while the transcription is being processed. The caller receives a concurrent.futures.Future object which can be used to check the status of the transcription at a later time.
You can identify those two approaches by the _async suffix in the Transcriber's method name (e.g. transcribe vs transcribe_async).
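For readers unfamiliar with concurrent.futures.Future, the sketch below shows the basic calls (done and result) on a future produced by a thread pool. fake_transcribe is a stand-in for a slow transcription call so the example runs without the SDK or network access.

```python
import concurrent.futures
import time


def fake_transcribe(url: str) -> str:
    """Stand-in for a slow, blocking transcription call."""
    time.sleep(0.1)
    return f"transcript of {url}"


with concurrent.futures.ThreadPoolExecutor() as pool:
    future = pool.submit(fake_transcribe, "https://example.org/audio.mp3")
    # The caller is free to do other work here; done() never blocks.
    still_running = not future.done()
    # result() blocks until the value is ready (or the timeout expires).
    text = future.result(timeout=5)

# With the SDK, the future would come from a *_async method instead, e.g.:
#   future = transcriber.transcribe_async("https://example.org/audio.mp3")
#   transcript = future.result()
```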
Polling Intervals
By default, we poll the Transcript's status every 3 seconds. To adjust that interval:
import assemblyai as aai
aai.settings.polling_interval = 1.0
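To show what a polling interval controls, here is a generic polling loop. It is not the SDK's internal implementation; wait_until_done is a hypothetical helper, and the status names "completed" and "error" are assumed terminal states.

```python
import time
from typing import Callable


def wait_until_done(
    get_status: Callable[[], str],
    polling_interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it reports a terminal state, then return it."""
    while True:
        status = get_status()
        if status in ("completed", "error"):
            return status
        sleep(polling_interval)  # wait before asking again


# Simulated status sequence; a fake sleep records the waits instead of blocking.
statuses = iter(["queued", "processing", "completed"])
waits = []
final_status = wait_until_done(
    lambda: next(statuses), polling_interval=1.0, sleep=waits.append
)
```

A shorter interval reduces the delay before a finished transcript is noticed, at the cost of more status requests.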
Retrieving Existing Transcripts
Retrieving a Single Transcript
If you previously created a transcript, you can use its ID to retrieve it later.
import assemblyai as aai
transcript = aai.Transcript.get_by_id("<TRANSCRIPT_ID>")
print(transcript.id)
print(transcript.text)
Retrieving Multiple Transcripts as a Group
You can also retrieve multiple existing transcripts and combine them into a single TranscriptGroup object. This allows you to perform operations on the transcript group as a single unit, such as querying the combined transcripts with LeMUR.
import assemblyai as aai
transcript_group = aai.TranscriptGroup.get_by_ids(["<TRANSCRIPT_ID_1>", "<TRANSCRIPT_ID_2>"])
summary = transcript_group.lemur.summarize(context="Customers asking for cars", answer_format="TLDR")
print(summary)
Retrieving Transcripts Asynchronously
Both Transcript.get_by_id and TranscriptGroup.get_by_ids have asynchronous counterparts, Transcript.get_by_id_async and TranscriptGroup.get_by_ids_async, respectively. These functions immediately return a Future object, rather than blocking until the transcript(s) are retrieved.
See the above section on Synchronous vs Asynchronous for more information.
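When several retrievals are in flight at once, concurrent.futures.as_completed lets you handle each result as soon as it arrives. The sketch below uses a stand-in fake_get_by_id in place of a real retrieval call so it runs offline; the dictionary shape it returns is illustrative only.

```python
import concurrent.futures


def fake_get_by_id(transcript_id: str) -> dict:
    """Stand-in for retrieving a transcript by ID."""
    return {"id": transcript_id, "text": f"text for {transcript_id}"}


ids = ["id-1", "id-2", "id-3"]
results = {}
with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(fake_get_by_id, i) for i in ids]
    # as_completed yields each future as it finishes, in completion order.
    for future in concurrent.futures.as_completed(futures, timeout=5):
        transcript = future.result()
        results[transcript["id"]] = transcript["text"]
```

Because completion order is not guaranteed, the results are keyed by ID rather than collected into a list.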