| daily_bot<br>livekit_bot<br>agora_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq / together api llm(text), tts_edge |  | CPU (free, 2 cores) | e.g.: daily / livekit room in stream -> silero (vad) -> sense_voice (asr) -> groq / together (llm) -> edge (tts) -> daily / livekit room out stream |
| generate_audio2audio | remote_queue_chat_bot_be_worker |  | T4(free) | e.g.: pyaudio in stream -> silero (vad) -> sense_voice (asr) -> qwen (llm) -> cosy_voice (tts) -> pyaudio out stream |
| daily_describe_vision_tools_bot<br>livekit_describe_vision_tools_bot<br>agora_describe_vision_tools_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, deepgram_asr, google_gemini, tts_edge |  | CPU (free, 2 cores) | e.g.: daily / livekit room in stream -> silero (vad) -> deepgram (asr) -> google gemini -> edge (tts) -> daily / livekit room out stream |
| daily_describe_vision_bot<br>livekit_describe_vision_bot<br>agora_describe_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, llm_transformers_manual_vision_qwen, tts_edge | achatbot_vision_qwen_vl.ipynb<br>achatbot_vision_janus.ipynb<br>achatbot_vision_minicpmo.ipynb<br>achatbot_kimivl.ipynb<br>achatbot_phi4_multimodal.ipynb | - Qwen2-VL-2B-Instruct: T4 (free)<br>- Qwen2-VL-7B-Instruct: L4<br>- Llama-3.2-11B-Vision-Instruct: L4<br>- allenai/Molmo-7B-D-0924: A100 | e.g.: daily / livekit room in stream -> silero (vad) -> sense_voice (asr) -> qwen-vl (llm) -> edge (tts) -> daily / livekit room out stream |
| daily_chat_vision_bot<br>livekit_chat_vision_bot<br>agora_chat_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, llm_transformers_manual_vision_qwen, tts_edge |  | - Qwen2-VL-2B-Instruct: T4 (free)<br>- Qwen2-VL-7B-Instruct: L4<br>- Llama-3.2-11B-Vision-Instruct: L4<br>- allenai/Molmo-7B-D-0924: A100 | e.g.: daily / livekit room in stream -> silero (vad) -> sense_voice (asr) -> llm answer guide qwen-vl (llm) -> edge (tts) -> daily / livekit room out stream |
| daily_chat_tools_vision_bot<br>livekit_chat_tools_vision_bot<br>agora_chat_tools_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq api llm(text), tools:<br>- llm_transformers_manual_vision_qwen,<br>tts_edge |  | - Qwen2-VL-2B-Instruct: T4 (free)<br>- Qwen2-VL-7B-Instruct: L4<br>- Llama-3.2-11B-Vision-Instruct: L4<br>- allenai/Molmo-7B-D-0924: A100 | e.g.: daily / livekit room in stream -> silero (vad) -> sense_voice (asr) -> llm with tools qwen-vl -> edge (tts) -> daily / livekit room out stream |
| daily_annotate_vision_bot<br>livekit_annotate_vision_bot<br>agora_annotate_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, vision_yolo_detector, tts_edge |  | T4 (free) | e.g.: daily / livekit room in stream -> vision_yolo_detector -> edge (tts) -> daily / livekit room out stream |
| daily_detect_vision_bot<br>livekit_detect_vision_bot<br>agora_detect_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, vision_yolo_detector, tts_edge |  | T4 (free) | e.g.: daily / livekit room in stream -> vision_yolo_detector -> edge (tts) -> daily / livekit room out stream |
| daily_ocr_vision_bot<br>livekit_ocr_vision_bot<br>agora_ocr_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, vision_transformers_got_ocr, tts_edge |  | T4 (free) | e.g.: daily / livekit room in stream -> silero (vad) -> sense_voice (asr) -> vision_transformers_got_ocr -> edge (tts) -> daily / livekit room out stream |
| daily_month_narration_bot | e.g.: daily_room_audio_stream, groq / together api llm(text), hf_sd, together api (image), tts_edge |  | when using an sd model with diffusers:<br>T4 (free): cpu+cuda (slow)<br>L4: cpu+cuda<br>A100: all cuda | e.g.: daily room in stream -> together (llm) -> hf sd gen image model -> edge (tts) -> daily room out stream |
| daily_storytelling_bot | e.g.: daily_room_audio_stream, groq / together api llm(text), hf_sd, together api (image), tts_edge |  | cpu (2 cores)<br>when using an sd model with diffusers:<br>T4 (free): cpu+cuda (slow)<br>L4: cpu+cuda<br>A100: all cuda | e.g.: daily room in stream -> together (llm) -> hf sd gen image model -> edge (tts) -> daily room out stream |
| websocket_server_bot<br>fastapi_websocket_server_bot | e.g.: websocket_server, sense_voice_asr, groq / together api llm(text), tts_edge |  | cpu (2 cores) | e.g.: websocket protocol in stream -> silero (vad) -> sense_voice (asr) -> together (llm) -> edge (tts) -> websocket protocol out stream |
| daily_natural_conversation_bot | e.g.: daily_room_audio_stream, sense_voice_asr, groq / together api llm(NLP task), gemini-1.5-flash (chat), tts_edge |  | cpu (2 cores) | e.g.: daily room in stream -> together (llm NLP task) -> gemini-1.5-flash model (chat) -> edge (tts) -> daily room out stream |
| fastapi_websocket_moshi_bot | e.g.: websocket_server, moshi opus stream voice llm |  | L4/A100 | websocket protocol in stream -> silero (vad) -> moshi opus stream voice llm -> websocket protocol out stream |
| daily_asr_glm_voice_bot<br>daily_glm_voice_bot | e.g.: daily_room_audio_stream, glm voice llm |  | T4/L4/A100 | e.g.: daily room in stream -> glm4-voice -> daily room out stream |
| daily_freeze_omni_voice_bot | e.g.: daily_room_audio_stream, freezeOmni voice llm |  | L4/A100 | e.g.: daily room in stream -> freezeOmni-voice -> daily room out stream |
| daily_asr_minicpmo_voice_bot<br>daily_minicpmo_voice_bot<br>daily_minicpmo_vision_voice_bot | e.g.: daily_room_audio_stream, minicpmo llm |  | T4: MiniCPM-o-2_6-int4<br>L4/A100: MiniCPM-o-2_6 | e.g.: daily room in stream -> minicpmo -> daily room out stream |
| livekit_asr_qwen2_5omni_voice_bot<br>livekit_qwen2_5omni_voice_bot<br>livekit_qwen2_5omni_vision_voice_bot | e.g.: livekit_room_audio_stream, qwen2.5omni llm |  | A100 | e.g.: livekit room in stream -> qwen2.5omni -> livekit room out stream |
| livekit_asr_kimi_voice_bot<br>livekit_kimi_voice_bot | e.g.: livekit_room_audio_stream, kimi audio llm |  | A100 | e.g.: livekit room in stream -> Kimi-Audio -> livekit room out stream |
| livekit_asr_vita_voice_bot<br>livekit_vita_voice_bot | e.g.: livekit_room_audio_stream, vita audio llm |  | L4/A100 | e.g.: livekit room in stream -> VITA-Audio -> livekit room out stream |
| daily_phi4_voice_bot<br>daily_phi4_vision_speech_bot | e.g.: daily_room_audio_stream, phi4-multimodal llm |  | L4/A100 | e.g.: daily room in stream -> phi4-multimodal -> edge (tts) -> daily room out stream |
| daliy_multi_mcp_bot<br>livekit_multi_mcp_bot<br>agora_multi_mcp_bot | e.g.: agora_channel_audio_stream / daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq / together api llm(text), mcp, tts_edge |  | CPU (free, 2 cores) | e.g.: agora / daily / livekit room in stream -> silero (vad) -> sense_voice (asr) -> groq / together (llm) -> mcp server tools -> edge (tts) -> daily / livekit room out stream |
| daily_liteavatar_chat_bot<br>daily_liteavatar_echo_bot<br>livekit_musetalk_chat_bot<br>livekit_musetalk_echo_bot | e.g.: agora_channel_audio_stream / daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq / together api llm(text), tts_edge, avatar | achatbot_avatar_musetalk.ipynb | CPU/T4/L4 | e.g.: agora / daily / livekit room in stream -> silero (vad) -> sense_voice (asr) -> groq / together (llm) -> edge (tts) -> avatar -> daily / livekit room out stream |
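
Most rows in the table describe the same shape: an input stream feeds a chain of stages (vad -> asr -> llm -> tts) that ends in an output stream. The following is a minimal sketch of that chaining idea only. It does not use achatbot's actual classes or APIs; `run_pipeline` and the four stand-in stages are hypothetical names, and strings stand in for audio frames.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical stage type: takes one item, returns the transformed item,
# or None to drop it (e.g. VAD discarding silence).
Stage = Callable[[object], Optional[object]]


def run_pipeline(stages: List[Stage], items: Iterable[object]) -> List[object]:
    """Push each input item through every stage in order."""
    results = []
    for item in items:
        for stage in stages:
            item = stage(item)
            if item is None:
                break  # a stage dropped the item; skip the remaining stages
        else:
            results.append(item)  # reached only if no stage dropped the item
    return results


# Stand-in stages (assumptions for illustration, not real model calls).
def vad(frame: str) -> Optional[str]:
    # Stand-in for silero (vad): treat blank frames as silence.
    return frame if frame.strip() else None


def asr(frame: str) -> str:
    # Stand-in for sense_voice (asr): "transcribe" by normalizing the text.
    return frame.strip().lower()


def llm(text: str) -> str:
    # Stand-in for a groq / together text LLM call.
    return f"you said: {text}"


def tts(text: str) -> bytes:
    # Stand-in for edge (tts): encode the reply as bytes.
    return text.encode("utf-8")


frames = ["Hello ", "   ", "How are you"]
audio_out = run_pipeline([vad, asr, llm, tts], frames)
```

Swapping a stage, for example replacing `asr` with a different recognizer, changes one list entry and leaves the rest of the chain untouched, which is how the rows above differ from one another.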