io.github.eslamwael74.speechtotextkit:speechToText

A Kotlin Multiplatform library for speech-to-text functionality.

Source

Maven Central

Version: 1.0.0

Version published: 7 months ago

Maintainers: 1

🎙️ SpeechToTextKit

SpeechToTextKit is a Kotlin Multiplatform library that provides a simple, unified API for speech-to-text. It currently supports Android and iOS; Desktop (JVM) and Web (Wasm) support is planned.

📋 Current Features

Cross-Platform Support: Works on Android and iOS
Reactive API: Receive speech recognition results as a Flow
Compose Integration: Easy to use with Jetpack Compose via rememberSpeechToText()
Seamless Integration: Integrates easily with existing KMP applications
State Callbacks: Monitor recognition state changes through Flow
Low Friction Setup: Minimal dependencies and configuration required
Error Handling: Detailed error reporting through the result API

📸 Screenshots

Basic Speech to Text Field
Custom Speech to Text Field
Custom Speech to Text Field with multiple lines and listening

🚀 Installation

The installation instructions are still in progress. Add the JitPack repository to your settings.gradle.kts:

dependencyResolutionManagement {
    repositories {
        // ...
        maven { url = uri("https://jitpack.io") }
    }
}

Then add the dependency to your module's build.gradle.kts:

dependencies {
    // Core library
    implementation("com.github.eslamwael74.speechtotextkit:speechToText:1.0.0")
    
    // Optional: Compose UI components
    implementation("com.github.eslamwael74.speechtotextkit:speechToTextCompose:1.0.0")
}
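
The package listing above gives the core artifact as io.github.eslamwael74.speechtotextkit:speechToText, version 1.0.0, on Maven Central, which differs from the JitPack coordinates in this README. As a hedged alternative, a setup that resolves the core library from Maven Central might look like the sketch below. Whether the Compose artifact is published under the same group is an assumption this page does not confirm, so it is left out.

```kotlin
// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        google()
        mavenCentral() // the package listing names Maven Central as the source
    }
}

// build.gradle.kts (module)
dependencies {
    // Coordinates taken from the package listing, not from the README's JitPack snippet
    implementation("io.github.eslamwael74.speechtotextkit:speechToText:1.0.0")
}
```

If this does not resolve, fall back to the JitPack setup shown above.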

📱 Usage

There are currently two ways to use this library:

1. In Jetpack Compose

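import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.padding
import androidx.compose.material.Button
import androidx.compose.material.Text
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp
import com.eslamwael74.speechtotextcompose.rememberSpeechToText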

@Composable
fun SpeechRecognitionScreen() {
    var recognizedText by remember { mutableStateOf("") }
    val speechRecognizer = rememberSpeechToText()
    var isListening by remember { mutableStateOf(false) }
    
    LaunchedEffect(Unit) {
        speechRecognizer.results.collect { result ->
            recognizedText = result.text
        }
    }
    
    Column(modifier = Modifier.fillMaxSize().padding(16.dp)) {
        Text(
            text = recognizedText.ifEmpty { "Tap the button and speak" },
            modifier = Modifier.weight(1f)
        )
        
        Button(onClick = {
            if (isListening) {
                // Stop listening
                speechRecognizer.stopListening()
                isListening = false
            } else {
                // Start listening
                speechRecognizer.startListening()
                isListening = true
            }
        }) {
            Text(if (isListening) "Stop Listening" else "Start Listening")
        }
    }
}

2. In a ViewModel

import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import com.eslamwael74.speechtotext.SpeechRecognizer
import com.eslamwael74.speechtotext.SpeechRecognizerFactory
import kotlinx.coroutines.flow.launchIn
import kotlinx.coroutines.flow.onEach
import kotlinx.coroutines.launch

// Using dependency injection
class YourViewModel(
    private val speechRecognizer: SpeechRecognizer
) : ViewModel() {
    init {
        // Listen for speech recognition results
        speechRecognizer.results.onEach { result ->
            // Handle result
            println("Recognized text: ${result.text}")
        }.launchIn(viewModelScope)
        
        // Monitor state changes
        speechRecognizer.state.onEach { state ->
            // Handle state changes
            println("Recognition state: $state")
        }.launchIn(viewModelScope)
    }
    
    fun startListening() {
        viewModelScope.launch {
            speechRecognizer.startListening()
        }
    }
    
    fun stopListening() {
        viewModelScope.launch {
            speechRecognizer.stopListening()
        }
    }
    
    fun cleanup() {
        speechRecognizer.destroy()
    }
}

// Example of factory/provider to create the SpeechRecognizer
class SpeechRecognizerProvider(
    private val applicationContext: Context
) {
    fun provideSpeechRecognizer(): SpeechRecognizer {
        return SpeechRecognizerFactory(applicationContext).createSpeechRecognizer()
    }
}

// Usage with Manual DI
class YourActivity : AppCompatActivity() {
    private val speechRecognizerProvider by lazy {
        SpeechRecognizerProvider(applicationContext)
    }
    
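    // Requires androidx.activity.viewModels, plus initializer and viewModelFactory
    // from androidx.lifecycle.viewmodel
    private val viewModel: YourViewModel by viewModels {
        viewModelFactory {
            initializer {
                YourViewModel(speechRecognizerProvider.provideSpeechRecognizer())
            }
        }
    }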
}

// Or with Hilt/Dagger
@Module
@InstallIn(SingletonComponent::class)
object SpeechModule {
    @Provides
    @Singleton
    fun provideSpeechRecognizer(@ApplicationContext context: Context): SpeechRecognizer {
        return SpeechRecognizerFactory(context).createSpeechRecognizer()
    }
}

📝 Platform-Specific Setup

Android

Add the following permission to your AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

You'll also need to request this permission at runtime.
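
Because the library does not yet request permissions itself (see Upcoming Features), the host app has to do it. The following is a minimal sketch using the AndroidX Activity Result API. The class name SpeechActivity and the two hook functions are hypothetical placeholders, not part of SpeechToTextKit.

```kotlin
import android.Manifest
import android.content.pm.PackageManager
import android.os.Bundle
import androidx.activity.result.contract.ActivityResultContracts
import androidx.appcompat.app.AppCompatActivity
import androidx.core.content.ContextCompat

class SpeechActivity : AppCompatActivity() {

    // Launchers must be registered before the activity is started,
    // so this is declared as a property rather than created on demand.
    private val requestMicPermission =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) onMicrophoneReady() else onMicrophoneDenied()
        }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        ensureMicrophonePermission()
    }

    private fun ensureMicrophonePermission() {
        val alreadyGranted = ContextCompat.checkSelfPermission(
            this, Manifest.permission.RECORD_AUDIO
        ) == PackageManager.PERMISSION_GRANTED

        if (alreadyGranted) {
            onMicrophoneReady()
        } else {
            // Shows the system permission dialog; the result arrives in the callback above.
            requestMicPermission.launch(Manifest.permission.RECORD_AUDIO)
        }
    }

    private fun onMicrophoneReady() {
        // Hypothetical hook: from here it is safe to call startListening().
    }

    private fun onMicrophoneDenied() {
        // Hypothetical hook: explain why the microphone is needed, or disable the mic button.
    }
}
```

In a Compose-only screen, rememberLauncherForActivityResult with the same RequestPermission contract plays the equivalent role.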

iOS

Add the following to your Info.plist:

<key>NSMicrophoneUsageDescription</key>
<string>This app needs access to your microphone for speech recognition</string>
<key>NSSpeechRecognitionUsageDescription</key>
<string>This app uses speech recognition to convert your speech to text</string>

🚧 Upcoming Features

The following features are planned but not yet implemented:

Compose UI component with built-in microphone button
Customizable recognition parameters (language, timeout, etc.)
Offline recognition support where available
Improved error handling and recovery
Text-to-Speech capabilities
Runtime permission requests on Android
TextField Composable with integrated microphone button
Support for more languages and dialects
Support for WebAssembly (Wasm) web applications
Support for desktop platforms (JVM)
Support for macOS

🧪 Example App

Check out the included example app in the /example directory for a complete implementation of SpeechToTextKit.

🙌 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the Project
Create your Feature Branch (git checkout -b feature/amazing-feature)
Commit your Changes (git commit -m 'Add some amazing feature')
Push to the Branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

Distributed under the Apache 2.0 License. See LICENSE for more information.

📞 Contact

Eslam Wael - @eslamwael74

Project Link: https://github.com/eslamwael74/speechtotextkit

Package last updated on 10 Jun 2025
