System-wide voice dictation for Android built around Google Gemini. Automatic spoken-language detection works across all major languages, and the optional Translate action can return the result in a target language. Tap the floating bubble to start, tap again to stop, and the result lands at your cursor in whatever app you’re in.
Requires Android 12 (API 31) or later • Free app; you pay only for Gemini API usage, usually very low • Install instructions below
Accessibility-Service apps need a separate Play Store policy review, so for now SrizonVoice ships as a sideload from GitHub Releases. The APK is signed with a stable release key — you can install future updates in-place without uninstalling.
Android will warn you about installing from an unknown source; that’s normal for sideloaded apps. See the install instructions below for the two-tap walkthrough.
Android already has voice typing built into the keyboard. It’s fine for one-sentence replies but falls apart on anything longer. Here’s the gap SrizonVoice fills.
Google’s built-in voice typing transcribes as you speak, in real time. It optimizes for latency, so on long recordings it drops chunks, restarts mid-sentence, garbles proper nouns, and never sees the full sentence in context. Good enough for a quick reply, painful for a paragraph.
SrizonVoice records the whole take first, then sends it to Gemini in one shot. The model gets every word in context, so punctuation, paragraphs, technical terms, code, and names hold up. It can also clean up filler words or translate into a target language.
The tradeoff: you wait a second or two after you stop recording instead of watching words appear live. For anything longer than a sentence, it’s worth it.
Same Gemini-first transcription and translation pipeline as the macOS app, adapted for Android idioms
Tap the floating bubble once to start, tap again to stop. This default tap-to-toggle mode suits long dictations better than holding a finger on a tiny target, and an auto-stop timer, set in seconds, acts as a guardrail.
Prefer the macOS-style hold? Flip a setting to switch the bubble to push-to-talk: hold to record, release to transcribe, drag up to cancel.
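The two bubble modes can be pictured as a small state machine. The Python sketch below is illustrative only and is not the app's code: the `BubbleController` class, its method names, the returned action strings, and the 120-second auto-stop default are all assumptions made for the example. It also assumes auto-stop applies only in tap-to-toggle mode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BubbleController:
    """Hypothetical model of the bubble's two recording modes."""
    push_to_talk: bool = False        # False = tap to start, tap to stop
    auto_stop_seconds: float = 120.0  # assumed guardrail value
    recording: bool = False
    started_at: Optional[float] = None

    def _start(self, now: float) -> str:
        self.recording, self.started_at = True, now
        return "start"

    def _stop(self, action: str) -> str:
        self.recording, self.started_at = False, None
        return action

    def on_tap(self, now: float) -> str:
        """Toggle mode: first tap starts, second tap stops and transcribes."""
        if self.push_to_talk:
            return "ignored"
        return self._stop("transcribe") if self.recording else self._start(now)

    def on_press(self, now: float) -> str:
        """Push-to-talk: recording begins when the finger goes down."""
        if not self.push_to_talk or self.recording:
            return "ignored"
        return self._start(now)

    def on_release(self, dragged_up: bool = False) -> str:
        """Push-to-talk: release transcribes, dragging up cancels."""
        if not self.push_to_talk or not self.recording:
            return "ignored"
        return self._stop("cancel" if dragged_up else "transcribe")

    def on_tick(self, now: float) -> str:
        """Auto-stop guardrail: end a long toggle-mode take and transcribe it."""
        if (self.recording and not self.push_to_talk
                and now - self.started_at >= self.auto_stop_seconds):
            return self._stop("transcribe")
        return "no-op"
```

Keeping both modes behind one controller means the recording and transcription code never needs to know which gesture started the take.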
An Accessibility Service inserts your transcript directly into the focused text field. It falls back to clipboard paste when the field doesn’t expose proper accessibility nodes.
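The insert-then-fall-back behaviour can be sketched in a few lines. This is a platform-neutral Python illustration, not the Android implementation: `splice_at_cursor`, `insert_transcript`, and the `paste_via_clipboard` callback are hypothetical names, and signalling an unreadable field with `None` is an assumption.

```python
from typing import Callable, Optional, Tuple


def splice_at_cursor(text: str, sel_start: int, sel_end: int,
                     transcript: str) -> Tuple[str, int]:
    """Replace the current selection (or empty cursor range) with the transcript.

    Returns the new field text and the cursor position after the insert.
    """
    new_text = text[:sel_start] + transcript + text[sel_end:]
    return new_text, sel_start + len(transcript)


def insert_transcript(field_text: Optional[str], sel_start: int, sel_end: int,
                      transcript: str,
                      paste_via_clipboard: Callable[[str], None]
                      ) -> Optional[Tuple[str, int]]:
    """Try direct insertion first; fall back to a clipboard paste.

    field_text is None when the focused field does not expose usable
    accessibility data (an assumption about how that case is signalled).
    """
    readable = field_text is not None and 0 <= sel_start <= sel_end <= len(field_text)
    if not readable:
        paste_via_clipboard(transcript)
        return None
    return splice_at_cursor(field_text, sel_start, sel_end, transcript)
```

For example, `splice_at_cursor("Hello world", 6, 6, "big ")` returns `("Hello big world", 10)`.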
Gemini handles transcription, correction, custom prompts, and translation in one request, using the same output modes as the macOS app.
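To make "one request" concrete, here is a rough sketch of a single request body that carries the instructions and the audio together. The layout follows the public Gemini `generateContent` REST format (a text part plus an `inline_data` part), but treat it as an approximation to check against the current API reference. The function name, the prompt wording, and the choice of WAV are assumptions; the app's real prompts and output modes are not shown in this document.

```python
import base64
import json
from typing import Optional


def build_request_body(wav_bytes: bytes, clean_up: bool = True,
                       translate_to: Optional[str] = None) -> str:
    """Build one JSON body that asks for transcription plus optional extras."""
    # Prompt wording is illustrative, not the app's actual prompt.
    instructions = [
        "Transcribe the attached audio. Detect the spoken language automatically."
    ]
    if clean_up:
        instructions.append("Remove filler words and fix punctuation.")
    if translate_to:
        instructions.append(f"Translate the final text into {translate_to}.")
    instructions.append("Return only the final text.")

    body = {
        "contents": [{
            "parts": [
                {"text": " ".join(instructions)},
                {"inline_data": {
                    "mime_type": "audio/wav",  # assumed container format
                    "data": base64.b64encode(wav_bytes).decode("ascii"),
                }},
            ]
        }]
    }
    return json.dumps(body)
```

Because the cleanup and translation instructions travel with the audio, no second round trip is needed to post-process the transcript.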
Speak in any major language without choosing a source language. Gemini detects the spoken language automatically from the audio.
30-bar live waveform with the same coral → purple → blue gradient as the macOS app, driven by RMS at 60 Hz.
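For the curious, an RMS-driven bar display boils down to one level calculation per update. The sketch below is a minimal Python illustration, assuming 16-bit little-endian PCM input; `frame_rms` and `Waveform` are hypothetical names, and the gradient and drawing code are left out. At a 16 kHz sample rate and 60 updates per second, each update would cover roughly 267 samples.

```python
import math
import struct
from collections import deque

BAR_COUNT = 30          # bars shown in the waveform
FULL_SCALE = 32768.0    # largest magnitude of a signed 16-bit sample


def frame_rms(pcm: bytes) -> float:
    """RMS level of little-endian 16-bit PCM, normalised to 0.0..1.0."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[:count * 2])
    mean_square = sum(s * s for s in samples) / count
    return min(1.0, math.sqrt(mean_square) / FULL_SCALE)


class Waveform:
    """Rolling window of the most recent BAR_COUNT levels."""

    def __init__(self) -> None:
        self.bars = deque([0.0] * BAR_COUNT, maxlen=BAR_COUNT)

    def push(self, pcm: bytes) -> None:
        # Appending to a full deque drops the oldest bar automatically.
        self.bars.append(frame_rms(pcm))
```

RMS tracks perceived loudness better than peak amplitude, so the bars follow your voice rather than jumping on every click or pop.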
The app is free. Your Gemini API key is stored in EncryptedSharedPreferences; you pay only for Google Gemini API usage, which should be very low for typical dictation.
Three steps to hands-free typing, from any Android app
Tap the floating bubble from any app to start recording. Hold instead if you've switched to push-to-talk mode.
A live 30-bar waveform reacts to your voice. Audio is captured at 16 kHz mono before it is sent to Gemini.
Tap once more, choose Translate when needed, or let auto-stop end a long take. Gemini returns the final text for insertion at your cursor.
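As a rough illustration of the capture step, the Python sketch below wraps raw 16 kHz mono samples in a WAV container and estimates how large a take is. The 16-bit sample width and the WAV container are assumptions (the text above states only the sample rate and channel count), and `pcm_to_wav` and `take_size_bytes` are hypothetical helper names.

```python
import io
import wave

SAMPLE_RATE_HZ = 16_000  # from the capture step above
CHANNELS = 1             # mono
SAMPLE_WIDTH_BYTES = 2   # assumed 16-bit PCM


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM samples in a WAV header, entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buffer.getvalue()


def take_size_bytes(seconds: float) -> int:
    """Raw PCM size of a take of the given length."""
    return int(seconds * SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES)
```

Under these assumptions one second of speech is 32,000 bytes of raw audio, so a one-minute take is a little under 2 MB before any encoding overhead.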
Two paths: grab the prebuilt APK (recommended), or build it yourself from source
Option A · Recommended
Grab app-release.apk (~46 MB) directly from GitHub Releases. You can download it on your phone, or on a computer and transfer it across.
Open Settings → Apps → Special access → Install unknown apps on your phone, pick the app you used to download the APK (Chrome, Files, Drive, etc.), and toggle Allow from this source.
Tap the downloaded file. Android will show a confirmation sheet — review it and tap Install. Already on an earlier SrizonVoice release? This installs as an in-place upgrade (same signing key), so no uninstall needed.
Paste a Gemini API key from Google AI Studio, then grant Microphone, Notifications, Display-over-other-apps, and Accessibility permissions when prompted.
Option B · For developers
Clone github.com/AfzalH/voice-android or download the source as a ZIP.
Use Android Studio Ladybug (2024.2.1) or later. It bundles JDK 17 and Gradle 8.10 and will generate the missing gradle-wrapper.jar on first sync.
Run ./gradlew assembleDebug — or use Build → Build Bundle(s) / APK(s) → Build APK(s) in the IDE.
Enable Developer options → USB debugging, plug in a device running Android 12+ (API 31+), and run adb install app/build/outputs/apk/debug/app-debug.apk. Or just hit Run in Android Studio.
Heads up: debug builds are signed with Android’s debug key, so they conflict with the release APK. You’ll need to uninstall one before installing the other.
Audio is only captured during an active dictation session — from the tap that starts recording to the tap (or auto-stop) that ends it — and is sent directly from your device to Google Gemini for transcription, correction, and optional translation. No data passes through Srizon servers, and your API key is stored in EncryptedSharedPreferences.
See the Google Privacy Policy for the provider's own data-handling terms.
Download the APK, grant a few permissions, and paste a Gemini API key, and you’ll be dictating into any Android app within two minutes. The app is free; your only cost is Gemini API usage.