Android App v2.0.0

SrizonVoice for Android

System-wide voice dictation for Android built around Google Gemini. Spoken-language detection is automatic across all major languages, and the optional Translate action can return the text in a target language. Tap the floating bubble to start, tap again to stop, and the result lands at your cursor in whatever app you’re in.

Free to download and use. Bring your own Gemini API key; typical dictation usage should cost very little.

Requires Android 12 (API 31) or later • The only cost is Gemini API usage, usually very low • See the install instructions below

Not on Google Play (yet)

Apps that use an Accessibility Service need a separate Play Store policy review, so for now SrizonVoice ships as a sideload from GitHub Releases. The APK is signed with a stable release key, so you can install future updates in place without uninstalling.

Android will warn you about installing from an unknown source; that’s normal for sideloaded apps. See the install instructions below for the step-by-step walkthrough.

Why not just use Gboard?

Android already has voice typing built into the keyboard. It’s fine for one-sentence replies, but it falls apart on anything longer. Here’s the gap SrizonVoice fills.

Gboard: real-time streaming

Google’s built-in voice typing transcribes as you speak, in real time. It optimizes for latency, so on long recordings it drops chunks, restarts mid-sentence, garbles proper nouns, and never sees the full sentence in context. Good enough for a quick reply, painful for a paragraph.

SrizonVoice: batch with Gemini

Records the whole take first, then sends it to Gemini in one shot. The model gets every word in context, so punctuation, paragraphs, technical terms, code, and names hold up. It can also clean up filler words or translate into a target language.

The tradeoff: you wait a second or two after you stop recording instead of watching words appear live. For anything longer than a sentence, it’s worth it.

What’s in the Box

Same Gemini-first transcription and translation pipeline as the macOS app, adapted for Android idioms

Tap-to-Dictate (Handsfree)

Tap the floating bubble once to start, tap again to stop. This default mode suits long dictations better than holding a finger on a tiny target, and an auto-stop timer, set in seconds, acts as a guardrail.

Push-to-Talk Option

Prefer the macOS-style hold? Flip a setting to switch the bubble to push-to-talk: hold to record, release to transcribe, drag up to cancel.

System-wide Text Insertion

An Accessibility Service inserts your transcript directly into the focused text field. Falls back to clipboard paste when the field doesn’t expose proper accessibility nodes.

Transcription + Translation

Gemini handles transcription, correction, custom prompts, and translation in one request, using the same output modes as the macOS app.
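To make "one request" concrete, here is a minimal sketch of how a single Gemini REST generateContent body could carry both the instruction and the recorded audio. It is not the app's actual code: the model name, the prompt wording, the audio/wav MIME type, and the helper name build_request are all assumptions for illustration.

```python
import base64
from typing import Optional

# Assumed model name; the app's real choice is not stated on this page.
MODEL = "gemini-2.0-flash"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)

def build_request(wav_bytes: bytes, translate_to: Optional[str] = None) -> dict:
    """Build one request body holding the instruction and the inline audio."""
    # Hypothetical prompt text covering transcription and correction.
    instruction = (
        "Transcribe this audio. Detect the spoken language automatically "
        "and fix punctuation."
    )
    if translate_to:
        # Translation rides along in the same request instead of a second call.
        instruction += f" Then translate the result into {translate_to}."
    return {
        "contents": [{
            "parts": [
                {"text": instruction},
                {"inline_data": {
                    "mime_type": "audio/wav",  # assumed container format
                    "data": base64.b64encode(wav_bytes).decode("ascii"),
                }},
            ]
        }]
    }
```

The body would be sent as JSON in a POST to ENDPOINT with your own API key. Because the instruction and the audio travel together, the model sees the whole recording when it punctuates, corrects, or translates.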

All Major Languages

Speak in any major language without choosing a source language. Gemini detects the spoken language automatically from the audio.

Live Waveform

30-bar live waveform with the same coral → purple → blue gradient as the macOS app, driven by the RMS audio level and updated at 60 Hz.
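For readers curious how an RMS-driven bar display can work, the sketch below computes a normalized RMS level per audio frame and keeps the latest 30 levels in a rolling buffer. It assumes 16-bit PCM samples; the names rms_level, push_frame, and bars are hypothetical and not taken from the app.

```python
import math
from collections import deque

BAR_COUNT = 30        # bars on screen, per the description above
FULL_SCALE = 32768.0  # largest 16-bit sample magnitude (assumed sample format)

def rms_level(frame):
    """Return the RMS of a frame of 16-bit samples, normalized to 0.0..1.0."""
    if not frame:
        return 0.0
    mean_square = sum(s * s for s in frame) / len(frame)
    return min(1.0, math.sqrt(mean_square) / FULL_SCALE)

# Rolling buffer: appending a new level drops the oldest bar automatically.
bars = deque([0.0] * BAR_COUNT, maxlen=BAR_COUNT)

def push_frame(frame):
    """Record the level of the newest frame as the right-most bar."""
    bars.append(rms_level(frame))
```

Calling push_frame once per display tick, roughly every 1/60 of a second, with the samples captured since the previous tick would give one new bar per refresh.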

Bring Your Own Key

The app is free. Your Gemini API key is stored in EncryptedSharedPreferences; you pay Google only for Gemini API usage, which should be very low for typical dictation.

How It Works

Three steps to hands-free typing, from any Android app

Tap to Start

Tap the floating bubble from any app to start recording. Hold instead if you've switched to push-to-talk mode.

Speak Naturally

A live 30-bar waveform reacts to your voice. Audio is captured at 16 kHz mono before it is sent to Gemini.
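As a rough guide to how much data a take produces, the sketch below works out the raw size of 16 kHz mono audio and wraps PCM bytes in a WAV container in memory. The 16-bit sample width and the use of WAV are assumptions; the helper names are hypothetical.

```python
import io
import wave

SAMPLE_RATE = 16_000  # Hz, from the text
CHANNELS = 1          # mono, from the text
SAMPLE_WIDTH = 2      # bytes per sample; 16-bit PCM is an assumption

def raw_size_bytes(seconds: float) -> int:
    """Size of uncompressed PCM for a recording of the given length."""
    return int(seconds * SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)

def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM bytes in a WAV header without touching the disk."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()
```

Under these assumptions, one second of audio is 32,000 bytes, so a one-minute take is about 1.9 MB before any encoding.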

Tap Again to Send

Tap once more, choose Translate when needed, or let auto-stop end a long take. Gemini returns the final text for insertion at your cursor.

Install on Your Device

Two paths: grab the prebuilt APK (recommended), or build it yourself from source

Option A · Recommended

Install the prebuilt APK

1. Download the APK

Grab app-release.apk (~46 MB) directly from GitHub Releases. You can download it on your phone, or on a computer and transfer it across.

2. Allow installs from your browser / file manager

Open Settings → Apps → Special access → Install unknown apps on your phone, pick the app you used to download the APK (Chrome, Files, Drive, etc.), and toggle Allow from this source.

3. Open the APK and install

Tap the downloaded file. Android will show a confirmation sheet — review it and tap Install. Already on an earlier SrizonVoice release? This installs as an in-place upgrade (same signing key), so no uninstall needed.

4. Walk through onboarding

Paste a Gemini key from Google AI Studio, then grant Microphone, Notifications, Display-over-other-apps, and Accessibility permissions when prompted.

or, for the curious

Option B · For developers

Build from source

1. Clone the repo

Clone github.com/AfzalH/voice-android or download the source as a ZIP.

2. Open in Android Studio

Use Android Studio Ladybug (2024.2.1) or later. It bundles JDK 17 and Gradle 8.10 and will generate the missing gradle-wrapper.jar on first sync.

3. Build a debug APK

Run ./gradlew assembleDebug — or use Build → Build Bundle(s) / APK(s) → Build APK(s) in the IDE.

4. Install over ADB

Enable Developer options → USB debugging, plug in a device running Android 12+ (API 31+), and run adb install app/build/outputs/apk/debug/app-debug.apk. Or just hit Run in Android Studio.

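# Quick start: clone, build a debug APK, and install it on a connected device
git clone https://github.com/AfzalH/voice-android.git
cd voice-android
./gradlew assembleDebug   # writes app/build/outputs/apk/debug/app-debug.apk
adb install app/build/outputs/apk/debug/app-debug.apk   # needs USB debugging enabled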

Heads up: debug builds are signed with Android’s debug key, so they conflict with the release APK. You’ll need to uninstall one before installing the other.

Requirements

Device & Permissions

  • Android 12 (API 31) or later
  • Microphone permission
  • Notifications permission
  • Display over other apps (for the bubble)
  • Accessibility Service (for text insertion)

Build & API Keys

  • Android Studio Ladybug (2024.2.1) or later
  • Android SDK platform 35 + build-tools 35.0.0
  • Gemini API key (required; usage paid to Google, usually very low)

Privacy & Security

Audio is only captured during an active dictation session — from the tap that starts recording to the tap (or auto-stop) that ends it — and is sent directly from your device to Google Gemini for transcription, correction, and optional translation. No data passes through Srizon servers, and your API key is stored in EncryptedSharedPreferences.

See the Google Privacy Policy for how Google, as the provider, handles your data.

Try It on Your Device

Download the APK, grant a few permissions, and paste a Gemini key. You’ll be dictating into any Android app within two minutes. The app is free; your only cost is Gemini API usage.