Android App v1.1.0

SrizonVoice for Android

System-wide voice dictation for Android built around Whisper Large v3 instead of Gboard’s real-time engine — much better accuracy on long-form dictation. Tap the floating bubble to start, tap again to stop, and the transcript lands at your cursor in whatever app you’re in.

Requires Android 12 (API 31) or later • BYOK (free tier on Groq) • install instructions

Not on Google Play (yet)

Accessibility-Service apps need a separate Play Store policy review, so for now SrizonVoice ships as a sideload from GitHub Releases. The APK is signed with a stable release key — you can install future updates in-place without uninstalling.

Android will warn you about installing from an unknown source; that’s normal for sideloaded apps. See the install instructions below for the two-tap walkthrough.

Why not just use Gboard?

Android already has voice typing built into the keyboard. It’s fine for one-sentence replies but falls apart on anything longer. Here’s the gap SrizonVoice fills.

Gboard: real-time streaming

Google’s built-in voice typing transcribes as you speak, in real time. It optimizes for latency, so on long recordings it drops chunks, restarts mid-sentence, garbles proper nouns, and never sees the full sentence in context. Good enough for a quick reply, painful for a paragraph.

SrizonVoice: batch with Whisper v3

Records the whole take first, then sends it in one shot to OpenAI’s Whisper Large v3 model, hosted on Groq. The model gets every word in context, so punctuation, paragraphs, technical terms, code, and names all hold up. An optional Gemini pass on top cleans up filler words.

The tradeoff: you wait a second or two after you stop recording instead of watching words appear live. For anything longer than a sentence, it’s worth it.
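
For a sense of what “one shot” means in practice, here is a minimal sketch of a batch request against Groq’s OpenAI-compatible transcription endpoint. It assumes curl is installed and that your key is exported as GROQ_API_KEY. The function name, the file name, and the plain-text response format are illustrative choices, and the app’s actual request parameters may differ.

```shell
# Sketch only: upload one finished recording and print the transcript.
# GROQ_API_KEY must be set in the environment; transcribe_take is a
# hypothetical helper name, not something shipped with SrizonVoice.
transcribe_take() {
  audio_file="$1"
  curl -sS https://api.groq.com/openai/v1/audio/transcriptions \
    -H "Authorization: Bearer ${GROQ_API_KEY}" \
    -F "file=@${audio_file}" \
    -F "model=whisper-large-v3" \
    -F "response_format=text"
}
```

Running `transcribe_take take.wav` should print the transcript only after the whole file has been processed, which also makes it a quick way to check that a new Groq key works.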

What’s in the Box

Same Groq Whisper pipeline as the macOS app, adapted for Android idioms

Tap-to-Dictate (Hands-free)

Tap the floating bubble once to start, tap again to stop. Default mode — better for long dictations than holding a finger on a tiny target. Optional auto-stop on silence.

Push-to-Talk Option

Prefer the macOS-style hold? Flip a setting to switch the bubble to push-to-talk: hold to record, release to transcribe, drag up to cancel.

System-wide Text Insertion

An Accessibility Service inserts your transcript directly into the focused text field. Falls back to clipboard paste when the field doesn’t expose proper accessibility nodes.

AI Post-Processing

Optional Gemini pass cleans up filler words and punctuation without changing meaning — same prompt as the macOS app.

107 Languages

Dictate in any of 107 supported languages. Recently-used languages appear at the top of the picker for quick switching.

Live Waveform

A 30-bar live waveform with the same coral → purple → blue gradient as the macOS app, driven by the audio’s RMS level and refreshed at 60 Hz.

Bring Your Own Key

Your Groq (and optional Gemini) API key is stored in EncryptedSharedPreferences. Audio is sent directly to Groq — no Srizon servers in the middle.

How It Works

Three steps to hands-free typing, from any Android app

Tap to Start

Tap the floating bubble from any app to start recording. Hold instead if you’ve switched to push-to-talk mode.

Speak Naturally

A live 30-bar waveform reacts to your voice. Audio is captured at 16 kHz mono — Whisper’s native rate.

Tap Again to Send

Tap once more, or let auto-stop on silence end the recording. Whisper transcribes, Gemini optionally cleans up, and the Accessibility Service inserts the text at your cursor.
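
The 16 kHz mono format from step two can be reproduced outside the app, for example to prepare a test recording. This is a rough sketch that assumes ffmpeg is installed. The function name and file names are hypothetical, and the app itself presumably records in this format directly rather than converting afterwards.

```shell
# Sketch only: convert any audio file to 16 kHz, mono, 16-bit PCM WAV.
# to_whisper_wav is a hypothetical helper name.
to_whisper_wav() {
  input_file="$1"
  output_file="$2"
  # -ar sets the sample rate, -ac the channel count, -y overwrites the output.
  ffmpeg -y -i "$input_file" -ar 16000 -ac 1 -c:a pcm_s16le "$output_file"
}
```

For example, `to_whisper_wav memo.m4a memo.wav` would write a WAV file at the same sample rate and channel count the page describes.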

Install on Your Device

Two paths: grab the prebuilt APK (recommended), or build it yourself from source

Option A · Recommended

Install the prebuilt APK

1. Download the APK

Grab srizonvoice-1.1.0.apk (~46 MB) directly from GitHub Releases. You can download it on your phone, or on a computer and transfer it across.

2. Allow installs from your browser / file manager

On your phone, open Settings → Apps → Special access → Install unknown apps, pick the app you used to download the APK (Chrome, Files, Drive, etc.), and toggle Allow from this source.

3. Open the APK and install

Tap the downloaded file. Android will show a confirmation sheet — review it and tap Install. Already on an earlier SrizonVoice release? This installs as an in-place upgrade (same signing key), so no uninstall needed.

4. Walk through onboarding

Paste a Groq key (free tier at console.groq.com), optionally a Gemini key, then grant Microphone, Notifications, Display-over-other-apps, and Accessibility permissions when prompted.
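
If you would rather handle step 3 from a computer, the same in-place upgrade can be sketched with adb. This assumes USB debugging is enabled and that adb and apksigner (part of the Android SDK build-tools) are on your PATH. The function names are illustrative only.

```shell
# Sketch only: print the signing certificate of an APK so two releases
# can be compared by eye. show_signer is a hypothetical helper name.
show_signer() {
  apksigner verify --print-certs "$1"
}

# Sketch only: install over an existing release without uninstalling.
# -r tells adb to replace the installed app.
upgrade_in_place() {
  adb install -r "$1"
}
```

`show_signer srizonvoice-1.1.0.apk` prints the certificate details, and `upgrade_in_place srizonvoice-1.1.0.apk` should succeed over an earlier release as long as both were signed with the same key.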

or, for the curious

Option B · For developers

Build from source

1. Clone the repo

Clone github.com/AfzalH/voice-android or download the source as a ZIP.

2. Open in Android Studio

Android Studio Ladybug (2024.2.1) or later. It bundles JDK 17 and Gradle 8.10 and will generate the missing gradle-wrapper.jar on first sync.

3. Build a debug APK

Run ./gradlew assembleDebug — or use Build → Build Bundle(s) / APK(s) → Build APK(s) in the IDE.

4. Install over ADB

Enable Developer options → USB debugging, plug in a device running Android 12+ (API 31+), and run adb install app/build/outputs/apk/debug/app-debug.apk. Or just hit Run in Android Studio.

# quick start
git clone https://github.com/AfzalH/voice-android.git
cd voice-android
./gradlew assembleDebug
adb install app/build/outputs/apk/debug/app-debug.apk

Heads up: debug builds are signed with Android’s debug key, so they conflict with the release APK. You’ll need to uninstall one before installing the other.
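
Switching between the two builds could look roughly like the sketch below. The package ID is passed in as an argument because this page does not state it, so `com.example.placeholder` in the usage line is a stand-in, not the app’s real ID. The function name is also hypothetical.

```shell
# Sketch only: remove whichever build is installed, then install the given APK.
# switch_build is a hypothetical helper; the package ID is supplied by the caller.
switch_build() {
  package_id="$1"
  apk_path="$2"
  # Ignore the failure adb reports when the package is not installed yet.
  adb uninstall "$package_id" || true
  adb install "$apk_path"
}
```

Usage would be along the lines of `switch_build com.example.placeholder app/build/outputs/apk/debug/app-debug.apk`. Running `adb shell pm list packages -3` lists the third-party packages on the device, which is one way to find the real package ID. Note that uninstalling also removes the app’s stored settings, including saved API keys.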

Requirements

Device & Permissions

  • Android 12 (API 31) or later
  • Microphone permission
  • Notifications permission
  • Display over other apps (for the bubble)
  • Accessibility Service (for text insertion)

Build & API Keys

  • Android Studio Ladybug (2024.2.1) or later
  • Android SDK platform 35 + build-tools 35.0.0
  • Groq API key (required) — free tier available
  • Gemini API key (optional) — for post-processing
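
The SDK components listed above can also be installed from the command line instead of through Android Studio’s SDK Manager. This sketch assumes the Android SDK command-line tools are installed and sdkmanager is on your PATH. The function name is illustrative.

```shell
# Sketch only: fetch the platform and build-tools versions listed above.
# install_sdk_components is a hypothetical helper name.
install_sdk_components() {
  sdkmanager "platforms;android-35" "build-tools;35.0.0"
}
```

sdkmanager may prompt you to accept license agreements the first time it downloads a package.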

Privacy & Security

Audio is only captured during an active dictation session — from the tap that starts recording to the tap (or auto-stop) that ends it — and is sent directly from your device to Groq for transcription. If you enable post-processing, the transcript is then sent to Google Gemini. No data passes through Srizon servers, and your API key is stored in EncryptedSharedPreferences.

See the Groq Privacy Policy and the Google Privacy Policy for each AI provider’s data-handling terms.

Try It on Your Device

Download the APK, grant a few permissions, paste a Groq key — you’ll be dictating into any Android app inside two minutes.