feat: long press voice button to pick audio file from device by cookie223 · Pull Request #5 · memex-lab/memex

cookie223 · 2026-04-10T20:03:27Z

Summary

Adds a long-press gesture to the mic/voice button in the input sheet
Long press opens a file picker to select an existing audio file from device storage
Supported formats: m4a, mp3, wav, ogg
Adds i18n strings for English and Chinese (error messages and audio label)
Shows a snackbar if the user tries to pick a file while recording is active

How it works

Gesture	Action
Tap	Start/stop recording (existing behavior)
Long press	Open file picker to select an audio file
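
The gesture table and the recording guard from the summary can be sketched as a small decision function. This is an illustrative sketch only, written in Python for brevity; it is not the code in this PR, and the names `resolve_gesture`, `is_supported_audio`, and `Action` are hypothetical. It assumes that a long press during recording only shows the snackbar and that file extensions are matched case-insensitively, neither of which is stated above.

```python
from enum import Enum
from pathlib import PurePath

# Formats listed in the PR summary.
SUPPORTED_EXTENSIONS = {"m4a", "mp3", "wav", "ogg"}


class Action(Enum):
    """Hypothetical actions the voice button can trigger."""
    TOGGLE_RECORDING = "toggle_recording"
    OPEN_FILE_PICKER = "open_file_picker"
    SHOW_BUSY_SNACKBAR = "show_busy_snackbar"


def resolve_gesture(gesture: str, is_recording: bool) -> Action:
    """Map a gesture on the voice button to an action (hypothetical helper)."""
    if gesture == "tap":
        # Existing behavior: tap starts or stops recording.
        return Action.TOGGLE_RECORDING
    if gesture == "long_press":
        # Assumption: picking a file is blocked while a recording is active.
        if is_recording:
            return Action.SHOW_BUSY_SNACKBAR
        return Action.OPEN_FILE_PICKER
    raise ValueError(f"unknown gesture: {gesture!r}")


def is_supported_audio(filename: str) -> bool:
    """Check the extension against the supported formats (case-insensitive)."""
    suffix = PurePath(filename).suffix  # includes the leading dot, e.g. ".M4A"
    return suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS
```

Keeping the decision separate from the UI callback makes the recording guard easy to unit-test without a device.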

Testing

Tested on a real Android device (Samsung SM F966U1, Android 16)
Not verified on iOS: the file_picker package supports iOS, but the author has not confirmed the behavior there. Please test on iOS.

Notes

This uses the existing file_picker dependency. No new packages were added.

🤖 Generated with Claude Code

Adds long press gesture on the mic/voice button in the input sheet to open a file picker and select an existing audio file (m4a, mp3, wav, ogg) instead of recording a new one.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

tigerlaibao · 2026-04-11T08:01:48Z

Thanks a lot for your contribution to the project!

I’m currently working on a feature to support a local speech-to-text (STT) model, specifically SenseVoiceSmall. The reason for this is that most LLMs don't natively support direct audio input. By using a standalone local model, we can handle transcription without requiring additional API keys.

This feature might have some code conflicts with your PR. To help me test if the local model can handle your use case effectively, could you let me know the typical size of the audio files you usually upload?

Really appreciate you helping out with the development!

cookie223 · 2026-04-11T12:24:32Z

For my own use case, it is usually audio recorded with a voice memo or voice recorder app. The files won't be very large, at most a few MB.
Would it be feasible to make the model for audio input selectable in the settings, like the media processing model? I would much prefer to use Gemini, which is natively multimodal, and the free API key is enough for daily use.

cookie223 wants to merge 1 commit into memex-lab:main from cookie223:feat/long-press-audio-picker
