Fix Chinese input and Apple ASR locale selection #254

Open
stego-z wants to merge 2 commits into altic-dev:main from stego-z:fix-chinese-input-and-apple-asr-locale

stego-z commented Apr 4, 2026

Summary

  • force CJK text insertion to use reliable paste for better IME compatibility
  • make Apple Speech and Apple Speech Analyzer respect the in-app language selection
  • split Chinese selection into Simplified Chinese and Traditional Chinese
  • improve SpeechAnalyzer locale fallback handling

Notes

  • local signing and package path changes were intentionally excluded from this PR

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c78766907

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

  }

- if model == .cohereTranscribeSixBit {
+ if model == .cohereTranscribeSixBit || model == .appleSpeech || model == .appleSpeechAnalyzer {

P1: Limit Apple Speech Analyzer language picker to supported locales

Including .appleSpeechAnalyzer in this shared picker exposes all CohereLanguage values, but the analyzer is documented in SettingsStore.SpeechModel.languageSupport as supporting only EN/ES/FR/DE/IT/JA/KO/PT/ZH. Selecting unsupported options (e.g., Arabic, Greek, Dutch, Polish, Vietnamese) is silently ignored because AppleSpeechAnalyzerProvider.resolvedRecognitionLocale() falls back to English, so users can believe they set a target language while getting English transcription instead. Restrict the menu for Apple Speech Analyzer to its supported locales (or surface an explicit validation/error) so selection reflects actual runtime behavior.
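
One way to act on this suggestion is to derive the picker contents from the model and to reject unsupported selections instead of falling back to English. The following is a minimal sketch, not the project's code: `DemoLanguage`, `DemoModel`, `pickerLanguages`, and `validatedLanguage` are hypothetical stand-ins for `CohereLanguage`, `SettingsStore.SpeechModel`, and whatever the app actually uses, and treating Cohere as supporting every case is an assumption made only for illustration.

```swift
// Hypothetical stand-in for the app's CohereLanguage enum.
enum DemoLanguage: String, CaseIterable {
    case english = "en", spanish = "es", french = "fr", german = "de"
    case italian = "it", japanese = "ja", korean = "ko", portuguese = "pt"
    case chinese = "zh", arabic = "ar", greek = "el", dutch = "nl"
    case polish = "pl", vietnamese = "vi"
}

// Hypothetical stand-in for the app's speech model enum.
enum DemoModel {
    case cohere
    case appleSpeechAnalyzer

    /// Languages the picker should offer for this model.
    var pickerLanguages: [DemoLanguage] {
        switch self {
        case .cohere:
            // Assumption: the shared picker shows every language for Cohere.
            return DemoLanguage.allCases
        case .appleSpeechAnalyzer:
            // EN/ES/FR/DE/IT/JA/KO/PT/ZH, as listed in the review comment.
            let supported: Set<DemoLanguage> = [
                .english, .spanish, .french, .german, .italian,
                .japanese, .korean, .portuguese, .chinese,
            ]
            return DemoLanguage.allCases.filter { supported.contains($0) }
        }
    }
}

/// Returns nil for an unsupported selection so the caller can surface an
/// explicit error rather than silently transcribing in English.
func validatedLanguage(_ language: DemoLanguage, for model: DemoModel) -> DemoLanguage? {
    model.pickerLanguages.contains(language) ? language : nil
}
```

With this shape, the settings view would iterate over `model.pickerLanguages`, and the provider would treat a `nil` result as a validation failure.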

Useful? React with 👍 / 👎.

@altic-dev (Owner)

@codex can we merge this PR?

@chatgpt-codex-connector

To use Codex here, create an environment for this repo.

@altic-dev (Owner)

Hey, thanks for the PR. Were these changes intentional? Would love your comments here.

What it should do ideally:

  • If the user is using Chinese, recommend Whisper.
  • If the user uses Apple Speech / Apple Speech Analyzer / Cohere, use the app’s chosen language.
  • If the dictated text contains CJK characters, force reliable paste for insertion.

What it is doing instead:

  • It is not just recommending Whisper.
  • It changes the actual returned selected model if macOS preferred languages contain Chinese.
  • So a user may have chosen Parakeet, but the app can silently behave as if Whisper is selected.
  • In onboarding, even if the user picks English only or Multiple languages, the code can still push Whisper first just because Chinese exists in macOS language preferences.
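
The distinction between recommending and overriding can be sketched by keeping the two values separate. This is only an illustration under stated assumptions: `TranscriptionModel`, `ModelChoice`, and `resolveModel` are hypothetical names, and the hint is keyed off the app's chosen language code rather than macOS preferences.

```swift
// Hypothetical model list; the real app has more cases.
enum TranscriptionModel: String {
    case parakeet
    case whisperBase
}

struct ModelChoice {
    /// What the user picked. Never overridden by the recommendation.
    let selected: TranscriptionModel
    /// Optional hint for the UI to display; nil when there is nothing to suggest.
    let recommended: TranscriptionModel?
}

/// Keeps the user's selection intact and only attaches a Whisper hint
/// when the app language (not the macOS preference list) is Chinese.
func resolveModel(userSelection: TranscriptionModel, appLanguageCode: String) -> ModelChoice {
    let isChinese = appLanguageCode.lowercased().hasPrefix("zh")
    let hint: TranscriptionModel? =
        (isChinese && userSelection != .whisperBase) ? .whisperBase : nil
    return ModelChoice(selected: userSelection, recommended: hint)
}
```

A user who chose Parakeet keeps Parakeet as `selected`; the UI can decide whether and how to show `recommended`.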

I am down for the paste method and for splitting the Chinese variants, but the model recommendation seems too biased. Do you have a model in mind for Chinese? I can try to add it: Qwen ASR, or anything else? Thanks.
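
For the paste rule, the check on dictated text could look roughly like the sketch below. `containsCJK` is a hypothetical helper, not the PR's implementation, and the set of Unicode blocks it covers is an assumption (the most common ones rather than every CJK-related range).

```swift
/// Returns true if the text contains at least one scalar from a common
/// CJK block. The block list is a deliberate subset, not exhaustive.
func containsCJK(_ text: String) -> Bool {
    text.unicodeScalars.contains { scalar in
        switch scalar.value {
        case 0x3040...0x30FF,    // Hiragana and Katakana
             0x3400...0x4DBF,    // CJK Unified Ideographs Extension A
             0x4E00...0x9FFF,    // CJK Unified Ideographs
             0xAC00...0xD7AF,    // Hangul Syllables
             0xF900...0xFAFF,    // CJK Compatibility Ideographs
             0x20000...0x2A6DF:  // CJK Unified Ideographs Extension B
            return true
        default:
            return false
        }
    }
}
```

The insertion path would then pick reliable paste whenever `containsCJK(transcript)` is true and keep its usual behavior otherwise.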

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be764f239d

enum SpeechLocaleResolver {
    static var prefersChineseRecognition: Bool {
        Locale.preferredLanguages.contains { Self.languageCode(from: $0) == "zh" }

P2: Base Chinese-default detection on primary locale

prefersChineseRecognition now returns true if any preferred language starts with zh, and SpeechModel.defaultModel uses this flag to switch Apple Silicon users to Whisper Base. This means users whose primary language is not Chinese but who keep Chinese as a secondary fallback will still be treated as “Chinese-first” and get a different default model path than intended. Restrict this check to the highest-priority preferred locale (or an explicit Chinese selection) so secondary language fallbacks do not silently change onboarding/default model behavior.
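
A minimal sketch of the suggested restriction, assuming the check should look only at the first entry of the preferred-language list. `languageCode(from:)` here is a hypothetical free function standing in for `SpeechLocaleResolver.languageCode(from:)`, whose real implementation is not shown in this thread; `primaryLanguageIsChinese` is likewise an illustrative name.

```swift
import Foundation

/// Hypothetical helper: extracts the lowercase language subtag from an
/// identifier such as "zh-Hans-CN" or "en_US".
func languageCode(from identifier: String) -> String? {
    identifier
        .split(whereSeparator: { $0 == "-" || $0 == "_" })
        .first
        .map { $0.lowercased() }
}

/// True only when the highest-priority preferred language is Chinese.
/// Secondary fallbacks further down the list are ignored.
func primaryLanguageIsChinese(
    _ preferredLanguages: [String] = Locale.preferredLanguages
) -> Bool {
    guard let primary = preferredLanguages.first else { return false }
    return languageCode(from: primary) == "zh"
}
```

Passing the list as a parameter (defaulting to `Locale.preferredLanguages`) keeps the check easy to exercise with fixed inputs.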

Useful? React with 👍 / 👎.

Comment on lines +3252 to +3253
if self.defaults.string(forKey: Keys.selectedCohereLanguage) == "zh" {
    return .traditionalChinese

P2: Migrate legacy zh setting without forcing zh-TW

The migration path for older installs maps stored SelectedCohereLanguage == "zh" directly to .traditionalChinese. Because SpeechLocaleResolver.locale(for:) maps that to zh-TW, existing users who previously chose generic Mandarin are now forced onto the Taiwan locale when using Apple speech models, which can degrade recognition for users expecting mainland/Simplified behavior. The migration should preserve neutrality (or infer from system Chinese locale) instead of hard-coding Traditional Chinese.
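
The "infer from system Chinese locale" option might be sketched as follows. Everything here is hypothetical: `ChineseVariant` stands in for the app's Simplified/Traditional cases, `migratedVariant` is an illustrative name, and falling back to Simplified when no Chinese system locale exists is an assumption of this sketch, not something the review prescribes.

```swift
// Hypothetical stand-in for the app's two Chinese language cases.
enum ChineseVariant {
    case simplified
    case traditional
}

/// Splits an identifier such as "zh-Hant-TW" into lowercase subtags.
func subtags(of identifier: String) -> [String] {
    identifier
        .lowercased()
        .split(whereSeparator: { $0 == "-" || $0 == "_" })
        .map { String($0) }
}

/// Chooses a variant for a legacy "zh" setting from the first Chinese
/// entry in the system's preferred languages.
func migratedVariant(preferredLanguages: [String]) -> ChineseVariant {
    let chinese = preferredLanguages
        .map(subtags(of:))
        .first(where: { $0.first == "zh" })
    // Assumption: with no Chinese system locale, fall back to Simplified.
    guard let parts = chinese else { return .simplified }

    // An explicit script subtag wins over the region.
    if parts.contains("hans") { return .simplified }
    if parts.contains("hant") { return .traditional }

    // Otherwise infer from regions that conventionally use Traditional.
    let traditionalRegions: Set<String> = ["tw", "hk", "mo"]
    return traditionalRegions.isDisjoint(with: parts) ? .simplified : .traditional
}
```

For example, a preference list of `["en-US", "zh-Hant-TW"]` would migrate to Traditional, while `["zh-Hans-CN"]` or `["zh-CN"]` would migrate to Simplified.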

Useful? React with 👍 / 👎.

stego-z (Author) commented Apr 5, 2026 via email
