Conversation
📝 Walkthrough

Adds Voice.ai Text-to-Speech support: a new VoiceAIProvider with streaming TTS, UI for provider configuration and voice calls, audio message handling and a player, provider registration, i18n entries, and build/test config updates for custom Vue elements.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UI as ChatInput / VoiceCallWidget
    participant Provider as VoiceAIProvider
    participant API as Voice.ai API
    participant Store as Chat Store / Renderer
    UI->>Provider: request speech (coreStream / generateSpeech)
    Provider->>Provider: extract user text, build TTS request
    Provider->>API: POST /api/tts (headers, body, proxy if set)
    API-->>Provider: audio response (base64 / data / URL)
    Provider->>Store: emit streaming event (imageData / audio)
    Store->>Store: create AssistantMessageBlock type:'audio'
    Store->>UI: render audio block
    UI->>UI: MessageBlockAudio plays audio
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks: ✅ Passed checks (3 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches: 🧪 Generate unit tests (beta)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts`:
- Around line 91-141: The methods completions, summaries, and generateText call
generateSpeech() but discard its {audioBase64, mimeType} result and only return
an LLMResponse with text; remove the unnecessary generateSpeech() calls to avoid
wasteful TTS API usage. Update the implementations of completions, summaries,
and generateText in voiceAIProvider.ts to only validate input, skip calling
generateSpeech(), and return the appropriate text (e.g., text,
buildShortTitle(text), or prompt) as before; keep generateSpeech() and
coreStream() for the streaming/audio paths if needed.
In `@src/renderer/src/components/message/MessageBlockAudio.vue`:
- Line 64: audioError is initialized but never updated, so the template error
message never appears; wire the component's audio element error handling to set
audioError.value = true on failure and clear it on successful load. Add an error
event listener on the audio DOM ref (or set `@error` on the <audio> element) to
set audioError.value = true, set audioError.value = false in the audio load/play
success handlers (e.g., canplay, play or your existing load/play functions), and
remove listeners in onBeforeUnmount; locate the const audioError = ref(false)
and the audio element/ref handlers to implement these changes.
In `@src/renderer/src/i18n/en-US/settings.json`:
- Around line 496-527: Add the missing "voiceai" translations (same keys as in
en-US: voiceai.title, voiceai.description, voiceai.audioFormat.label,
voiceai.audioFormat.placeholder, voiceai.audioFormat.helper,
voiceai.language.label, voiceai.language.placeholder, voiceai.language.helper,
voiceai.model.label, voiceai.model.placeholder, voiceai.model.helper,
voiceai.agentId.label, voiceai.agentId.placeholder, voiceai.agentId.helper,
voiceai.temperature.label, voiceai.temperature.helper, voiceai.topP.label,
voiceai.topP.helper) into each of the remaining locale JSON files (da-DK, fa-IR,
fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-HK, zh-TW) using the en-US values
as the baseline, preserving exact key names and JSON structure so all 12 locales
contain an identical "voiceai" section.
In `@src/renderer/src/i18n/zh-CN/chat.json`:
- Around line 101-111: The new translation keys audio.play, audio.pause and
call.title, call.description, call.start, call.stop, call.end were added to
zh-CN/chat.json but are missing from the other locale chat files; add those
exact keys (with appropriate translated strings) to each of the remaining locale
files: da-DK, fa-IR, fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-HK and zh-TW
so all 12 locales have the same key structure under the "audio" and "call"
objects (preserve key names and JSON structure).
In `@src/renderer/src/i18n/zh-CN/settings.json`:
- Around line 397-428: Add the missing provider.voiceai translation block (same
key names as the existing "voiceai" object: title, description,
audioFormat.{label,placeholder,helper}, language.{label,placeholder,helper},
model.{label,placeholder,helper}, agentId.{label,placeholder,helper},
temperature.{label,helper}, topP.{label,helper}) to each of the 10 locale files:
da-DK, fa-IR, fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-HK, zh-TW, keeping
the exact structure used in the current zh-CN snippet and providing appropriate
localized strings for each field (or accurate fallbacks if a translation is
unavailable). Ensure keys are placed under the same parent namespace
(provider.voiceai or top-level "voiceai" depending on project convention), match
spelling/casing of existing keys (e.g., "audioFormat", "agentId"), and run the
linter/locale validation to confirm no missing or extra keys.
🧹 Nitpick comments (9)
electron.vite.config.ts (1)
10-11: Consider extracting `isCustomElement` to a shared location.

The `isCustomElement` helper is duplicated in `vitest.config.renderer.ts`. If more custom elements are added later, both files need updates.

♻️ Optional: Extract to shared config

Create a shared file (e.g., `config/vue-custom-elements.ts`):

```ts
export const isCustomElement = (tag: string) =>
  tag === 'voice-agent-widget' || tag.startsWith('ui-resource-renderer')
```

Then import it in both config files.
src/renderer/src/components/message/MessageItemAssistant.vue (1)
173-187: Potential false positive in audio extension detection.

The check `lower.includes(ext)` on line 184 may produce false positives. For example, a URL like `https://example.com/mp3player/image.jpg` would incorrectly match `.mp3` because the string contains "mp3". Consider using a more precise check that verifies the extension appears at the end of the URL path:

♻️ Proposed fix for more precise extension matching

```diff
 const isAudioBlock = (block: AssistantMessageBlock): boolean => {
   if (block.type === 'audio') return true
   if (block.type !== 'image') return false
   const mimeType = block.image_data?.mimeType?.toLowerCase() || ''
   if (mimeType.startsWith('audio/')) return true
   const data = block.image_data?.data || ''
   if (data.startsWith('data:audio/')) return true
   if (data.startsWith('imgcache://') || data.startsWith('http://') || data.startsWith('https://')) {
     const lower = data.toLowerCase()
-    return AUDIO_EXTENSIONS.some((ext) => lower.includes(ext))
+    // Extract path and check if it ends with an audio extension
+    try {
+      const url = new URL(lower.replace('imgcache://', 'https://'))
+      const pathname = url.pathname
+      return AUDIO_EXTENSIONS.some((ext) => pathname.endsWith(ext))
+    } catch {
+      return AUDIO_EXTENSIONS.some((ext) => lower.endsWith(ext))
+    }
   }
   return false
 }
```

src/renderer/settings/components/VoiceAIProviderConfig.vue (2)
158-160: Define an explicit TypeScript interface for component props.

The props are typed with an inline type literal. Consider declaring a named interface instead, per the coding guidelines.

♻️ Suggested improvement

```diff
+interface Props {
+  provider: LLM_PROVIDER
+}
+
-defineProps<{
-  provider: LLM_PROVIDER
-}>()
+defineProps<Props>()
```
216-248: Consider consolidating individual watchers.

While the current approach with individual watchers is clear, you could use a single watcher with an object to reduce repetition. This is optional since the current implementation is readable and maintainable.
src/renderer/src/components/chat-input/VoiceCallWidget.vue (2)
79-88: Consider removing unused refs.

The `callWidgetPulse` ref (line 85) and `voiceWidgetReady` ref (line 83) are updated but never read in the template or logic. Similarly, `voiceWidgetLoading` (line 84) and `voiceWidget` (line 82) are declared but not used meaningfully.

♻️ Potential cleanup

If these refs are intended for future use, consider adding a TODO comment. Otherwise, remove them to reduce dead code:

```diff
-const voiceWidget = ref<HTMLElement | null>(null)
-const voiceWidgetReady = ref(false)
-const voiceWidgetLoading = ref(false)
-const callWidgetPulse = ref(false)
-let callWidgetPulseTimer: ReturnType<typeof setTimeout> | null = null
```

The `voiceWidgetScriptPromise` is correctly used for deduplicating script load requests.
205-209: Timer cleanup on unmount is correct but incomplete.

The `callWidgetPulseTimer` cleanup is good, but `callWidgetPulseTimer` is never actually set anywhere in the code. If this timer was removed, the cleanup can also be removed.

src/renderer/src/components/message/MessageBlockAudio.vue (1)
44-51: Unusual i18n fallback pattern: consider alignment with project conventions.

The try/catch wrapper around `useI18n()` is atypical. If this component is intended to work outside the i18n context, consider using a more explicit approach, or ensure the component is always rendered within a proper i18n provider.

♻️ Alternative approach

If the fallback is intentional for edge cases:

```ts
import { inject } from 'vue'

// Use inject to check for i18n availability more gracefully
const i18n = inject('i18n', null)
const t = i18n ? useI18n().t : (key: string) => keyMap[key] || key
```

Or simply ensure this component is always used within a proper Vue app context with i18n configured.
src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts (2)
185-197: Consider clarifying `contextLength` and `maxTokens` for TTS models.

These fields (`contextLength: 4096`, `maxTokens: 2048`) are likely required by the `MODEL_META` interface but are not meaningful for TTS providers. Consider adding a comment explaining these are placeholder values to satisfy the interface, or using more appropriate values (e.g., character limits for TTS input).
161-164: Consider renaming the event to reflect its broader media purpose, or create a dedicated `audioData` event.

The code uses `createStreamEvent.imageData()` to emit audio data. While functionally correct due to generic field names (`data`, `mimeType`), the naming suggests the event is exclusively for images. Either rename `ImageDataStreamEvent` to `MediaDataStreamEvent` or `BinaryDataStreamEvent` to clarify it handles multiple media types, or add a dedicated `audioData` event type to the event interface for semantic clarity.
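As a sketch of the dedicated-event option: the real stream-event interface is not shown in this diff, so the `type` literal, field names, and factory name below are assumptions mirroring the generic `data`/`mimeType` fields mentioned above.

```typescript
// Hypothetical dedicated audio event alongside the existing image event.
// The 'audio_data' literal and field names are illustrative assumptions.
interface AudioDataStreamEvent {
  type: 'audio_data'
  data: string // base64-encoded audio payload
  mimeType: string // e.g. 'audio/mpeg'
}

// Factory in the style of createStreamEvent.imageData(), as a sketch.
function audioData(data: string, mimeType: string): AudioDataStreamEvent {
  return { type: 'audio_data', data, mimeType }
}

const evt = audioData('SGVsbG8=', 'audio/mpeg')
console.log(evt.type, evt.mimeType) // audio_data audio/mpeg
```

A dedicated event like this would let renderers branch on `type` instead of sniffing `mimeType` inside an image event.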
```ts
public async completions(
  messages: ChatMessage[],
  modelId: string,
  temperature?: number,
  _maxTokens?: number
): Promise<LLMResponse> {
  const text = this.extractLatestUserText(messages)
  if (!text) {
    throw new Error('No user text provided for Voice.ai TTS')
  }

  await this.generateSpeech(text, modelId, temperature)

  return {
    content: text
  }
}

public async summaries(
  text: string,
  modelId: string,
  temperature?: number,
  _maxTokens?: number
): Promise<LLMResponse> {
  if (!text) {
    throw new Error('No text provided for Voice.ai TTS')
  }

  await this.generateSpeech(text, modelId, temperature)

  return {
    content: this.buildShortTitle(text)
  }
}

public async generateText(
  prompt: string,
  modelId: string,
  temperature?: number,
  _maxTokens?: number
): Promise<LLMResponse> {
  if (!prompt) {
    throw new Error('No prompt provided for Voice.ai TTS')
  }

  await this.generateSpeech(prompt, modelId, temperature)

  return {
    content: prompt
  }
}
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Search for LLMResponse type definition
rg "type\s+LLMResponse\s*=" --type=ts -A 5
rg "interface\s+LLMResponse" --type=ts -A 5
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 529

🏁 Script executed:

```bash
#!/bin/bash
# Find the voiceAIProvider file and check generateSpeech implementation
find . -name "voiceAIProvider.ts" -type f
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 135

🏁 Script executed:

```bash
#!/bin/bash
# Search for generateSpeech method definition
rg "generateSpeech" --type=ts -B 2 -A 10 src/main/presenter/llmProviderPresenter/providers/
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 6033

🏁 Script executed:

```bash
#!/bin/bash
# Check other LLM provider implementations to understand the pattern
ls -la src/main/presenter/llmProviderPresenter/providers/
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 2551

🏁 Script executed:

```bash
#!/bin/bash
# Check if LLMResponse type extends or if there are variants
rg "LLMResponse" --type=ts src/shared/types/ -B 2 -A 8
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 1655

🏁 Script executed:

```bash
#!/bin/bash
# Check base class or interface that voiceAIProvider implements
rg "class VoiceAIProvider\|extends\|implements" --type=ts src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts -A 2
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 47

🏁 Script executed:

```bash
#!/bin/bash
# Check voiceAIProvider class definition
head -20 src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 631

🏁 Script executed:

```bash
#!/bin/bash
# Check if these methods are required by an interface or base class
rg "completions\|summaries\|generateText" --type=ts src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts -B 1 | head -30
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 47

🏁 Script executed:

```bash
#!/bin/bash
# Check BaseLLMProvider to see what methods are required
rg "class BaseLLMProvider|abstract\|completions\|summaries\|generateText" --type=ts src/main/presenter/llmProviderPresenter/baseProvider.ts -B 1 -A 5
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 365

🏁 Script executed:

```bash
#!/bin/bash
# Look for how non-stream methods work in other providers
rg "public async completions" --type=ts src/main/presenter/llmProviderPresenter/providers/ -A 8 | head -50
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 2750

🏁 Script executed:

```bash
#!/bin/bash
# Check if voiceAIProvider should even have these non-streaming methods
# Look at summaries and generateText in other providers
rg "public async summaries|public async generateText" --type=ts src/main/presenter/llmProviderPresenter/providers/ -l
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 266

🏁 Script executed:

```bash
#!/bin/bash
# Check the full voiceAIProvider to understand design intent
wc -l src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts
cat -n src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts | head -100
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 3224

🏁 Script executed:

```bash
#!/bin/bash
# Check if these methods are abstract/required in BaseLLMProvider
ast-grep --pattern $'abstract class BaseLLMProvider {
  $$$
  abstract completions($_, $_): $_
  $$$
}'
rg "abstract.*completions|abstract.*summaries|abstract.*generateText" --type=ts src/main/presenter/llmProviderPresenter/baseProvider.ts
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 135

🏁 Script executed:

```bash
#!/bin/bash
# Check the entirety of voiceAIProvider to see full implementation intent
sed -n '91,165p' src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 1807
Remove unnecessary generateSpeech() calls from completions, summaries, and generateText methods.
These methods call generateSpeech() but discard the returned audio data { audioBase64, mimeType }, then return only text content. The LLMResponse type cannot hold audio data—only coreStream() properly delivers audio via streaming events. These methods are abstract requirements from BaseLLMProvider but should not generate speech if they cannot return it. Either remove the generateSpeech() calls to avoid wasteful API calls, or reconsider whether this TTS provider should implement these text-only methods.
🤖 Prompt for AI Agents
In `@src/main/presenter/llmProviderPresenter/providers/voiceAIProvider.ts` around
lines 91 - 141, The methods completions, summaries, and generateText call
generateSpeech() but discard its {audioBase64, mimeType} result and only return
an LLMResponse with text; remove the unnecessary generateSpeech() calls to avoid
wasteful TTS API usage. Update the implementations of completions, summaries,
and generateText in voiceAIProvider.ts to only validate input, skip calling
generateSpeech(), and return the appropriate text (e.g., text,
buildShortTitle(text), or prompt) as before; keep generateSpeech() and
coreStream() for the streaming/audio paths if needed.
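If the prompt's advice is taken, the text-only paths reduce to validation plus a plain return. A minimal runnable sketch of one of them, `summaries`, follows; the `buildShortTitle` body here is a hypothetical stand-in, since the provider's real private helper is not shown in this diff.

```typescript
// Hypothetical stand-in for the provider's private buildShortTitle helper.
function buildShortTitle(text: string, maxLen = 40): string {
  const normalized = text.trim().replace(/\s+/g, ' ')
  return normalized.length <= maxLen ? normalized : normalized.slice(0, maxLen) + '...'
}

// Text-only path: validate input and return text without touching the TTS API.
async function summaries(text: string): Promise<{ content: string }> {
  if (!text) {
    throw new Error('No text provided for Voice.ai TTS')
  }
  // Note: no generateSpeech() call here; audio is delivered via coreStream() only.
  return { content: buildShortTitle(text) }
}

summaries('hello world').then((r) => console.log(r.content)) // hello world
```

The same shape applies to `completions` (return the extracted user text) and `generateText` (return the prompt).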
```ts
  mimeType?: string
}

const audioError = ref(false)
```
audioError ref is declared but never set.
The audioError ref is used in the template (line 19) to display an error message, but it's never set to true anywhere in the component logic. This means the error message will never be displayed.
🐛 Suggested fix - add error handling to audio element
Either remove the unused error handling, or implement it properly:
```diff
+const handleAudioError = () => {
+  audioError.value = true
+}
```

And in the template:

```diff
-<audio :src="audioSrc" controls class="w-full" />
+<audio :src="audioSrc" controls class="w-full" @error="handleAudioError" />
```

🤖 Prompt for AI Agents
+<audio :src="audioSrc" controls class="w-full" `@error`="handleAudioError" />🤖 Prompt for AI Agents
In `@src/renderer/src/components/message/MessageBlockAudio.vue` at line 64,
audioError is initialized but never updated, so the template error message never
appears; wire the component's audio element error handling to set
audioError.value = true on failure and clear it on successful load. Add an error
event listener on the audio DOM ref (or set `@error` on the <audio> element) to
set audioError.value = true, set audioError.value = false in the audio load/play
success handlers (e.g., canplay, play or your existing load/play functions), and
remove listeners in onBeforeUnmount; locate the const audioError = ref(false)
and the audio element/ref handlers to implement these changes.
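A framework-free sketch of that flag wiring, runnable without a DOM; the names are illustrative, and in the component this corresponds to `@error`/`@canplay` handlers on the `<audio>` element updating the existing `audioError` ref.

```typescript
// Minimal event-target shape so the sketch runs without a real <audio> element.
interface Listenable {
  addEventListener(type: string, listener: () => void): void
}

// Set the flag on 'error'; clear it once the audio can play again.
function wireAudioErrorFlag(el: Listenable): { audioError: boolean } {
  const state = { audioError: false }
  el.addEventListener('error', () => {
    state.audioError = true
  })
  el.addEventListener('canplay', () => {
    state.audioError = false
  })
  return state
}
```

In the component itself the two handlers would assign `audioError.value`, and `onBeforeUnmount` would remove the listeners if they were attached imperatively rather than via template bindings.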
```json
"voiceai": {
  "title": "Voice.ai Text-to-Speech",
  "description": "Generate speech from text. Voices appear in the model list below.",
  "audioFormat": {
    "label": "Audio Format",
    "placeholder": "Select format",
    "helper": "MP3 is recommended for most cases."
  },
  "language": {
    "label": "Language",
    "placeholder": "e.g. en",
    "helper": "Supported: en, ca, sv, es, fr, de, it, pt, pl, ru, nl."
  },
  "model": {
    "label": "TTS Model",
    "placeholder": "voiceai-tts-v1-latest",
    "helper": "See the Voice.ai docs for supported models."
  },
  "agentId": {
    "label": "Voice Agent ID",
    "placeholder": "Paste voice agent ID",
    "helper": "Required for real-time voice calls."
  },
  "temperature": {
    "label": "Temperature",
    "helper": "Controls randomness (0-2)."
  },
  "topP": {
    "label": "Top P",
    "helper": "Nucleus sampling (0-1)."
  }
},
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Verify voiceai keys exist in all locale files
echo "Checking voiceai keys in all language files:"
echo ""
for locale in da-DK en-US fa-IR fr-FR he-IL ja-JP ko-KR pt-BR ru-RU zh-CN zh-HK zh-TW; do
  file="src/renderer/src/i18n/$locale/settings.json"
  if [ -f "$file" ]; then
    if grep -q '"voiceai"' "$file"; then
      echo "✓ $locale: voiceai keys present"
    else
      echo "✗ $locale: voiceai keys MISSING"
    fi
  else
    echo "? $locale: file not found"
  fi
done
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 470
Add voiceai translations to all remaining language files.
The voiceai section is only present in en-US and zh-CN. According to i18n guidelines, translations must be added to ALL 12 language files: da-DK, fa-IR, fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-HK, and zh-TW. Use en-US as the reference and maintain consistent key names across all locales.
🤖 Prompt for AI Agents
In `@src/renderer/src/i18n/en-US/settings.json` around lines 496 - 527, Add the
missing "voiceai" translations (same keys as in en-US: voiceai.title,
voiceai.description, voiceai.audioFormat.label, voiceai.audioFormat.placeholder,
voiceai.audioFormat.helper, voiceai.language.label,
voiceai.language.placeholder, voiceai.language.helper, voiceai.model.label,
voiceai.model.placeholder, voiceai.model.helper, voiceai.agentId.label,
voiceai.agentId.placeholder, voiceai.agentId.helper, voiceai.temperature.label,
voiceai.temperature.helper, voiceai.topP.label, voiceai.topP.helper) into each
of the remaining locale JSON files (da-DK, fa-IR, fr-FR, he-IL, ja-JP, ko-KR,
pt-BR, ru-RU, zh-HK, zh-TW) using the en-US values as the baseline, preserving
exact key names and JSON structure so all 12 locales contain an identical
"voiceai" section.
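The grep-based coverage checks used in this thread can also be expressed as a small pure helper; this is a sketch only, with hypothetical sample data standing in for the parsed locale files.

```typescript
// Map of locale name -> parsed settings.json content.
type LocaleMap = Record<string, Record<string, unknown>>

// Return the locales missing a top-level section such as "voiceai".
function localesMissingSection(locales: LocaleMap, section: string): string[] {
  return Object.keys(locales)
    .filter((locale) => !(section in locales[locale]))
    .sort()
}

// Hypothetical sample: in the real check these would be the 12 parsed files.
const sample: LocaleMap = {
  'en-US': { voiceai: { title: 'Voice.ai Text-to-Speech' } },
  'zh-CN': { voiceai: { title: 'Voice.ai 文字转语音' } },
  'fr-FR': {},
  'ja-JP': {}
}
console.log(localesMissingSection(sample, 'voiceai')) // [ 'fr-FR', 'ja-JP' ]
```

Wiring this into a lint step would catch a missing section across all 12 locales before review, rather than during it.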
```json
"audio": {
  "play": "播放",
  "pause": "暂停"
},
"call": {
  "title": "Voice.ai 通话",
  "description": "点击连接 Voice.ai 语音代理,接通电话后说话即可。",
  "start": "开始语音通话",
  "stop": "结束通话",
  "end": "挂断"
},
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check which locale chat.json files have the audio and call keys
locales=(da-DK en-US fa-IR fr-FR he-IL ja-JP ko-KR pt-BR ru-RU zh-CN zh-HK zh-TW)
echo "=== Checking for 'audio' and 'call' keys in chat.json files ==="
echo ""
for locale in "${locales[@]}"; do
  file="src/renderer/src/i18n/$locale/chat.json"
  if [ -f "$file" ]; then
    has_audio=$(grep -q '"audio"' "$file" && echo "✓" || echo "✗")
    has_call=$(grep -q '"call"' "$file" && echo "✓" || echo "✗")
    echo "$locale: audio=$has_audio, call=$has_call"
  else
    echo "$locale: FILE NOT FOUND"
  fi
done
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 405
Add audio and call translations to all remaining locale files.
The new keys added to zh-CN/chat.json are missing from 10 other locale files (da-DK, fa-IR, fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-HK, zh-TW). Per guidelines, all new translations must be added to all 12 language files with consistent key names.
🤖 Prompt for AI Agents
In `@src/renderer/src/i18n/zh-CN/chat.json` around lines 101 - 111, The new
translation keys audio.play, audio.pause and call.title, call.description,
call.start, call.stop, call.end were added to zh-CN/chat.json but are missing
from the other locale chat files; add those exact keys (with appropriate
translated strings) to each of the remaining locale files: da-DK, fa-IR, fr-FR,
he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-HK and zh-TW so all 12 locales have the
same key structure under the "audio" and "call" objects (preserve key names and
JSON structure).
```json
"voiceai": {
  "title": "Voice.ai 文字转语音",
  "description": "将文本生成语音,voice 会在下方模型列表中展示。",
  "audioFormat": {
    "label": "音频格式",
    "placeholder": "选择格式",
    "helper": "多数场景推荐 MP3。"
  },
  "language": {
    "label": "语言",
    "placeholder": "例如 en",
    "helper": "支持语言:en, ca, sv, es, fr, de, it, pt, pl, ru, nl。"
  },
  "model": {
    "label": "TTS 模型",
    "placeholder": "voiceai-tts-v1-latest",
    "helper": "可选模型请查看 Voice.ai 文档。"
  },
  "agentId": {
    "label": "语音 Agent ID",
    "placeholder": "粘贴 voice agent id",
    "helper": "实时通话需要此 ID。"
  },
  "temperature": {
    "label": "温度",
    "helper": "随机性参数(0-2)。"
  },
  "topP": {
    "label": "Top P",
    "helper": "Nucleus 采样(0-1)。"
  }
},
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check if voiceai translation keys exist in all locale files
echo "=== Checking voiceai translation keys across all locales ==="
for locale in da-DK en-US fa-IR fr-FR he-IL ja-JP ko-KR pt-BR ru-RU zh-CN zh-HK zh-TW; do
  settings_file="src/renderer/src/i18n/${locale}/settings.json"
  if [ -f "$settings_file" ]; then
    if grep -q '"voiceai"' "$settings_file"; then
      echo "✓ $locale: voiceai keys present"
    else
      echo "✗ $locale: voiceai keys MISSING"
    fi
  else
    echo "? $locale: settings.json not found"
  fi
done
echo ""
echo "=== Listing all language directories ==="
ls -la src/renderer/src/i18n/ | grep "^d" | awk '{print $NF}' | grep -v "^\."
```

Repository: ThinkInAIXYZ/deepchat
Length of output: 599
Add voiceai translations to 10 missing language files.
The voiceai translation keys are only present in en-US and zh-CN. Per coding guidelines, translations must exist in ALL 12 language files. Add the provider.voiceai.* keys to:
- da-DK, fa-IR, fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-HK, zh-TW
Each locale requires the same key structure with appropriate language translations.
🤖 Prompt for AI Agents
In `@src/renderer/src/i18n/zh-CN/settings.json` around lines 397 - 428, Add the
missing provider.voiceai translation block (same key names as the existing
"voiceai" object: title, description, audioFormat.{label,placeholder,helper},
language.{label,placeholder,helper}, model.{label,placeholder,helper},
agentId.{label,placeholder,helper}, temperature.{label,helper},
topP.{label,helper}) to each of the 10 locale files: da-DK, fa-IR, fr-FR, he-IL,
ja-JP, ko-KR, pt-BR, ru-RU, zh-HK, zh-TW, keeping the exact structure used in
the current zh-CN snippet and providing appropriate localized strings for each
field (or accurate fallbacks if a translation is unavailable). Ensure keys are
placed under the same parent namespace (provider.voiceai or top-level "voiceai"
depending on project convention), match spelling/casing of existing keys (e.g.,
"audioFormat", "agentId"), and run the linter/locale validation to confirm no
missing or extra keys.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/renderer/src/i18n/ko-KR/settings.json`:
- Around line 329-360: Add the missing provider.voiceai block (with keys:
voiceai.title, voiceai.description,
voiceai.audioFormat.label/placeholder/helper,
voiceai.language.label/placeholder/helper,
voiceai.model.label/placeholder/helper,
voiceai.agentId.label/placeholder/helper, voiceai.temperature.label/helper, and
voiceai.topP.label/helper) to every locale file (da-DK, en-US, fa-IR, fr-FR,
he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-CN, zh-HK, zh-TW) using identical key
names and structure as shown in the ko-KR diff so all locales contain the same
keys and no runtime strings are missing.
🧹 Nitpick comments (2)
src/renderer/src/i18n/zh-HK/settings.json (1)
343-351: Consider using「智能體」instead of "Agent" in zh-HK labels/placeholders.

This aligns with the Chinese-locale guideline for "Agents" terminology.
♻️ Suggested wording tweak
```diff
-    "agentId": {
-      "label": "語音 Agent ID",
-      "placeholder": "貼上 voice agent id",
-      "helper": "即時通話需要此 ID。"
-    },
+    "agentId": {
+      "label": "語音智能體 ID",
+      "placeholder": "貼上語音智能體 ID",
+      "helper": "即時通話需要此 ID。"
+    },
```

Based on learnings: For Chinese translations, consider using '智能体' for 'Agents' instead of the English term.
src/renderer/src/i18n/pt-BR/settings.json (1)
348-351: Align the placeholder with the Portuguese label.

Minor polish: the placeholder can mirror the Portuguese wording used in the label.
📝 Suggested tweak
```diff
-    "agentId": {
-      "label": "ID do agente de voz",
-      "placeholder": "Cole o ID do voice agent",
-      "helper": "Necessário para chamadas de voz em tempo real."
-    },
+    "agentId": {
+      "label": "ID do agente de voz",
+      "placeholder": "Cole o ID do agente de voz",
+      "helper": "Necessário para chamadas de voz em tempo real."
+    },
```
```json
"voiceai": {
  "title": "Voice.ai 텍스트 음성 변환",
  "description": "텍스트로부터 음성을 생성합니다. 음성은 아래 모델 목록에 표시됩니다.",
  "audioFormat": {
    "label": "오디오 형식",
    "placeholder": "형식 선택",
    "helper": "대부분의 경우 MP3를 권장합니다."
  },
  "language": {
    "label": "언어",
    "placeholder": "예: en",
    "helper": "지원: en, ca, sv, es, fr, de, it, pt, pl, ru, nl."
  },
  "model": {
    "label": "TTS 모델",
    "placeholder": "voiceai-tts-v1-latest",
    "helper": "지원되는 모델은 Voice.ai 문서를 참고하세요."
  },
  "agentId": {
    "label": "음성 에이전트 ID",
    "placeholder": "voice agent id 붙여넣기",
    "helper": "실시간 음성 통화에 필요합니다."
  },
  "temperature": {
    "label": "온도",
    "helper": "무작위성을 제어합니다 (0-2)."
  },
  "topP": {
    "label": "Top P",
    "helper": "Nucleus 샘플링 (0-1)."
  }
},
```
Add the new provider.voiceai keys to all locale files.
The Voice.ai block appears only in ko-KR here; per i18n rules it must be added with identical keys to all supported locales (da-DK, en-US, fa-IR, fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-CN, zh-HK, zh-TW) to avoid missing strings at runtime.
As per coding guidelines: “Add new translations to ALL language files (da-DK, en-US, fa-IR, fr-FR, he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-CN, zh-HK, zh-TW) with consistent key names across all locales.”
🤖 Prompt for AI Agents
In `@src/renderer/src/i18n/ko-KR/settings.json` around lines 329 - 360, Add the
missing provider.voiceai block (with keys: voiceai.title, voiceai.description,
voiceai.audioFormat.label/placeholder/helper,
voiceai.language.label/placeholder/helper,
voiceai.model.label/placeholder/helper,
voiceai.agentId.label/placeholder/helper, voiceai.temperature.label/helper, and
voiceai.topP.label/helper) to every locale file (da-DK, en-US, fa-IR, fr-FR,
he-IL, ja-JP, ko-KR, pt-BR, ru-RU, zh-CN, zh-HK, zh-TW) using identical key
names and structure as shown in the ko-KR diff so all locales contain the same
keys and no runtime strings are missing.
* feat(agent): add question tool flow
* feat(mcp): implement real Apple Maps search using URL scheme (#1289)
* feat: support voice.ai (#1291)
* feat: remove custome tiptap (#1295)
* feat: settings auto scroll toggle (#1293)
  * feat: settings auto scroll toggle
  * feat: i18n support
* fix(renderer): remove specific event listeners instead of all
* feat: add tooltip for filling default API URL in settings (#1296)
* refactor(question): simplify question request UI to single-choice interface
* fix(chat): restore pending question state
* fix: review issues

---------

Co-authored-by: Qi Jin <jin.qi1@northeastern.edu>
Co-authored-by: xiaomo <wegi866@gmail.com>
* chore: typecheck with tsgo (#1278)
* feat: image left align
* feat: app settings skills (#1283)
  * docs: add spec for app-settings
  * feat: implement chat-driven settings control with skill gating

    Add a safe, validated API for modifying DeepChat application settings via natural language. Settings changes are controlled by a dedicated 'deepchat-settings' skill to ensure tools are only available when contextually relevant.

    Key features:
    - Skill-gated tool injection: settings tools only appear when deepchat-settings skill is active
    - Safe settings apply API with Zod validation and strict allowlist
    - Support for toggles (sound, copy COT, chat mode) and enums (language, theme, font size)
    - Defense-in-depth: runtime skill verification before applying changes
    - Deep-linked settings navigation for unsupported/complex settings
    - Comprehensive test coverage for validation, mapping, and skill gating

    Changes:
    - Add ChatSettingsToolHandler with validated apply/open methods
    - Integrate with AgentToolManager for tool definition gating
    - Create deepchat-settings built-in skill with clear activation rules
    - Add shared types for requests/responses (chatSettings.ts)
    - Implement settings window navigation with SECTION_ALIASES
    - Add unit tests for handler and integration tests for tool gating
    - Translate spec documents (plan.md, spec.md, tasks.md) to Chinese
    - Fix type errors in getCurrentValue and OPEN_SECTION_VALUES
  * refactor: remove chatMode from settings control and add permission service

    Remove chatMode setting from the allowlist as it requires conversation-scoped updates that are better handled separately. Add permission checking for settings window opening to provide user control over settings navigation.

    Key changes:
    - Remove setChatMode tool and related schemas from ChatSettingsToolHandler
    - Add SettingsPermissionService for managing tool approvals (one-time and session)
    - Add permission check for deepchat_settings_open tool
    - Update PermissionHandler to handle settings permission grants
    - Add rememberable flag to permission request structure
    - Update AgentToolManager to consume approvals before opening settings
    - Add settingsPermissionService to main presenter index
    - Clear settings approvals when conversation ends
    - Update spec documents to reflect removed chatMode feature
    - Remove chatMode-related tests and types

    This ensures settings window opening requires explicit user approval and provides a cleaner separation of concerns for chat mode management.
  * docs: translate to en
* chore: yo browser less context (#1284)
  * docs: add specs for yo browesr context manager
  * feat(yo-browser): improve skill description and add yo_browser_cdp_send to offload whitelist
  * refactor(yobrowser): remove skill gating and make CDP tools always available in agent mode
  * refactor(yobrowser): add CDP method schema validation with strict enums
    - Add enum-based validation for cdp_send method (11 common CDP methods)
    - Add detailed union schemas for each method's parameters with examples
    - Add normalizeCdpParams method to handle both object and JSON string inputs
    - Prevent method typos and provide better type safety for CDP interactions
  * fix(yobrowser): add strict tab ID validation in CDP send handler
* chore: update deps
* refactor: agent provider simplification (#1286)
  * docs: add spec for agent provider
  * refactor(agent): remove BaseAgentProvider layer and simplify provider hierarchy
* feat(mcp): implement real Apple Maps search using URL scheme (#1289)
* feat: support voice.ai (#1291)
* feat: remove custome tiptap (#1295)
* feat: settings auto scroll toggle (#1293)
  * feat: settings auto scroll toggle
  * feat: i18n support
* fix(renderer): remove specific event listeners instead of all
* feat: add tooltip for filling default API URL in settings (#1296)
* fix: fix model list refresh failed (#1297)
* feat: add question tools (#1298)
  * feat(agent): add question tool flow
  * feat(mcp): implement real Apple Maps search using URL scheme (#1289)
  * feat: support voice.ai (#1291)
  * feat: remove custome tiptap (#1295)
  * feat: settings auto scroll toggle (#1293)
    * feat: settings auto scroll toggle
    * feat: i18n support
  * fix(renderer): remove specific event listeners instead of all
  * feat: add tooltip for filling default API URL in settings (#1296)
  * refactor(question): simplify question request UI to single-choice interface
  * fix(chat): restore pending question state
  * fix: review issues

  ---------

  Co-authored-by: Qi Jin <jin.qi1@northeastern.edu>
  Co-authored-by: xiaomo <wegi866@gmail.com>
* chore: integrated vue-tsgo (#1299)
  * chore: integrated vue-tsgo
  * chore: update recommendation exteion
  * chore: update
* feat: add prompt for ask tool
* chore: version to 0.5.7

---------

Co-authored-by: xiaomo <wegi866@gmail.com>
Co-authored-by: wanna <wanna.w@binarywalk.com>
Co-authored-by: Qi Jin <jin.qi1@northeastern.edu>
Summary by CodeRabbit
New Features
Localization
✏️ Tip: You can customize this high-level summary in your review settings.