Conversation

@Prajna1999 Prajna1999 commented Jan 21, 2026

Summary

Target issues: #515 and #556.
Extend config management and the Unified API endpoints for audio (STT and TTS) use cases.

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, ensure it is covered by test cases.

Notes

  1. Configuration Changes

The revised configuration schema with config versioning could look something like this. Based on config_blob.completion.type ("text" | "tts" | "stt"), three different completion objects would need to be passed.
Modifications to the schemas are marked with // modified comments.

// Text configuration

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "OpenAI Test Configuration",
  "description": "Lorem ipsum dolor sit amet",
  "version": 1,
  "config_blob": {
    "completion": {
      "provider": "openai" | "google" | "anthropic",
      "type": "text",                          // modified
      "params": {
        "model": "gpt-4o",
        "instructions": "You are a helpful assistant",
        "knowledge_base_ids": ["vs_123", "vs_456"],
        "reasoning": "low" | "medium" | "high",
        "temperature": 0.7,
        "max_num_results": 20,
        "provider_specific": {
          "openai": {},
          "google": {}
        }
      }
    },

    // for future extensions
    "classifier": {},
    "input_guardrails": {},
    "output_guardrails": {}
  }
}

// STT Configuration
{
  "id": "550e8400-e29b-41d4-a716-446655440001",
  "name": "Gemini Configuration",
  "description": "Lorem ipsum dolor sit amet",
  "version": 2,
  "config_blob": {
    "completion": {
      "provider": "google",
      "type": "stt",                           // modified
      "params": {
        "model": "gemini-2.5-pro",
        "instruction": "Transcribe the audio verbatim",
        "input_language": "hi",                // modified
        "output_language": "en",               // modified: translated transcription in a single step
        "response_format": "text" | "json",    // modified
        "temperature": 0.7,
        "provider_specific": {
          "openai": {},
          "google": {}
        }
      }
    },
    "classifier": {},
    "input_guardrails": {},
    "output_guardrails": {}
  }
}

// TTS Configuration
{
  "id": "550e8400-e29b-41d4-a716-446655440002",
  "version": 1,
  "config_blob": {
    "completion": {
      "provider": "google",
      "type": "tts",
      "params": {
        "model": "gemini-2.5-pro-tts",
        "voice": "alloy",                          // modified: supported provider voice/accent
        "language": "en-US",                       // modified
        "response_format": "mp3" | "wav" | "ogg",  // modified: ogg works better on Android devices
        "speed": 1.0,                              // modified
        "provider_specific": {
          "openai": {},
          "gemini": {
            "director_notes": "Speak with a professional, calm tone. Pause for 1 second between sentences.",
            "response_modalities": ["AUDIO"]       // example metadata for tracing
          }
        }
      }
    },
    "classifier": {},
    "input_guardrails": {},
    "output_guardrails": {}
  }
}

  2. Extended the llm/call endpoint for the Google Gemini provider. It accepts either a public link or a base64-encoded string in the input field for audio files (see the request sketch after this list).
  3. Added a combined llm_call table storing essential metadata for each llm_call request and response.
  4. Experimental automatic speech recognition using an auto flag in the STT endpoint.
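
For illustration, a call to the extended llm/call endpoint with an audio-URL input could look like the Python sketch below. The payload shape follows the STT configuration above and the discriminated QueryInput union from this PR; the host, path, and the exact input field names ("type", "url") are assumptions for illustration, not confirmed endpoint details.

import requests

# Hypothetical base URL; the X-API-KEY header mirrors the test suite.
BASE_URL = "https://kaapi.example.com/api/v1"

payload = {
    "query": {
        "input": {
            "type": "audio_url",  # assumed discriminator value; text and base64 variants also exist
            "url": "https://example.com/sample.wav",
        }
    },
    "config": {
        "blob": {
            "completion": {
                "provider": "google",
                "type": "stt",
                "params": {"model": "gemini-2.5-pro", "response_format": "text"},
            }
        }
    },
}

response = requests.post(
    f"{BASE_URL}/llm/call",
    headers={"X-API-KEY": "<project-api-key>"},
    json=payload,
    timeout=60,
)
print(response.json())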

Test cases for mappers have been suppressed because the usual behaviour created inconsistencies between provider.type=google/openai and provider.type=google-native/openai-native.

Also, most file changes were auto-generated while fixing formatting; the files with real changes are few.

Summary by CodeRabbit

  • New Features

    • Google Gemini provider for LLM operations
    • LLM call lifecycle: create, update, and query call records
    • Partial config updates: submit partial blobs that merge with existing versions
    • Support for text, base64-audio, and audio-URL inputs
  • Improvements

    • Enforced immutable config fields (prevents type changes) and inherited fields on updates
    • Unified provider/client initialization and improved parameter mapping
    • Better input resolution, error handling, and temp-file cleanup
  • Migrations / Tests

    • DB migration for LLM call table and expanded test coverage


@Prajna1999 Prajna1999 self-assigned this Jan 21, 2026

coderabbitai bot commented Jan 21, 2026

📝 Walkthrough

Adds Google (Gemini) provider support and STT flow, LLM call recording and CRUD, input resolution for text/audio, partial config-version creation with deep-merge and immutable type checks, expanded LLM models/types, tests, and DB migrations for llm_call.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Config Version Management**<br>`backend/app/api/routes/config/version.py`, `backend/app/crud/config/version.py`, `backend/app/models/config/version.py`, `backend/app/models/config/__init__.py` | Route now accepts `ConfigVersionCreatePartial`; added `create_from_partial_or_raise`, `_get_latest_version`, `_deep_merge`, and immutable type validation to merge partial blobs into a new version. |
| **Google AI Provider & Registry**<br>`backend/app/core/providers.py`, `backend/app/services/llm/providers/gai.py`, `backend/app/services/llm/providers/oai.py`, `backend/app/services/llm/providers/registry.py`, `backend/app/services/llm/providers/__init__.py` | Added `Provider.GOOGLE`, a new `GoogleAIProvider` implementing Gemini STT, a unified `create_client` pattern, updated provider registry and imports, and updated `execute` signatures to accept resolved input. |
| **LLM Call Tracking & CRUD**<br>`backend/app/crud/llm.py`, `backend/app/alembic/versions/042_add_llm_call_table.py`, `backend/app/alembic/versions/044_remove_enum_checks_llm_call_provider.py` | New `llm_call` migration and CRUD surface: `create_llm_call`, `update_llm_call_response`, `get_llm_call_by_id`, `get_llm_calls_by_job_id` to persist call metadata, content, and usage. |
| **Input Resolution & Job Integration**<br>`backend/app/services/llm/input_resolver.py`, `backend/app/services/llm/jobs.py`, `backend/app/services/llm/mappers.py` | New input resolver for Text/AudioBase64/AudioURL; jobs create LLM call records before provider execution and update them after; mapper updated to support Google params and dict-based Kaapi params. |
| **LLM Models & Types**<br>`backend/app/models/llm/request.py`, `backend/app/models/llm/response.py`, `backend/app/models/llm/__init__.py`, `backend/app/models/__init__.py` | Introduced typed param models, a discriminated `QueryInput` union (sketched after this table), `ConfigBlob` changes (completion type), a new `LlmCall` SQLModel and export, and added `reasoning_tokens` to `Usage`. |
| **Provider Base Changes**<br>`backend/app/services/llm/providers/base.py` | Added abstract static `create_client(credentials)` and extended the `execute` signature to accept `resolved_input`. |
| **Jobs & Execution Flow**<br>`backend/app/services/llm/jobs.py` | Jobs persist the Job early, create/update LLM call records, resolve inputs, pass the resolved input to the provider, and clean up temp files. |
| **Tests & Test Helpers**<br>`backend/app/tests/...`, `backend/app/tests/crud/test_llm.py`, `backend/app/tests/services/llm/providers/test_gai.py`, `backend/app/tests/services/llm/test_mappers.py`, `backend/app/tests/utils/test_data.py` | Extensive test additions/updates: Google provider STT tests, LLM CRUD tests, partial config-version tests (type immutability), mapper tests, and test helper changes to preserve provider/type across versions. |
| **Misc & Formatting**<br>`backend/app/celery/beat.py`, `backend/app/celery/utils.py`, `backend/app/celery/worker.py`, various test files | Minor formatting and import adjustments (blank lines, added imports for the celery worker), consolidated test patch context managers. |
| **Dependencies & Exports**<br>`backend/pyproject.toml`, `backend/app/models/__init__.py`, `backend/app/models/config/__init__.py`, `backend/app/tests/seed_data/seed_data.py` | Added `google-genai>=1.59.0`; exported `ConfigVersionCreatePartial` and `LlmCall`; updated test seed imports. |
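
As a reading aid, a minimal sketch of the discriminated QueryInput union from the LLM Models & Types row, assuming pydantic v2; field names other than the type discriminator are illustrative assumptions, not the PR's exact models:

from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field

class TextInput(BaseModel):
    type: Literal["text"] = "text"
    text: str

class AudioBase64Input(BaseModel):
    type: Literal["audio_base64"] = "audio_base64"
    data: str  # base64-encoded audio bytes

class AudioUrlInput(BaseModel):
    type: Literal["audio_url"] = "audio_url"
    url: str

# Validation dispatches on the "type" field, so payloads parse
# directly into the matching input model.
QueryInput = Annotated[
    Union[TextInput, AudioBase64Input, AudioUrlInput],
    Field(discriminator="type"),
]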

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client/Job
    participant JobSvc as Job Service
    participant Resolver as Input Resolver
    participant CRUD as LLM CRUD
    participant DB as Database
    participant Provider as LLM Provider
    participant ProviderAPI as Provider API

    Client->>JobSvc: execute_job(completion_config, query)
    JobSvc->>Resolver: resolve_input(query.input)
    alt Text
        Resolver-->>JobSvc: content_text
    else Audio Base64
        Resolver->>Resolver: decode & write temp file
        Resolver-->>JobSvc: temp_file_path
    else Audio URL
        Resolver->>ProviderAPI: fetch audio (SSRF-protected)
        Resolver->>Resolver: write temp file
        Resolver-->>JobSvc: temp_file_path
    end
    JobSvc->>CRUD: create_llm_call(..., resolved_input)
    CRUD->>DB: INSERT llm_call
    DB-->>CRUD: llm_call_id
    CRUD-->>JobSvc: LlmCall record
    JobSvc->>Provider: execute(completion_config, query, resolved_input)
    Provider->>ProviderAPI: provider request (with file/text)
    ProviderAPI-->>Provider: response (content, usage)
    Provider-->>JobSvc: LLMCallResponse
    JobSvc->>CRUD: update_llm_call_response(llm_call_id, response)
    CRUD->>DB: UPDATE llm_call
    JobSvc->>Resolver: cleanup_temp_file()
    JobSvc-->>Client: job_complete
sequenceDiagram
    participant User as User/API
    participant ConfigRoute as Config Route
    participant CRUDCfg as ConfigVersion CRUD
    participant DB as Database

    User->>ConfigRoute: PATCH /config/{id}/version (partial_blob)
    ConfigRoute->>CRUDCfg: create_from_partial_or_raise(partial_blob)
    CRUDCfg->>DB: SELECT latest ConfigVersion
    DB-->>CRUDCfg: latest_version
    CRUDCfg->>CRUDCfg: _deep_merge(latest.config_blob, partial_blob)
    CRUDCfg->>CRUDCfg: _validate_immutable_fields(type unchanged)
    alt Type Changed
        CRUDCfg-->>ConfigRoute: raise ValidationError
    else Valid
        CRUDCfg->>DB: validate merged blob
        alt Validation Fails
            CRUDCfg-->>ConfigRoute: raise ValidationError
        else
            CRUDCfg->>DB: INSERT new ConfigVersion
            DB-->>CRUDCfg: new record
            CRUDCfg-->>ConfigRoute: return ConfigVersion
        end
    end
    ConfigRoute-->>User: 200/400

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes


Possibly related PRs

  • #477 — Overlaps with config version model and CRUD changes (ConfigVersionCreatePartial / create_from_partial_or_raise).
  • #498 — Related to LLM config/provider mapping and Kaapi→native transform changes.
  • #501 — Touches the same config version route and handler signature changes.

Suggested reviewers

  • kartpop
  • AkhileshNegi

🐰 Whiskers' Ode to the PR

A hop, a hum, a Google song begun,
Calls recorded, params merged — oh what fun!
Partial patches tuck old type in tight,
Providers sing and temp files take flight. 🎶

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped because CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title accurately summarizes the main change: adding Speech-To-Text/Text-To-Speech support using Google Gemini in a unified API. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.84%, which is sufficient. The required threshold is 80.00%. |



@Prajna1999 Prajna1999 changed the title from "STT: Gemini STT Integration" to "TTS/STT: speech-to-text using gemini in unified API" Jan 21, 2026
@Prajna1999 Prajna1999 moved this to In Progress in Kaapi-dev Jan 21, 2026
@Prajna1999 Prajna1999 added the enhancement New feature or request label Jan 21, 2026
@Prajna1999 Prajna1999 marked this pull request as ready for review January 26, 2026 18:25

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
backend/app/celery/beat.py (1)

12-21: Add return type and log prefix in start_beat.
Missing return annotation and required log prefix.

✅ Suggested fix
-def start_beat(loglevel: str = "info"):
+def start_beat(loglevel: str = "info") -> None:
@@
-    logger.info(f"Starting Celery beat scheduler")
+    logger.info("[start_beat] Starting Celery beat scheduler")
backend/app/tests/services/doctransformer/test_job/utils.py (1)

8-99: Add missing type hints for helper factories and callables.
Strict typing will flag these functions (and inner callables) without explicit annotations.

✅ Suggested fix
-from pathlib import Path
+from pathlib import Path
+from typing import Any, Callable, NoReturn
 from urllib.parse import urlparse
@@
-def create_failing_convert_document(fail_count: int = 1):
+def create_failing_convert_document(fail_count: int = 1) -> Callable[..., Path]:
@@
-    def failing_convert_document(*args, **kwargs):
+    def failing_convert_document(*args: Any, **kwargs: Any) -> Path:
@@
-def create_persistent_failing_convert_document(
-    error_message: str = "Persistent error",
-):
+def create_persistent_failing_convert_document(
+    error_message: str = "Persistent error",
+) -> Callable[..., NoReturn]:
@@
-    def persistent_failing_convert_document(*args, **kwargs):
+    def persistent_failing_convert_document(*args: Any, **kwargs: Any) -> NoReturn:
         raise Exception(error_message)
backend/app/celery/utils.py (1)

19-111: Add **kwargs type hints and prefix log messages with function names.
This aligns with strict typing and logging guidelines.

✅ Suggested fix
 def start_high_priority_job(
-    function_path: str, project_id: int, job_id: str, trace_id: str = "N/A", **kwargs
+    function_path: str,
+    project_id: int,
+    job_id: str,
+    trace_id: str = "N/A",
+    **kwargs: Any,
 ) -> str:
@@
-    logger.info(f"Started high priority job {job_id} with Celery task {task.id}")
+    logger.info(
+        f"[start_high_priority_job] Started high priority job {job_id} with Celery task {task.id}"
+    )
@@
 def start_low_priority_job(
-    function_path: str, project_id: int, job_id: str, trace_id: str = "N/A", **kwargs
+    function_path: str,
+    project_id: int,
+    job_id: str,
+    trace_id: str = "N/A",
+    **kwargs: Any,
 ) -> str:
@@
-    logger.info(f"Started low priority job {job_id} with Celery task {task.id}")
+    logger.info(
+        f"[start_low_priority_job] Started low priority job {job_id} with Celery task {task.id}"
+    )
@@
-        logger.info(f"Revoked task {task_id}")
+        logger.info(f"[revoke_task] Revoked task {task_id}")
         return True
     except Exception as e:
-        logger.error(f"Failed to revoke task {task_id}: {e}")
+        logger.error(f"[revoke_task] Failed to revoke task {task_id}: {e}")
         return False
backend/app/celery/worker.py (1)

14-41: Fix typing for concurrency/return and add log prefixes.
Ensures strict typing and consistent log formatting.

✅ Suggested fix
 def start_worker(
     queues: str = "default,high_priority,low_priority,cron",
-    concurrency: int = None,
+    concurrency: int | None = None,
     loglevel: str = "info",
-):
+) -> None:
@@
-    logger.info(f"Starting Celery worker with {concurrency} processes")
-    logger.info(f"Consuming queues: {queues}")
+    logger.info(f"[start_worker] Starting Celery worker with {concurrency} processes")
+    logger.info(f"[start_worker] Consuming queues: {queues}")
backend/app/services/llm/mappers.py (1)

7-76: Handle potential None model in litellm.supports_reasoning call.

On Line 39, if model is None (from kaapi_params.get("model")), the call litellm.supports_reasoning(model=f"openai/{model}") will pass "openai/None" which may cause unexpected behavior or errors from litellm.

🐛 Proposed fix
     model = kaapi_params.get("model")
     reasoning = kaapi_params.get("reasoning")
     temperature = kaapi_params.get("temperature")
     instructions = kaapi_params.get("instructions")
     knowledge_base_ids = kaapi_params.get("knowledge_base_ids")
     max_num_results = kaapi_params.get("max_num_results")

-    support_reasoning = litellm.supports_reasoning(model=f"openai/{model}")
+    support_reasoning = (
+        litellm.supports_reasoning(model=f"openai/{model}") if model else False
+    )
🤖 Fix all issues with AI agents
In `@backend/app/models/llm/request.py`:
- Around line 4-11: Remove the duplicate import of Field and SQLModel:
consolidate the two sqlmodel import lines into one (e.g., replace the two
occurrences of "from sqlmodel import Field, SQLModel" with a single line that
also includes Index and text if needed: "from sqlmodel import Field, SQLModel,
Index, text"), leaving other imports (pydantic, datetime, sqlalchemy, JSONB,
app.core.util) unchanged; ensure only one import statement provides Field and
SQLModel to fix the Ruff F811 duplicate-import error.
- Around line 313-479: The updated_at field on the LlmCall model currently uses
default_factory=now so it only sets at creation; make it auto-update on
modifications by adding an SQLAlchemy onupdate to the Column (e.g., sa_column or
sa_column_kwargs for updated_at with onupdate=now) or, if you prefer
application-level handling, ensure the update_llm_call_response CRUD function
(or any updater) sets updated_at = now() before committing. Update the
LlmCall.updated_at definition accordingly and/or modify update_llm_call_response
to assign now() on each update so updated_at reflects the last modification. (A sketch of the onupdate pattern appears after this list.)

In `@backend/app/services/llm/input_resolver.py`:
- Around line 86-111: The resolve_audio_url function currently fetches arbitrary
URLs; before calling requests.get in resolve_audio_url, validate the input URL
by reusing validate_callback_url(url) (or at minimum enforce scheme == "https"
and use _is_private_ip to reject private/link-local IPs) and return an error
string on validation failure; also call requests.get with allow_redirects=False
to disable redirects and keep the existing timeout; keep existing temp file
write logic (references: resolve_audio_url, validate_callback_url,
_is_private_ip, get_file_extension).

In `@backend/app/services/llm/jobs.py`:
- Line 212: Replace the print call printing completion_config with a logger call
using the module logger (e.g., logger.debug or logger.info) instead of print;
log the same message text but prefixed with the current function name and
include the completion_config variable for context (in
backend/app/services/llm/jobs.py, at the location where completion_config is
printed), ensuring you use the existing module logger (or create one via
logging.getLogger(__name__)) and the appropriate log level rather than print.
- Around line 160-178: Remove the temporary debug block in execute_job that
performs an inline import of select and queries recent jobs when
session.get(Job, job_id) returns None; delete the extra session.exec(...) query,
the inline "from sqlmodel import select" import, and the verbose logger.error
that prints recent jobs, leaving only the initial logger.info that attempts to
fetch the job and the existing logger.error (or add a concise logger.error) for
the missing Job; ensure any needed diagnostics are moved to a dedicated utility
function (e.g., a new diagnostic helper) rather than inline in execute_job.
- Around line 299-302: The cleanup currently compares resolved_input (str) to
request.query.input (QueryInput) which is always true; change the finally block
to only call cleanup_temp_file(resolved_input) when the original input is an
audio-type input that created a temp file — e.g., check
isinstance(request.query.input, (AudioBase64Input, AudioUrlInput)) and
resolved_input is truthy before calling cleanup_temp_file; leave TextInput alone
so we don't attempt to treat text as a temp file.

In `@backend/app/services/llm/PLAN.md`:
- Around line 222-225: The example for the Field definition for conversation_id
has a missing comma after default=None causing a syntax error; update the
conversation_id Field invocation (the conversation_id variable and its use of
Field) to include the missing comma between default=None and
description="Identifier linking this response to its conversation thread" so the
Field call is properly separated and the snippet parses correctly.
- Around line 113-115: The log message in the example uses the wrong provider
name: update the logger.info call that currently says "[OpenAIProvider.execute]
Successfully generated response: {response.response_id}" to reference the
correct provider and method (e.g., "[GoogleAIProvider.execute] Successfully
generated response: {response.response_id}") so the log reflects
GoogleAIProvider.execute; locate the logger.info in the GoogleAIProvider.execute
example and change only the provider name in the message.

In `@backend/app/services/llm/providers/gai.py`:
- Around line 75-78: The lang_instruction assignment in the block that checks
input_language uses an unnecessary f-string prefix; update the two assignments
so they are plain strings (remove the leading 'f') for the branches where you
set lang_instruction (the conditional using input_language and the variable
lang_instruction).
- Around line 38-43: The _parse_input implementation only handles
completion_type "stt" and lacks type hints, causing implicit None returns;
update the method signature to include type hints (e.g., def _parse_input(self,
query_input: Any, completion_type: str, provider: str) -> str) and implement
explicit handling for non-"stt" completion types: validate and coerce/return a
string for other expected types (e.g., "chat" or "text"/"completion") or raise a
clear ValueError when the input shape is invalid; ensure every control path
returns a str and import any typing symbols used.
- Around line 32-36: OpenAIProvider.create_client currently returns an error
string when credentials are missing, causing a wrong type to be passed to the
constructor; change it to raise an exception instead to match
GoogleAIProvider.create_client's behavior (which raises ValueError). Update
OpenAIProvider.create_client to check for the required credential key (e.g.,
"api_key" or provider-specific name) and raise a ValueError with a clear message
when missing so the registry's exception handler receives an exception rather
than a string.
- Around line 55-125: The execute method in GoogleAIProvider only handles
completion_type == "stt" and falls through silently for other types; update
execute to explicitly handle unsupported completion types (e.g., "text" and
"tts") by returning a clear error (or implementing their flows) when
completion_type is not "stt". Locate the block using completion_type,
completion_config, and the STT flow where gemini_file/upload and
client.models.generate_content are used, and add an else branch (or early guard)
that returns (None, "<descriptive error>") or raises a descriptive exception
indicating unsupported completion_type so callers no longer get an implicit
(None, None).

In `@backend/app/services/llm/providers/oai.py`:
- Around line 32-36: The create_client staticmethod in OpenAIProvider
(create_client) currently returns an error string and uses an unnecessary
f-string; change it to mirror GoogleAIProvider.create_client by raising a
ValueError when "api_key" is missing (so callers receive an exception instead of
a string), and replace the f-string with a plain string literal; ensure the
method otherwise returns the OpenAI(...) client instance to keep the return type
consistent.

In `@backend/app/services/llm/providers/registry.py`:
- Around line 92-135: Remove the ad-hoc "__main__" test block from registry.py;
the block contains hardcoded paths and an outdated call signature. Extract the
logic that uses LLMProvider.get_provider_class, ProviderClass.create_client,
NativeCompletionConfig, QueryParams and instance.execute into a proper test
under backend/app/tests/services/llm/providers/test_gai.py (or delete it),
update the execute invocation to include the resolved_input parameter to match
the current signature, and ensure any credential/env handling is mocked rather
than reading real env vars or local file paths.
- Around line 66-70: There is a duplicated assignment of credential_provider
from provider_type using replace("-native", ""); remove the redundant line so
credential_provider is assigned only once (keep the first or the clearer
occurrence) and ensure any surrounding comments remain correct; locate the
duplicate by searching for the variable name credential_provider and the
expression provider_type.replace("-native", "") in registry.py and delete the
extra assignment.
- Around line 13-26: Remove the testing artifacts imported into the module:
delete the temporary import "from google.genai.types import
GenerateContentConfig", the block importing NativeCompletionConfig,
LLMCallResponse, QueryParams, LLMOutput, LLMResponse, Usage from app.models.llm
(if they are unused here), and the call to load_dotenv(); ensure any genuinely
required symbols for functions/classes in this file (e.g., registry-related
classes or functions) remain imported from their proper modules and move
environment loading to application startup code rather than leaving
load_dotenv() in this module.

In `@backend/app/tests/crud/test_llm.py`:
- Around line 269-271: The test passes an integer literal 99999 as project_id to
get_llm_call_by_id but project_id is a UUID; change the test to pass a UUID that
will not match the created LLM call (e.g., generate a new UUID via uuid.uuid4()
or use a different fixture UUID) instead of 99999 so the call uses the correct
type; update the assertion code around get_llm_call_by_id(db, created.id,
project_id=...) and ensure imports/fixtures provide a UUID value.

In `@backend/app/tests/services/llm/test_mappers.py`:
- Around line 1-317: The entire test suite in
backend/app/tests/services/llm/test_mappers.py is commented out; restore test
visibility by either re-enabling the tests or explicitly skipping them with a
reason and tracking link. Undo the block comment (or reintroduce the classes
TestMapKaapiToOpenAIParams and TestTransformKaapiConfigToNative) and run/fix
failing assertions against map_kaapi_to_openai_params and
transform_kaapi_config_to_native if they break due to the new type system, or if
you must temporarily disable, add pytest.mark.skip(reason="TODO: update tests
for new type system, see ISSUE-XXXXX") above each Test* class and add a TODO
comment referencing the issue; ensure the skip preserves the original test names
and imports so future fixes can target the exact failing assertions.

In `@backend/app/tests/utils/test_data.py`:
- Around line 339-373: latest_version may lack a "type" in its config_blob so
config_type can be None and later fail Literal validation; set a sensible
default (e.g., "text") when extracting it from completion_config and use that
default when constructing ConfigBlob instances for both NativeCompletionConfig
and KaapiCompletionConfig (update the assignment of config_type taken from
completion_config.get("type") to fallback to "text" and ensure
NativeCompletionConfig and KaapiCompletionConfig are always passed a non-None
type).
🧹 Nitpick comments (27)
backend/app/tests/scripts/test_backend_pre_start.py (1)

5-24: Add return type hint to the test function.

Per coding guidelines, all functions should have type hints for parameters and return values.

Suggested fix
-def test_init_success():
+def test_init_success() -> None:
backend/app/tests/scripts/test_test_pre_start.py (1)

5-24: Add return type hint to the test function.

Per coding guidelines, all functions should have type hints for parameters and return values.

Suggested fix
-def test_init_success():
+def test_init_success() -> None:
backend/app/cli/bench/commands.py (1)

169-175: Use Callable from typing instead of lowercase callable.

The callable on line 174 is a built-in function, not a type annotation. For proper static type checking, use Callable from the typing module with the appropriate signature.

Suggested fix
-from typing import List, Protocol
+from typing import Callable, List, Protocol
 def send_benchmark_request(
     prompt: str,
     i: int,
     total: int,
     endpoint: str,
-    build_payload: callable,
+    build_payload: Callable[[str], dict],
 ) -> BenchItem:

As per coding guidelines, type hints should be used throughout the codebase.

backend/app/tests/services/llm/providers/test_registry.py (1)

23-26: Consider adding test coverage for GoogleAIProvider in registry.

The registry tests only verify openai-native provider. With the addition of GoogleAIProvider, consider adding a test to verify google-native is also registered correctly in LLMProvider._registry.

backend/app/models/config/version.py (1)

99-116: Consider adding validation for empty config_blob.

ConfigVersionBase includes a validate_blob_not_empty validator (line 32-36), but ConfigVersionCreatePartial doesn't inherit from it or define its own validator. If passing an empty config_blob in a partial update is invalid, consider adding validation:

from pydantic import field_validator

@field_validator("config_blob")
def validate_blob_not_empty(cls, value):
    if not value:
        raise ValueError("config_blob cannot be empty")
    return value

If empty is intentionally allowed (e.g., the CRUD layer handles merging with existing config), this can be ignored.

backend/app/services/llm/__init__.py (1)

1-6: Consider consolidating imports from the same module.

Both import statements pull from app.services.llm.providers. They could be combined into a single import for cleaner organization:

♻️ Suggested refactor
-from app.services.llm.providers import BaseProvider, OpenAIProvider, GoogleAIProvider
-from app.services.llm.providers import (
-    LLMProvider,
-    get_llm_provider,
-)
+from app.services.llm.providers import (
+    BaseProvider,
+    GoogleAIProvider,
+    LLMProvider,
+    OpenAIProvider,
+    get_llm_provider,
+)
backend/app/tests/api/routes/test_llm.py (1)

145-167: Consider adding the type field to the invalid provider test payload.

The test_llm_call_invalid_provider test payload is missing the type field in the completion config. While this may still trigger a 422 due to the invalid provider, it could mask validation order issues. Consider adding "type": "text" to ensure the test specifically validates provider validation.

Suggested fix
     payload = {
         "query": {"input": "Test query"},
         "config": {
             "blob": {
                 "completion": {
                     "provider": "invalid-provider",
+                    "type": "text",
                     "params": {"model": "gpt-4"},
                 }
             }
         },
     }
backend/app/api/routes/config/version.py (4)

25-45: Add return type hint to create_version function.

Per coding guidelines, all function parameters and return values should have type hints. The function returns APIResponse[ConfigVersionPublic].

Suggested fix
 def create_version(
     config_id: UUID,
     version_create: ConfigVersionCreatePartial,
     current_user: AuthContextDep,
     session: SessionDep,
-):
+) -> APIResponse[ConfigVersionPublic]:

55-75: Add return type hint to list_versions function.

Per coding guidelines, all functions should have return type hints.

Suggested fix
 def list_versions(
     config_id: UUID,
     current_user: AuthContextDep,
     session: SessionDep,
     skip: int = Query(0, ge=0, description="Number of records to skip"),
     limit: int = Query(100, ge=1, le=100, description="Maximum records to return"),
-):
+) -> APIResponse[list[ConfigVersionItems]]:

85-102: Add return type hint to get_version function.

Per coding guidelines, all functions should have return type hints.

Suggested fix
 def get_version(
     config_id: UUID,
     current_user: AuthContextDep,
     session: SessionDep,
     version_number: int = Path(
         ..., ge=1, description="The version number of the config"
     ),
-):
+) -> APIResponse[ConfigVersionPublic]:

112-130: Add return type hint to delete_version function.

Per coding guidelines, all functions should have return type hints.

Suggested fix
 def delete_version(
     config_id: UUID,
     current_user: AuthContextDep,
     session: SessionDep,
     version_number: int = Path(
         ..., ge=1, description="The version number of the config"
     ),
-):
+) -> APIResponse[Message]:
backend/app/tests/crud/test_llm.py (2)

26-39: Consider using factory pattern for test fixtures.

Per coding guidelines, test fixtures in backend/app/tests/ should use the factory pattern. These fixtures query seed data directly rather than using factories. Consider creating factory functions for test projects and organizations if they don't already exist.


42-46: Add return type hint to test_job fixture.

Per coding guidelines, all functions should have type hints.

Suggested fix
 @pytest.fixture
-def test_job(db: Session):
+def test_job(db: Session) -> Job:
     """Create a test job for LLM call tests."""
     crud = JobCrud(db)
     return crud.create(job_type=JobType.LLM_API, trace_id="test-llm-trace")

Note: You'll need to import Job from the appropriate models module.

backend/app/tests/services/llm/test_jobs.py (1)

19-21: Remove commented-out import instead of leaving it.

The KaapiLLMParams import is commented out. If it's no longer needed, it should be removed entirely rather than left as a comment.

Suggested fix
 from app.models.llm import (
     LLMCallRequest,
     NativeCompletionConfig,
     QueryParams,
     LLMCallResponse,
     LLMResponse,
     LLMOutput,
     Usage,
-    # KaapiLLMParams,
     KaapiCompletionConfig,
 )
backend/app/services/llm/input_resolver.py (1)

88-96: Consider adding a streaming option for large audio files.

The current implementation loads the entire response into memory with response.content. For large audio files, this could cause memory issues. Consider using stream=True and writing chunks to the temp file.

Suggested streaming implementation
 def resolve_audio_url(url: str) -> tuple[str, str | None]:
     """Fetch audio from URL and write to temp file. Returns (file_path, error)."""
     try:
-        response = requests.get(url, timeout=60)
+        response = requests.get(url, timeout=60, stream=True)
         response.raise_for_status()
     except requests.Timeout:
         return "", f"Timeout fetching audio from URL: {url}"
     except requests.HTTPError as e:
         return "", f"HTTP error fetching audio: {e.response.status_code}"
     except Exception as e:
         return "", f"Failed to fetch audio from URL: {str(e)}"

     content_type = response.headers.get("content-type", "audio/wav")
     ext = get_file_extension(content_type.split(";")[0].strip())

     try:
         with tempfile.NamedTemporaryFile(
             suffix=ext, delete=False, prefix="audio_"
         ) as tmp:
-            tmp.write(response.content)
+            for chunk in response.iter_content(chunk_size=8192):
+                if chunk:
+                    tmp.write(chunk)
             temp_path = tmp.name

         logger.info(f"[resolve_audio_url] Wrote audio to temp file: {temp_path}")
         return temp_path, None
     except Exception as e:
         return "", f"Failed to write fetched audio to temp file: {str(e)}"
backend/app/tests/api/routes/configs/test_version.py (3)

565-570: Consider adding error message validation for consistency.

The test_create_version_cannot_change_type_from_text_to_stt test validates the error message content, but this test only checks the status code. Adding similar assertions would improve test coverage consistency.

     response = client.post(
         f"{settings.API_V1_STR}/configs/{config.id}/versions",
         headers={"X-API-KEY": user_api_key.key},
         json=version_data,
     )
     assert response.status_code == 400
+    error_detail = response.json().get("error", "")
+    assert "cannot change config type" in error_detail.lower()

734-738: Unused db parameter flagged by static analysis.

The db parameter is not used in this test function. If it's required for test fixture setup/teardown, consider adding a comment explaining its purpose. Otherwise, it can be removed.

 def test_create_config_with_kaapi_provider_success(
-    db: Session,
     client: TestClient,
     user_api_key: TestAuthContext,
 ) -> None:

477-478: Consider moving repeated imports to module level.

KaapiCompletionConfig is imported locally in multiple test functions. Moving it to the module-level imports (alongside NativeCompletionConfig at line 14) would reduce duplication.

backend/app/tests/utils/test_data.py (1)

321-337: Move imports to module level.

select and and_ from sqlmodel, and ConfigVersion are imported inside the function, but ConfigVersion is already imported at line 19. Consider consolidating these imports at the module level.

+from sqlmodel import Session, select, and_
 # ... at module level
 
 def create_test_version(...):
     if config_blob is None:
-        # Fetch the latest version to maintain type consistency
-        from sqlmodel import select, and_
-        from app.models import ConfigVersion
-
         stmt = (
             select(ConfigVersion)
backend/app/services/llm/providers/registry.py (1)

1-2: Remove unnecessary imports for production code.

os and dotenv are only used in the __main__ testing block. If the testing code is removed (as suggested below), these imports should also be removed.

backend/app/services/llm/providers/gai.py (1)

1-2: Remove unused import.

os is imported but never used in this file.

 import logging
-import os
backend/app/crud/config/version.py (2)

170-190: Shallow copy may cause unintended mutation of nested structures.

base.copy() on Line 178 creates a shallow copy. If base contains nested dicts that are not in updates, those nested dicts will be shared references. While the recursive merge handles overlapping keys correctly, any external mutation of the returned result could affect the original base dict.

Consider using copy.deepcopy() if the base dict may be reused or if nested structures need isolation.

♻️ Suggested fix using deepcopy
+import copy
+
 def _deep_merge(
     self, base: dict[str, Any], updates: dict[str, Any]
 ) -> dict[str, Any]:
     """
     Deep merge two dictionaries.
     Values from 'updates' override values in 'base'.
     Nested dicts are merged recursively.
     """
-    result = base.copy()
+    result = copy.deepcopy(base)

     for key, value in updates.items():
         if (
             key in result
             and isinstance(result[key], dict)
             and isinstance(value, dict)
         ):
             result[key] = self._deep_merge(result[key], value)
         else:
             result[key] = value

     return result
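
A quick illustration of the hazard: with a shallow copy, nested dicts that are absent from updates remain shared with base, so mutating the merged result later leaks back into the original blob.

base = {"completion": {"params": {"temperature": 0.7}}, "classifier": {"model": "a"}}
updates = {"completion": {"params": {"temperature": 0.9}}}

result = base.copy()  # shallow: result["classifier"] is the same object as base["classifier"]
result["classifier"]["model"] = "b"

print(base["classifier"]["model"])  # prints "b" -- the original base was mutated too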

291-341: Consider reusing _get_latest_version to reduce code duplication.

The query logic in Lines 300-311 duplicates the query in _get_latest_version (Lines 157-168). This creates maintenance burden and potential for divergence.

♻️ Proposed refactor to reuse existing helper
 def _validate_config_type_unchanged(
     self, version_create: ConfigVersionCreate
 ) -> None:
     """
     Validate that the config type (text/stt/tts) in the new version matches
     the type from the latest existing version.
     Raises HTTPException if types don't match.
     """
-    # Get the latest version
-    stmt = (
-        select(ConfigVersion)
-        .where(
-            and_(
-                ConfigVersion.config_id == self.config_id,
-                ConfigVersion.deleted_at.is_(None),
-            )
-        )
-        .order_by(ConfigVersion.version.desc())
-        .limit(1)
-    )
-    latest_version = self.session.exec(stmt).first()
+    latest_version = self._get_latest_version()

     # If this is the first version, no validation needed
     if latest_version is None:
         return
backend/app/crud/llm.py (3)

80-105: Potential AttributeError when accessing params.model.

Line 102-105 uses hasattr to check for model attribute but assumes completion_config.params exists and is accessible. If params is a dict (which it can be based on NativeCompletionConfig.params: dict[str, Any]), hasattr(completion_config.params, "model") will return False even when "model" is a key, and .get("model", "") should work. However, mixing attribute access with dict access creates confusion.

♻️ Suggested simplification for clarity
-    model = (
-        completion_config.params.model
-        if hasattr(completion_config.params, "model")
-        else completion_config.params.get("model", "")
-    )
+    # params is always a dict for both Native and Kaapi configs
+    model = completion_config.params.get("model", "")

99-100: Type ignore comment suggests potential type mismatch.

The # type: ignore[assignment] comment on Line 100 indicates the provider type doesn't match the expected Literal["openai", "google", "anthropic"]. The comment mentions provider is "guaranteed to be normalized" but this isn't enforced at the type level. Consider adding a runtime assertion or documenting the contract more explicitly.


200-229: Add type hint for return value on get_llm_calls_by_job_id.

The function is missing a return type annotation per the coding guidelines requiring type hints on all function parameters and return values.

♻️ Proposed fix
 def get_llm_calls_by_job_id(
     session: Session,
     job_id: UUID,
-) -> list[LlmCall]:
+) -> list[LlmCall]:

Actually, looking again, the return type is already specified. The function signatures look complete.

LGTM!

Query functions are well-typed and properly filter out soft-deleted records.

backend/app/models/llm/request.py (1)

63-63: Consider using X | Y union syntax for consistency.

Static analysis (UP007) suggests using X | Y instead of Union[...] for type annotations to align with modern Python 3.11+ syntax. This is optional but improves consistency with other type hints in the codebase.

♻️ Proposed fix
-KaapiLLMParams = Union[TextLLMParams, STTLLMParams, TTSLLMParams]
+KaapiLLMParams = TextLLMParams | STTLLMParams | TTSLLMParams

 # Discriminated union for query input types
-QueryInput = Annotated[
-    Union[TextInput, AudioBase64Input, AudioUrlInput],
-    Field(discriminator="type"),
-]
+QueryInput = Annotated[
+    TextInput | AudioBase64Input | AudioUrlInput,
+    Field(discriminator="type"),
+]

Also applies to: 87-90

Comment on lines 4 to +11
from sqlmodel import Field, SQLModel
from pydantic import Discriminator, model_validator, HttpUrl
from datetime import datetime
from app.core.util import now

import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSONB
from sqlmodel import Field, SQLModel, Index, text

⚠️ Potential issue | 🟡 Minor

Remove duplicate imports flagged by static analysis.

Lines 4 and 11 both import Field and SQLModel from sqlmodel. This is flagged by Ruff (F811).

🐛 Proposed fix
 from uuid import UUID, uuid4
-from sqlmodel import Field, SQLModel
 from pydantic import Discriminator, model_validator, HttpUrl
 from datetime import datetime
 from app.core.util import now

 import sqlalchemy as sa
 from sqlalchemy.dialects.postgresql import JSONB
 from sqlmodel import Field, SQLModel, Index, text
🧰 Tools
🪛 Ruff (0.14.14)

11-11: Redefinition of unused Field from line 4: Field redefined here

Remove definition: Field

(F811)


11-11: Redefinition of unused SQLModel from line 4: SQLModel redefined here

Remove definition: SQLModel

(F811)


Comment on lines +160 to +178

            # Debug: Try to fetch the job first
            logger.info(f"[execute_job] Attempting to fetch job | job_id={job_id}")
            job = session.get(Job, job_id)
            if not job:
                # Log all jobs to see what's in the database
                from sqlmodel import select

                all_jobs = session.exec(
                    select(Job).order_by(Job.created_at.desc()).limit(5)
                ).all()
                logger.error(
                    f"[execute_job] Job not found! | job_id={job_id} | "
                    f"Recent jobs in DB: {[(j.id, j.status) for j in all_jobs]}"
                )
            else:
                logger.info(
                    f"[execute_job] Found job | job_id={job_id}, status={job.status}"
                )

⚠️ Potential issue | 🟡 Minor

Remove debug code before merging.

This block appears to be debugging code that queries all recent jobs when the expected job isn't found. It adds unnecessary database queries in production and includes an inline import. Consider removing this or converting it to a proper diagnostic utility if this scenario needs monitoring.

🔧 Suggested removal
             job_crud = JobCrud(session=session)
-
-            # Debug: Try to fetch the job first
-            logger.info(f"[execute_job] Attempting to fetch job | job_id={job_id}")
-            job = session.get(Job, job_id)
-            if not job:
-                # Log all jobs to see what's in the database
-                from sqlmodel import select
-
-                all_jobs = session.exec(
-                    select(Job).order_by(Job.created_at.desc()).limit(5)
-                ).all()
-                logger.error(
-                    f"[execute_job] Job not found! | job_id={job_id} | "
-                    f"Recent jobs in DB: {[(j.id, j.status) for j in all_jobs]}"
-                )
-            else:
-                logger.info(
-                    f"[execute_job] Found job | job_id={job_id}, status={job.status}"
-                )
-
             job_crud.update(

                    completion_config, warnings = transform_kaapi_config_to_native(
                        completion_config
                    )
                    print(f"The completion_config transformed is {completion_config}")

⚠️ Potential issue | 🟡 Minor

Replace print with logger call.

Debug output should use the logger for consistency and proper log level control. As per coding guidelines, log messages should be prefixed with the function name.

-                    print(f"The completion_config transformed is {completion_config}")
+                    logger.debug(f"[execute_job] Transformed completion_config: {completion_config}")

Comment on lines +66 to 70
# e.g "openai-native" -> "openai", "claude-native" -> "claude"
credential_provider = provider_type.replace("-native", "")

# e.g., "openai-native" → "openai", "claude-native" → "claude"
credential_provider = provider_type.replace("-native", "")

⚠️ Potential issue | 🟡 Minor

Remove duplicate variable assignment.

credential_provider is assigned twice with identical logic. Remove the duplicate.

     provider_class = LLMProvider.get_provider_class(provider_type)

     # e.g "openai-native" -> "openai", "claude-native" -> "claude"
     credential_provider = provider_type.replace("-native", "")

-    # e.g., "openai-native" → "openai", "claude-native" → "claude"
-    credential_provider = provider_type.replace("-native", "")
-
     credentials = get_provider_credential(

Comment on lines +92 to +135
# ad hoc testing code
if __name__ == "__main__":
    # 1. Simulate environment/credentials
    GEMINI_KEY = os.getenv("GEMINI_API_KEY")
    if not GEMINI_KEY:
        print("Set GEMINI_API_KEY environment variable first.")
        exit(1)

    # This dictionary mimics what get_provider_credential would return from the DB
    mock_credentials = {"api_key": GEMINI_KEY}

    # 2. Idiomatic Initialization via Registry
    provider_type = "google-native"
    # provider_type=LLMProvider.get_provider_class(provider_type="GOOGLE-NATIVE")

    print(f"Initializing provider: {provider_type}...")

    # This block mimics the core logic of your get_llm_provider function
    ProviderClass = LLMProvider.get_provider_class(provider_type)
    client = ProviderClass.create_client(credentials=mock_credentials)
    instance = ProviderClass(client=client)

    # 3. Setup Config and Query
    test_config = NativeCompletionConfig(
        provider="google-native",
        type="stt",
        params={
            "model": "gemini-2.5-pro",
            "instructions": "Please transcribe this audio accurately.",
        },
    )

    test_query = QueryParams(
        input="/Users/prajna/Desktop/personal/projects/software/Syspin_Hackathon_api_server/wav_files/1253534463206645.wav"  # Ensure this file exists in your directory
    )

    # 4. Execution
    print("Executing STT...")
    result, error = instance.execute(completion_config=test_config, query=test_query)

    return provider_class(client=client)
    if error:
        print(f"Error: {error}")
    else:
        print(f"Result: {result}")

⚠️ Potential issue | 🟡 Minor

Remove ad hoc testing code from production module.

This __main__ block contains:

  1. Hardcoded local file paths (line 125) that won't work for other developers
  2. Outdated execute() call signature missing the resolved_input parameter (line 130)
  3. Testing logic that belongs in a dedicated test file

Consider moving this to backend/app/tests/services/llm/providers/test_gai.py or removing it entirely.


Comment on lines +269 to +271
    # Should not find with wrong project
    fetched_wrong = get_llm_call_by_id(db, created.id, project_id=99999)
    assert fetched_wrong is None

⚠️ Potential issue | 🟡 Minor

Type mismatch: project_id should be UUID, not int.

The test uses project_id=99999 (an integer), but based on other usages in the file (e.g., test_project.id), project_id appears to be a UUID type. This could cause a type error or unexpected behavior.

Suggested fix
     # Should not find with wrong project
-    fetched_wrong = get_llm_call_by_id(db, created.id, project_id=99999)
+    fetched_wrong = get_llm_call_by_id(db, created.id, project_id=uuid4())
     assert fetched_wrong is None

Comment on lines +339 to +373
        if latest_version:
            # Extract the type and provider from the latest version
            completion_config = latest_version.config_blob.get("completion", {})
            config_type = completion_config.get("type")
            provider = completion_config.get("provider", "openai-native")

            # Create a new config_blob maintaining the same type and provider
            if provider in ["openai-native", "google-native"]:
                config_blob = ConfigBlob(
                    completion=NativeCompletionConfig(
                        provider=provider,
                        type=config_type,
                        params={
                            "model": completion_config.get("params", {}).get(
                                "model", "gpt-4"
                            ),
                            "temperature": 0.8,
                            "max_tokens": 1500,
                        },
                    )
                )
            else:
                # For Kaapi providers (openai, google)
                config_blob = ConfigBlob(
                    completion=KaapiCompletionConfig(
                        provider=provider,
                        type=config_type,
                        params={
                            "model": completion_config.get("params", {}).get(
                                "model", "gpt-4"
                            ),
                            "temperature": 0.8,
                        },
                    )
                )

⚠️ Potential issue | 🟡 Minor

Missing default for config_type could cause validation failure.

If latest_version.config_blob doesn't contain a type field (e.g., legacy data), config_type will be None, which would fail the Literal["text", "stt", "tts"] validation when passed to NativeCompletionConfig or KaapiCompletionConfig.

🔧 Suggested fix
         if latest_version:
             # Extract the type and provider from the latest version
             completion_config = latest_version.config_blob.get("completion", {})
-            config_type = completion_config.get("type")
+            config_type = completion_config.get("type", "text")  # Default to "text" for legacy data
             provider = completion_config.get("provider", "openai-native")

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/app/models/llm/request.py`:
- Around line 395-402: The LlmCall.provider Literal includes "anthropic" but
there is no anthropic provider implementation in the provider registry (only
"openai" and "google"), so either remove "anthropic" from the Literal on the
provider Field in the LlmCall model or prevent invalid values from being
persisted: update the provider Field definition in LlmCall.provider to remove
"anthropic" (so Literal["openai","google"]) OR add an explicit Pydantic
validator on LlmCall.provider that checks the value against the actual provider
registry (the same registry lookup used at runtime) and raises a ValidationError
if "anthropic" (or any unsupported provider) is supplied; if you plan to
implement anthropic, instead implement and register the anthropic provider in
the provider registry so lookups succeed (a sketch of a registry-backed validator follows this list).

In `@backend/app/tests/seed_data/seed_data.py`:
- Around line 21-22: Remove the unused imports Config and ConfigVersion from the
top of this module to clean up imports; locate the import list in seed_data.py
where Config and ConfigVersion are imported alongside other models and delete
those two symbols (or defer importing them only when you add
create_config/create_config_version seed functions), ensuring functions like
clear_database and existing seed_... helpers remain unchanged.
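For the LlmCall.provider item above, a minimal sketch of the registry-backed validator; SUPPORTED_PROVIDERS here is a stand-in for the real provider-registry lookup, not the repository's actual code:

from pydantic import field_validator
from sqlmodel import SQLModel

# Stand-in for a lookup against the runtime provider registry.
SUPPORTED_PROVIDERS = {"openai", "google"}


class LlmCall(SQLModel):
    provider: str

    @field_validator("provider")
    @classmethod
    def validate_provider(cls, v: str) -> str:
        # Reject anything the registry cannot resolve (e.g., "anthropic").
        if v not in SUPPORTED_PROVIDERS:
            raise ValueError(f"Unsupported provider: {v!r}")
        return v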
🧹 Nitpick comments (1)
backend/app/services/llm/providers/oai.py (1)

7-7: Consider grouping typing imports with standard library imports.

The typing import should be placed before third-party imports (like openai) per PEP 8 import ordering conventions.

+from typing import Any
+
 import logging
 
 import openai
 from openai import OpenAI
 from openai.types.responses.response import Response
 
-from typing import Any
 from app.models.llm import (

Comment on lines +21 to +22
Config,
ConfigVersion,

⚠️ Potential issue | 🟡 Minor

Unused imports: Config and ConfigVersion are not referenced in this file.

Neither Config nor ConfigVersion is used anywhere in this module—no seed functions create these entities, and they don't appear in clear_database or elsewhere. If these are placeholders for upcoming seed logic, consider deferring the import until the corresponding create_config/create_config_version functions are added; otherwise, remove them to keep imports clean.


@Prajna1999 Prajna1999 changed the title from "TTS/STT: speech-to-text using gemini in unified API" to "TTS/STT: Speech-To-Text Using Gemini in Unified API" on Jan 27, 2026
- NativeCompletionConfig with provider-native parameters ready for API
- List of warnings for suppressed/ignored parameters
"""
if kaapi_config.provider == "openai":
Collaborator Author

# TODO: convert these enums from strings to proper enum types.
# TODO: enforce a per-provider, per-type model allowlist via a mapping of some sort (config YAML, dict, etc.); needs research.
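A hypothetical sketch of that TODO; the enum members and the allowlist contents are illustrative only, not the repository's actual registry:

from enum import Enum


class Provider(str, Enum):
    OPENAI = "openai"
    GOOGLE = "google"


class CompletionType(str, Enum):
    TEXT = "text"
    STT = "stt"
    TTS = "tts"


# Per-provider, per-type model allowlist; could equally live in a YAML config.
SUPPORTED_MODELS: dict[tuple[Provider, CompletionType], set[str]] = {
    (Provider.OPENAI, CompletionType.TEXT): {"gpt-4o"},
    (Provider.GOOGLE, CompletionType.STT): {"gemini-2.5-pro"},
}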

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@backend/app/services/llm/providers/gai.py`:
- Around line 116-121: The Usage construction is dereferencing
response.usage_metadata fields directly which can raise AttributeError; update
the code that builds the Usage(...) (the Usage class instantiation using
response.usage_metadata.prompt_token_count, candidates_token_count,
total_token_count, thoughts_token_count) to first check if
response.usage_metadata is not None and then read each field (falling back to 0
or None as your convention) before passing values into Usage; i.e., compute safe
local values from response.usage_metadata with null checks and use those
variables in the Usage(...) call so missing usage_metadata cannot cause an
exception (a minimal sketch follows this list).
- Line 97: The uploaded gemini_file created via
self.client.files.upload(file=parsed_input) is never removed; update the logic
in the surrounding method (where gemini_file is used) to delete the remote file
with self.client.files.delete(name=gemini_file.name) after it's no longer
needed—preferably in a finally block to guarantee cleanup even on errors, and
handle/log any exceptions from self.client.files.delete so failures to clean up
don't crash the flow.
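For the first item above, a minimal sketch of the guarded Usage construction, assuming Usage exposes input_tokens/output_tokens/total_tokens as the tests later in this thread exercise:

# Sketch only: read usage_metadata defensively before building Usage.
meta = getattr(response, "usage_metadata", None)
usage = Usage(
    input_tokens=(meta.prompt_token_count or 0) if meta else 0,
    output_tokens=(meta.candidates_token_count or 0) if meta else 0,
    total_tokens=(meta.total_token_count or 0) if meta else 0,
)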
🧹 Nitpick comments (3)
backend/app/services/llm/providers/gai.py (3)

2-2: Unused import os.

The os module is imported but never used in this file.

🧹 Suggested fix
 import logging
-import os
 
 from google import genai

99-101: Redundant condition check.

The if merged_instruction: check on line 100 is always true since merged_instruction is unconditionally assigned a non-empty string on either line 92 or 94.

♻️ Simplified code
         contents = []
-        if merged_instruction:
-            contents.append(merged_instruction)
+        contents.append(merged_instruction)
         contents.append(gemini_file)

133-139: Remove unused query parameter or document its purpose.

The query: QueryParams parameter is accepted but never used in this method's implementation. While other providers (e.g., OpenAIProvider) use query.conversation, GoogleAIProvider.execute does not reference it. Either remove the parameter (updating the interface as needed) or add a comment explaining why it's retained as part of the interface contract.

merged_instruction = f"{lang_instruction}."

# Upload file and generate content
gemini_file = self.client.files.upload(file=parsed_input)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the gai.py file to understand the context
cat -n backend/app/services/llm/providers/gai.py | head -150

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 6166


🌐 Web query:

Google GenAI Python SDK file.delete() API documentation

💡 Result:

Google GenAI Python SDK — client.files.delete() (aka file.delete())

Package / install: google-genai (pip install google-genai). (ai.google.dev)

Python method (SDK):

  • Call: client.files.delete(name=..., config=None)
  • Purpose: Deletes a remotely stored file.
  • Params:
    • name (str, required): file resource name like files/abc-123
    • config (DeleteFileConfig, optional)
  • Returns: DeleteFileResponse
  • Async: await client.aio.files.delete(name=..., config=None) (googleapis.github.io)

Underlying REST API (Gemini API):

  • DELETE https://generativelanguage.googleapis.com/v1beta/{name=files/*}
  • Request body is empty. (ai.google.dev)

Example (Python):

from google import genai

client = genai.Client()
myfile = client.files.upload(file="poem.txt")
client.files.delete(name=myfile.name)  # e.g., "files/abc-123"

(ai.google.dev)

References

  1. SDK genai.files.Files.delete / AsyncFiles.delete docs. (googleapis.github.io)
  2. Gemini API “Method: files.delete” (endpoint + example). (ai.google.dev)
  3. Google GenAI SDK install info (google-genai). (ai.google.dev)


🏁 Script executed:

# Check if files.delete is used anywhere else in the codebase
rg "files\.delete" --type py

# Check if there are any other file cleanup patterns
rg "\.delete\(" backend/app/services/llm/ --type py -B 2 -A 2

# Verify usage_metadata null checks in the codebase
rg "usage_metadata" --type py -B 1 -A 1

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1245


🏁 Script executed:

# Check if there are any exception handling patterns around files.upload in the codebase
rg "files\.upload" --type py -B 3 -A 5

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 3310


Missing cleanup for uploaded file.

The uploaded gemini_file is never deleted after use. This could lead to accumulation of files in Google's storage, potentially incurring costs and hitting quota limits. The Google GenAI SDK supports file deletion via client.files.delete(name=...).

🧹 Suggested fix
         # Upload file and generate content
         gemini_file = self.client.files.upload(file=parsed_input)
+        try:
+            contents = []
+            if merged_instruction:
+                contents.append(merged_instruction)
+            contents.append(gemini_file)
 
-        contents = []
-        if merged_instruction:
-            contents.append(merged_instruction)
-        contents.append(gemini_file)
+            response: GenerateContentResponse = self.client.models.generate_content(
+                model=model, contents=contents
+            )
+        finally:
+            try:
+                self.client.files.delete(name=gemini_file.name)
+            except Exception as cleanup_error:
+                logger.warning(
+                    f"[GoogleAIProvider._execute_stt] Failed to cleanup uploaded file: {cleanup_error}"
+                )
 
-        response: GenerateContentResponse = self.client.models.generate_content(
-            model=model, contents=contents
-        )

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/app/services/llm/providers/registry.py`:
- Around line 92-94: Update the except block that calls logger.error (the one
logging "Failed to initialize {provider_type} client") to prefix the message
with the current function name in square brackets; import inspect if needed and
change the log call to something like
logger.error(f"[{inspect.currentframe().f_code.co_name}] Failed to initialize
{provider_type} client: {e}", exc_info=True) so the function name appears before
the message, keeping the existing exc_info and the subsequent raise RuntimeError
unchanged.
🧹 Nitpick comments (2)
backend/pyproject.toml (1)

39-39: google-genai>=1.59.0 is compatible with Python 3.12 and can be pinned with an upper bound for stability.

Version 1.59.0 exists and is compatible with Python 3.12 (requires Python >=3.10). Consider adding an upper bound to avoid unplanned breaking changes in future releases.

♻️ Optional upper-bound pin
-    "google-genai>=1.59.0",
+    "google-genai>=1.59.0,<2.0.0",
backend/app/services/llm/jobs.py (1)

223-227: Consider using model_copy() for forward-compatibility when reconstructing ConfigBlob.

While ConfigBlob currently contains only the completion field, using model_copy(update={"completion": completion_config}) would be more maintainable if additional fields are added in the future (the comments indicate classifier and pre_filter are planned). The current approach is functionally correct, but the suggested defensive pattern would ensure any future fields are preserved during the reconstruction.
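A one-line sketch of that defensive pattern (the local names config_blob and completion_config are assumed from the surrounding code):

# Copies every field of the existing blob and swaps only completion,
# so future fields (classifier, pre_filter, ...) survive the rebuild.
config_blob = config_blob.model_copy(update={"completion": completion_config})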

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@backend/app/alembic/versions/042_add_llm_call_table.py`:
- Around line 1-6: The migration docstring at the top of the file is out of sync
with the module constants: update the docstring header lines that currently read
"Revision ID: 041" and "Revises: 040" so they match the declared module-level
constants revision = "042" and down_revision = "041" (i.e., change the docstring
to Revision ID: 042 and Revises: 041) so the header and the revision constants
(revision / down_revision) are consistent.
- Around line 20-160: The function definition for upgrade is missing a return
type; update the def upgrade() signature to include an explicit return type (->
None) so it reads def upgrade() -> None: and ensure the function body remains
unchanged; apply the same pattern for any Alembic migration functions (e.g.,
downgrade) in this file if present to satisfy typing guidelines.

In `@backend/app/services/llm/providers/gai.py`:
- Around line 90-95: The file fails Black formatting around the string assembly
for forced_trascription_text and merged_instruction; run Black (or pre-commit)
to reformat the file so spacing/line breaks match Black style, and ensure the
variables forced_trascription_text, merged_instruction, instructions, and
lang_instruction remain unchanged semantically while fixing formatting
inconsistencies (e.g., string concatenation/spacing and line length) in the
function that builds merged_instruction.
- Around line 23-31: Update the constructor signature for __init__ in
backend/app/services/llm/providers/gai.py to include an explicit return type
annotation -> None (i.e., change "def __init__(self, client: genai.Client):" to
"def __init__(self, client: genai.Client) -> None:") and keep the body unchanged
(retain super().__init__(client) and self.client = client) so the initializer
conforms to the project's type-hinting guidelines.

Comment on lines +20 to +160
def upgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.create_table(
"llm_call",
sa.Column(
"id",
sa.Uuid(),
nullable=False,
comment="Unique identifier for the LLM call record",
),
sa.Column(
"job_id",
sa.Uuid(),
nullable=False,
comment="Reference to the parent job (status tracked in job table)",
),
sa.Column(
"project_id",
sa.Integer(),
nullable=False,
comment="Reference to the project this LLM call belongs to",
),
sa.Column(
"organization_id",
sa.Integer(),
nullable=False,
comment="Reference to the organization this LLM call belongs to",
),
sa.Column(
"input",
sqlmodel.sql.sqltypes.AutoString(),
nullable=False,
comment="User input - text string, binary data, or file path for multimodal",
),
sa.Column(
"input_type",
sa.String(),
nullable=False,
comment="Input type: text, audio, image",
),
sa.Column(
"output_type",
sa.String(),
nullable=True,
comment="Expected output type: text, audio, image",
),
sa.Column(
"provider",
sa.String(),
nullable=False,
comment="AI provider: openai, google, anthropic",
),
sa.Column(
"model",
sqlmodel.sql.sqltypes.AutoString(),
nullable=False,
comment="Specific model used e.g. 'gpt-4o', 'gemini-2.5-pro'",
),
sa.Column(
"provider_response_id",
sqlmodel.sql.sqltypes.AutoString(),
nullable=True,
comment="Original response ID from the provider (e.g., OpenAI's response ID)",
),
sa.Column(
"content",
postgresql.JSONB(astext_type=sa.Text()),
nullable=True,
comment="Response content: {text: '...'}, {audio_bytes: '...'}, or {image: '...'}",
),
sa.Column(
"usage",
postgresql.JSONB(astext_type=sa.Text()),
nullable=True,
comment="Token usage: {input_tokens, output_tokens, reasoning_tokens}",
),
sa.Column(
"conversation_id",
sqlmodel.sql.sqltypes.AutoString(),
nullable=True,
comment="Identifier linking this response to its conversation thread",
),
sa.Column(
"auto_create",
sa.Boolean(),
nullable=True,
comment="Whether to auto-create conversation if conversation_id doesn't exist (OpenAI specific)",
),
sa.Column(
"config",
postgresql.JSONB(astext_type=sa.Text()),
nullable=True,
comment="Configuration: {config_id, config_version} for stored config OR {config_blob} for ad-hoc config",
),
sa.Column(
"created_at",
sa.DateTime(),
nullable=False,
comment="Timestamp when the LLM call was created",
),
sa.Column(
"updated_at",
sa.DateTime(),
nullable=False,
comment="Timestamp when the LLM call was last updated",
),
sa.Column(
"deleted_at",
sa.DateTime(),
nullable=True,
comment="Timestamp when the record was soft-deleted",
),
sa.ForeignKeyConstraint(["job_id"], ["job.id"], ondelete="CASCADE"),
sa.ForeignKeyConstraint(
["organization_id"], ["organization.id"], ondelete="CASCADE"
),
sa.ForeignKeyConstraint(["project_id"], ["project.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(
"idx_llm_call_conversation_id",
"llm_call",
["conversation_id"],
unique=False,
postgresql_where=sa.text("conversation_id IS NOT NULL AND deleted_at IS NULL"),
)
op.create_index(
"idx_llm_call_job_id",
"llm_call",
["job_id"],
unique=False,
postgresql_where=sa.text("deleted_at IS NULL"),
)
op.alter_column(
"collection",
"llm_service_name",
existing_type=sa.VARCHAR(),
comment="Name of the LLM service",
existing_comment="Name of the LLM service provider",
existing_nullable=False,
)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check the current state of the upgrade function in the migration file
head -30 backend/app/alembic/versions/042_add_llm_call_table.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 718


🏁 Script executed:

# Check other migration files to see if they have type hints
find backend/app/alembic/versions -name "*.py" -type f | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && head -25 {}'

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 3370


🏁 Script executed:

# Check if the file has a downgrade() function as well
grep -n "def downgrade" backend/app/alembic/versions/042_add_llm_call_table.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 95


🏁 Script executed:

# Check the downgrade function and surrounding context
sed -n '164,170p' backend/app/alembic/versions/042_add_llm_call_table.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 316


Add return type hint to upgrade function.

Line 20 defines upgrade without a return type; add -> None to meet the typing requirement.

✏️ Suggested fix
-def upgrade():
+def upgrade() -> None:

Per coding guidelines, all **/*.py files must include type hints for function parameters and return values.


Comment on lines +23 to +31
def __init__(self, client: genai.Client):
"""Initialize Google AI provider with client.

Args:
client: Google AI client instance
"""
super().__init__(client)
self.client = client


⚠️ Potential issue | 🟡 Minor

Add missing -> None return annotation on __init__.

The constructor lacks a return type annotation.

🔧 Proposed fix
-    def __init__(self, client: genai.Client):
+    def __init__(self, client: genai.Client) -> None:

As per coding guidelines: **/*.py: Always add type hints to all function parameters and return values in Python code.


Comment on lines 90 to 95
forced_trascription_text="Only return transcribed text and no other text."
# Merge user instructions with language instructions
if instructions:
merged_instruction = f"{instructions}. {lang_instruction}. {forced_trascription_text}"
else:
merged_instruction = f"{lang_instruction}. {forced_trascription_text}"

⚠️ Potential issue | 🟠 Major

Resolve Black formatting failure in this file.

CI reports Black reformatting; for example, Line 90 is not Black-compliant. Please run Black (or pre-commit) to fix formatting.

🧹 Example fix (Black will handle all formatting)
-forced_trascription_text="Only return transcribed text and no other text."
+forced_trascription_text = "Only return transcribed text and no other text."

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@backend/app/alembic/versions/043_remove_enum_checks_llm_call_provider.py`:
- Around line 1-10: The migration docstring is misleading — it says "remove:enum
checks llm_call provider" while the migration only updates the column comment;
update the top docstring summary to accurately describe the change (e.g.,
"update llm_call provider column comment" or similar) and remove the unused
import sqlmodel.sql.sqltypes; ensure the revised docstring remains the
module-level triple-quoted string near the top and delete the unused import line
so only needed imports like from alembic import op and import sqlalchemy as sa
remain.

In `@backend/app/crud/llm.py`:
- Around line 143-146: The log call inside create_llm_call uses an undefined
name `provider`; change the logger.info to reference the actual variable
`original_provider` (the one used earlier in this scope) so the message uses
f"job_id={job_id}, provider={original_provider}, model={model}" and ensure
`original_provider` is in scope where db_llm_call.id is logged (a sketch follows this list).
- Around line 181-192: The update block for the LLM call does not refresh the
record's updated_at timestamp; in the function that modifies db_llm_call (the
code that sets provider_response_id, content, usage, conversation_id and then
calls session.add/commit/refresh), explicitly set db_llm_call.updated_at to the
current UTC timestamp (e.g., datetime.utcnow() or timezone-aware equivalent)
before session.add and session.commit so the audit field is updated on every
change.

In `@backend/app/services/llm/jobs.py`:
- Line 205: Remove the unused local variable assignment to
user_sent_config_provider in backend/app/services/llm/jobs.py (the line
"user_sent_config_provider = \"\""); delete that line so the unused variable is
not defined (no other changes needed unless you actually intend to use this
variable elsewhere, in which case replace the assignment with the proper usage).
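For the create_llm_call logging item above, a sketch of the corrected call; the surrounding message text is assumed, only the variable reference matters:

# Reference the variable that actually exists in this scope.
logger.info(
    f"[create_llm_call] Created llm_call {db_llm_call.id}: "
    f"job_id={job_id}, provider={original_provider}, model={model}"
)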
🧹 Nitpick comments (6)
backend/app/alembic/versions/043_remove_enum_checks_llm_call_provider.py (2)

20-30: Missing return type hint.

Per coding guidelines, all functions should have return type hints.

✏️ Proposed fix
-def upgrade():
+def upgrade() -> None:

33-43: Missing return type hint.

Per coding guidelines, all functions should have return type hints.

✏️ Proposed fix
-def downgrade():
+def downgrade() -> None:
backend/app/models/llm/request.py (2)

39-52: Consider making instructions optional for consistency.

STTLLMParams.instructions is required (str), while TextLLMParams.instructions is optional (str | None). If STT should allow empty instructions, consider making it optional with a default. Otherwise, document why STT requires instructions.

♻️ Proposed fix if optional is intended
 class STTLLMParams(SQLModel):
     model: str
-    instructions: str
+    instructions: str | None = None
     input_language: str | None = None

63-63: Consider using X | Y union syntax for type aliases.

Ruff (UP007) suggests using the modern union syntax. This is a minor style preference for Python 3.11+.

-KaapiLLMParams = Union[TextLLMParams, STTLLMParams, TTSLLMParams]
+KaapiLLMParams = TextLLMParams | STTLLMParams | TTSLLMParams
backend/app/crud/llm.py (1)

81-85: Redundant fallback for completion_type.

completion_config.type should always be defined (it's a required field in both NativeCompletionConfig and KaapiCompletionConfig). The getattr(completion_config.params, "type", "text") fallback is unreachable. Consider simplifying.

♻️ Proposed simplification
-    completion_type = completion_config.type or getattr(
-        completion_config.params, "type", "text"
-    )
+    completion_type = completion_config.type
backend/app/services/llm/jobs.py (1)

223-224: Remove unnecessary else: pass block.

This empty else branch serves no purpose.

                     request.request_metadata.setdefault("warnings", []).extend(warnings)
-                else:
-                    pass
             except Exception as e:

Comment on lines 1 to 10
"""remove:enum checks llm_call provider

Revision ID: 043
Revises: 042
Create Date: 2026-01-30 11:22:45.165543

"""
from alembic import op
import sqlalchemy as sa
import sqlmodel.sql.sqltypes

⚠️ Potential issue | 🟡 Minor

Migration description is misleading.

The docstring states "remove:enum checks llm_call provider" but this migration only updates the column's comment metadata. No actual enum constraints are removed, no type changes occur. Consider renaming to accurately describe the change (e.g., "update llm_call provider column comment").

Also, sqlmodel.sql.sqltypes is imported but unused.

🧹 Proposed fix
-"""remove:enum checks llm_call provider
+"""update llm_call provider column comment

 Revision ID: 043
 Revises: 042
 Create Date: 2026-01-30 11:22:45.165543

 """
 from alembic import op
 import sqlalchemy as sa
-import sqlmodel.sql.sqltypes

else:
config_blob = config.blob

user_sent_config_provider = ""

⚠️ Potential issue | 🟡 Minor

Remove unused variable assignment.

user_sent_config_provider is assigned but never used (Ruff F841).

-            user_sent_config_provider = ""
🧰 Tools
🪛 Ruff (0.14.14)

[error] 205-205: Local variable user_sent_config_provider is assigned to but never used

Remove assignment to unused variable user_sent_config_provider

(F841)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@backend/app/crud/llm.py`:
- Around line 81-99: The completion type derivation currently ignores a "type"
inside completion_config.params when params is a dict; update the logic that
sets completion_type (using resolved_config and completion_config) to check if
completion_config.params is a dict and, if so, use
completion_config.params.get("type") before falling back to
getattr(completion_config.params, "type", "text") or the default "text" so
STT/TTS are detected correctly and input_type/output_type (the variables
completion_type, input_type, output_type) are set appropriately.
- Around line 190-194: Replace the UTC-naive timestamp assignment that uses
datetime.datetime.now() by calling the shared now() helper used across CRUD
modules: update db_llm_call.updated_at = now() (the symbol to change is
db_llm_call.updated_at) so it matches the model defaults and other CRUD files;
ensure the existing imported now from app.core.util is used and remove the
datetime.datetime.now() usage in the llm CRUD update/commit flow (session.add,
session.commit, session.refresh remain unchanged).

In `@backend/app/services/llm/input_resolver.py`:
- Around line 66-85: The function resolve_audio_base64 currently decodes base64
permissively; update it to use strict validation by calling base64.b64decode
with validate=True (and handle binascii.Error or ValueError) so malformed input
is rejected, return the same error tuple shape with a clear message, and keep
the existing temp file writing logic (references: resolve_audio_base64,
get_file_extension, logger). Ensure the exception branch for decode returns "",
f"Invalid base64 audio data: {str(e)}" and that other exceptions while writing
still return the existing failure message.

In `@backend/app/services/llm/providers/gai.py`:
- Around line 114-116: The warning in GoogleAIProvider._execute_stt uses an
unnecessary f-string prefix for a static message; change the logger.warning call
to a plain string (remove the leading f from the quoted message) so the log call
is not an f-string without placeholders and the message remains identical.
🧹 Nitpick comments (2)
backend/app/crud/llm.py (1)

29-55: size_bytes is misleading for base64 strings.

len(query_input.data) measures base64 character count, not decoded byte length. Consider renaming the field to reflect actual semantics (or compute real byte length if that’s required downstream).

✏️ Suggested rename
-                "size_bytes": len(query_input.data),
+                "size_chars": len(query_input.data),
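If the decoded size is what downstream consumers actually need, it can be derived from the base64 text without decoding; a sketch assuming standard padded base64 with no embedded whitespace:

# Every 4 base64 chars encode 3 bytes; '=' padding subtracts the slack.
encoded = query_input.data
size_bytes = (len(encoded) * 3) // 4 - encoded.count("=")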
backend/app/tests/services/llm/test_mappers.py (1)

234-236: Introduce factory fixtures for param objects to reduce duplication.

The TestMapKaapiToGoogleParams class instantiates TextLLMParams/STTLLMParams six times with the common model="gemini-2.5-pro" default. A factory-style pytest fixture would centralize this default and eliminate boilerplate while supporting test-specific parameter overrides. This aligns with the test-fixture guideline.

Example factory fixture
+from collections.abc import Callable
+
+@pytest.fixture
+def text_params_factory() -> Callable[..., TextLLMParams]:
+    def _factory(**overrides: object) -> TextLLMParams:
+        return TextLLMParams(model="gemini-2.5-pro", **overrides)
+
+    return _factory
+
+@pytest.fixture
+def stt_params_factory() -> Callable[..., STTLLMParams]:
+    def _factory(**overrides: object) -> STTLLMParams:
+        return STTLLMParams(model="gemini-2.5-pro", **overrides)
+
+    return _factory

Per coding guidelines: Use factory pattern for test fixtures in backend/app/tests/.

Comment on lines +81 to +99
# Determine input/output types based on completion config type
completion_config = resolved_config.completion
completion_type = completion_config.type or getattr(
completion_config.params, "type", "text"
)

input_type: Literal["text", "audio", "image"]
output_type: Literal["text", "audio", "image"] | None

if completion_type == "stt":
input_type = "audio"
output_type = "text"
elif completion_type == "tts":
input_type = "text"
output_type = "audio"
else:
input_type = "text"
output_type = "text"


⚠️ Potential issue | 🟠 Major

Fix completion type derivation when params is a dict.

If completion_config.type is missing and params is a dict with "type", the current logic falls back to "text" and ignores the dict value. That can misclassify STT/TTS and store incorrect input/output types.

🐛 Proposed fix
-    completion_type = completion_config.type or getattr(
-        completion_config.params, "type", "text"
-    )
+    completion_type = completion_config.type
+    if not completion_type:
+        if isinstance(completion_config.params, dict):
+            completion_type = completion_config.params.get("type", "text")
+        else:
+            completion_type = getattr(completion_config.params, "type", "text")

Comment on lines +190 to +194
db_llm_call.updated_at = datetime.datetime.now()

session.add(db_llm_call)
session.commit()
session.refresh(db_llm_call)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the target file and check the imports
head -30 backend/app/crud/llm.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 755


🏁 Script executed:

# Check lines around 190-194 to see the context
sed -n '180,200p' backend/app/crud/llm.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 642


🏁 Script executed:

# Find where now() is defined
fd -e py --type f | xargs rg -l "^def now\(|^async def now\(" | head -20

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 99


🏁 Script executed:

# Check the imports at the top of llm.py more comprehensively
head -50 backend/app/crud/llm.py

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1412


🏁 Script executed:

# Search for usage of now() vs datetime.datetime.now() in CRUD files
rg "\.now\(\)" backend/app/crud/ -A 1 -B 1

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 201


🏁 Script executed:

# Check if now() is imported from util in other CRUD files
rg "from.*util.*import.*now|from.*util.*import \*" backend/app/crud/

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1379


Use now() for UTC-naive consistency with model defaults.

The file already imports now from app.core.util, and all other CRUD files use this function for setting timestamps. The code is currently the only exception, using datetime.datetime.now() instead.

🔧 Proposed fix
-    db_llm_call.updated_at = datetime.datetime.now()
+    db_llm_call.updated_at = now()

Comment on lines +66 to +85
def resolve_audio_base64(data: str, mime_type: str) -> tuple[str, str | None]:
"""Decode base64 audio and write to temp file. Returns (file_path, error)."""
try:
audio_bytes = base64.b64decode(data)
except Exception as e:
return "", f"Invalid base64 audio data: {str(e)}"

ext = get_file_extension(mime_type)
try:
with tempfile.NamedTemporaryFile(
suffix=ext, delete=False, prefix="audio_"
) as tmp:
tmp.write(audio_bytes)
temp_path = tmp.name

logger.info(f"[resolve_audio_base64] Wrote audio to temp file: {temp_path}")
return temp_path, None
except Exception as e:
return "", f"Failed to write audio to temp file: {str(e)}"


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Python base64.b64decode validate parameter

💡 Result:

base64.b64decode(s, altchars=None, validate=False) has a validate flag that controls how strictly the input is checked:

  • validate=False (default): any characters not in the Base64 alphabet (or the altchars alphabet) are silently discarded before padding is checked. Padding problems still raise binascii.Error. [1]
  • validate=True: any non‑alphabet characters in the input cause an immediate binascii.Error (i.e., “strict” Base64). Padding problems also raise binascii.Error. [1]

If you want strict checking, use:

import base64
decoded = base64.b64decode(data, validate=True)

Refs: Python base64 docs; see also the note pointing to binascii.a2b_base64() for strict checking behavior details. [2]

Sources: [1] [2]


🏁 Script executed:

# Check the actual file in the repository
find . -type f -name "input_resolver.py" | head -5

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 119


🏁 Script executed:

# Read the actual file to verify the code snippet
cat -n ./backend/app/services/llm/input_resolver.py | sed -n '66,85p'

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 958


🏁 Script executed:

# Check the full context and imports in the file
cat -n ./backend/app/services/llm/input_resolver.py | head -100

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 3800


Add strict base64 validation to reject malformed input.

base64.b64decode() without validate=True silently discards non-base64 characters before padding validation, which can produce unintended output from corrupted data. Using strict validation ensures only valid base64 is accepted.

Suggested change
-        audio_bytes = base64.b64decode(data)
+        audio_bytes = base64.b64decode(data, validate=True)
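A slightly fuller sketch with the explicit error types (binascii.Error is a ValueError subclass, so catching ValueError alone would also work):

import base64
import binascii

try:
    audio_bytes = base64.b64decode(data, validate=True)
except (binascii.Error, ValueError) as e:
    # Strict mode rejects any non-alphabet character instead of discarding it.
    return "", f"Invalid base64 audio data: {e}"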

Comment on lines +114 to +116
logger.warning(
f"[GoogleAIProvider._execute_stt] Response missing usage_metadata, using zeros"
)

⚠️ Potential issue | 🟡 Minor

Remove unused f-string prefix.

Static analysis reports an f-string without placeholders.

Proposed fix
-            logger.warning(
-                f"[GoogleAIProvider._execute_stt] Response missing usage_metadata, using zeros"
-            )
+            logger.warning(
+                "[GoogleAIProvider._execute_stt] Response missing usage_metadata, using zeros"
+            )
🧰 Tools
🪛 Ruff (0.14.14)

[error] 115-115: f-string without any placeholders

Remove extraneous f prefix

(F541)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/app/tests/api/routes/test_evaluation.py (2)

497-497: ⚠️ Potential issue | 🟡 Minor

Add type hints to function parameters and return value.

This test method is missing type hints on client and valid_csv_content parameters, and lacks a return type annotation.

Proposed fix
-    def test_upload_without_authentication(self, client, valid_csv_content):
+    def test_upload_without_authentication(self, client: TestClient, valid_csv_content: str) -> None:

As per coding guidelines: "Always add type hints to all function parameters and return values in Python code"


580-582: ⚠️ Potential issue | 🟡 Minor

Add type hints to function parameters and return value.

This test method is missing type hints on client and sample_evaluation_config parameters, and lacks a return type annotation.

Proposed fix
-    def test_start_batch_evaluation_without_authentication(
-        self, client, sample_evaluation_config
-    ):
+    def test_start_batch_evaluation_without_authentication(
+        self, client: TestClient, sample_evaluation_config: dict[str, Any]
+    ) -> None:

As per coding guidelines: "Always add type hints to all function parameters and return values in Python code"

🤖 Fix all issues with AI agents
In `@backend/app/tests/services/llm/providers/test_gai.py`:
- Around line 83-249: The test functions in test_gai.py (e.g.,
test_stt_success_with_auto_language, test_stt_with_specific_input_language,
test_stt_with_translation, test_stt_with_custom_instructions,
test_stt_with_include_provider_raw_response, test_stt_missing_model_parameter,
test_stt_with_type_error, test_stt_with_generic_exception,
test_stt_with_invalid_input_type, test_stt_with_valid_file_path) lack explicit
type hints; update each function signature to annotate all parameters (e.g.,
provider, mock_client, stt_config, query_params, invalid_input) with appropriate
types or typing.Any and add a return type -> None for each test method to comply
with the project's typing guideline. Ensure imports for typing.Any or specific
test fixture types are added if needed and keep names unchanged so pytest
discovery still works (see the annotation sketch after this list).
- Around line 52-81: Convert the fixtures mock_client, provider, stt_config, and
query_params to factory-pattern fixtures that return callables (e.g., def
mock_client() -> Callable[[], MagicMock]) which produce configured instances on
demand; add proper type hints for each fixture and the callable return types
(use MagicMock for the mocked client, GoogleAIProvider for provider factory,
NativeCompletionConfig for stt_config factory, and QueryParams for query_params
factory), and update usages in tests to call the returned factory to get
instances rather than relying on shared fixture objects; ensure the factories
still set mock_file.name and client.files.upload.return_value as before and
preserve the provider creation via GoogleAIProvider(client=...) while typing the
parameters and return values.
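For the first item above, a sketch of the annotation pattern applied to one test; the fixture types are inferred from this module's fixtures:

from unittest.mock import MagicMock


def test_stt_success_with_auto_language(
    self,
    provider: GoogleAIProvider,
    mock_client: MagicMock,
    stt_config: NativeCompletionConfig,
    query_params: QueryParams,
) -> None:
    ...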
🧹 Nitpick comments (2)
backend/app/crud/evaluations/core.py (1)

445-450: Add defensive checks for config.completion and config.completion.params before dict access.

The code assumes config.completion and config.completion.params are non-None. If the resolved config has a None completion or params, line 446 will raise an AttributeError rather than a clear ValueError, making debugging harder.

🛡️ Proposed fix to add defensive checks
+    if config.completion is None:
+        raise ValueError(
+            f"Config for evaluation {eval_run.id} has no completion block"
+        )
+    if config.completion.params is None:
+        raise ValueError(
+            f"Config for evaluation {eval_run.id} has no completion params"
+        )
     # params is a dict, not a Pydantic model, so use dict access
     model = config.completion.params.get("model")
     if not model:
         raise ValueError(
             f"Config for evaluation {eval_run.id} does not contain a 'model' parameter"
         )
backend/app/tests/api/routes/test_evaluation.py (1)

735-748: Fixture defined inside class without self is unconventional.

The @pytest.fixture without self inside a test class is functional (pytest handles parameter injection), but it's an uncommon pattern that can cause confusion. Consider either keeping self as the first parameter or moving the fixture to module level if it's shared across multiple test classes.

This same pattern applies to the identical fixtures at lines 1078-1090 and 1138-1150.

Comment on lines +52 to +81
@pytest.fixture
def mock_client(self):
"""Create a mock Google AI client."""
client = MagicMock()
# Mock file upload
mock_file = MagicMock()
mock_file.name = "test_audio.wav"
client.files.upload.return_value = mock_file
return client

@pytest.fixture
def provider(self, mock_client):
"""Create a GoogleAIProvider instance with mock client."""
return GoogleAIProvider(client=mock_client)

@pytest.fixture
def stt_config(self):
"""Create a basic STT completion config."""
return NativeCompletionConfig(
provider="google-native",
type="stt",
params={
"model": "gemini-2.5-pro",
},
)

@pytest.fixture
def query_params(self):
"""Create basic query parameters."""
return QueryParams(input="Test audio input")

🛠️ Refactor suggestion | 🟠 Major

Convert fixtures to factory-pattern and add type hints.

All fixtures in backend/app/tests/** should use a factory pattern, and their params/returns need type hints.

♻️ Example factory fixture (apply pattern to all fixtures in this module)
 @pytest.fixture
-def stt_config(self):
-    """Create a basic STT completion config."""
-    return NativeCompletionConfig(
-        provider="google-native",
-        type="stt",
-        params={
-            "model": "gemini-2.5-pro",
-        },
-    )
+def stt_config() -> callable:
+    """Factory for STT completion configs."""
+    def _factory(**overrides) -> NativeCompletionConfig:
+        base = {
+            "provider": "google-native",
+            "type": "stt",
+            "params": {"model": "gemini-2.5-pro"},
+        }
+        base.update(overrides)
+        return NativeCompletionConfig(**base)
+    return _factory

As per coding guidelines, “backend/app/tests/**/*.py: Use factory pattern for test fixtures in backend/app/tests/”.


Comment on lines +83 to +249
    def test_stt_success_with_auto_language(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test successful STT execution with auto language detection."""
        mock_response = mock_google_response(text="Hello world")
        mock_client.models.generate_content.return_value = mock_response

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert error is None
        assert result is not None
        assert result.response.output.text == "Hello world"
        assert result.response.model == "gemini-2.5-pro"
        assert result.response.provider == "google-native"
        assert result.usage.input_tokens == 50
        assert result.usage.output_tokens == 100
        assert result.usage.total_tokens == 150

        # Verify file upload and content generation
        mock_client.files.upload.assert_called_once_with(file="/path/to/audio.wav")
        mock_client.models.generate_content.assert_called_once()

        # Verify instruction contains auto-detect
        call_args = mock_client.models.generate_content.call_args
        assert "Detect the spoken language automatically" in call_args[1]["contents"][0]

    def test_stt_with_specific_input_language(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test STT with specific input language."""
        stt_config.params["input_language"] = "English"

        mock_response = mock_google_response(text="Transcribed English text")
        mock_client.models.generate_content.return_value = mock_response

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert error is None
        assert result is not None

        # Verify instruction contains specific language
        call_args = mock_client.models.generate_content.call_args
        assert "Transcribe the audio from English" in call_args[1]["contents"][0]

    def test_stt_with_translation(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test STT with translation to different output language."""
        stt_config.params["input_language"] = "Spanish"
        stt_config.params["output_language"] = "English"

        mock_response = mock_google_response(text="Translated text")
        mock_client.models.generate_content.return_value = mock_response

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert error is None
        assert result is not None

        # Verify instruction contains translation
        call_args = mock_client.models.generate_content.call_args
        instruction = call_args[1]["contents"][0]
        assert "Transcribe the audio from Spanish" in instruction
        assert "translate to English" in instruction

    def test_stt_with_custom_instructions(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test STT with custom instructions."""
        stt_config.params["instructions"] = "Include timestamps"

        mock_response = mock_google_response(text="Transcribed with timestamps")
        mock_client.models.generate_content.return_value = mock_response

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert error is None
        assert result is not None

        # Verify custom instructions are included
        call_args = mock_client.models.generate_content.call_args
        instruction = call_args[1]["contents"][0]
        assert "Include timestamps" in instruction

    def test_stt_with_include_provider_raw_response(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test STT with include_provider_raw_response=True."""
        mock_response = mock_google_response(text="Raw response test")
        mock_client.models.generate_content.return_value = mock_response

        result, error = provider.execute(
            stt_config,
            query_params,
            "/path/to/audio.wav",
            include_provider_raw_response=True,
        )

        assert error is None
        assert result is not None
        assert result.provider_raw_response is not None
        assert isinstance(result.provider_raw_response, dict)
        assert result.provider_raw_response["text"] == "Raw response test"

    def test_stt_missing_model_parameter(self, provider, mock_client, query_params):
        """Test error handling when model parameter is missing."""
        stt_config = NativeCompletionConfig(
            provider="google-native",
            type="stt",
            params={},  # Missing model
        )

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert result is None
        assert error is not None
        assert "Missing 'model' in native params" in error

    def test_stt_with_type_error(self, provider, mock_client, stt_config, query_params):
        """Test handling of TypeError (invalid parameters)."""
        mock_client.models.generate_content.side_effect = TypeError(
            "unexpected keyword argument 'invalid_param'"
        )

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert result is None
        assert error is not None
        assert "Invalid or unexpected parameter in Config" in error

    def test_stt_with_generic_exception(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test handling of unexpected exceptions."""
        mock_client.files.upload.side_effect = Exception("File upload failed")

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert result is None
        assert error is not None
        assert "Unexpected error occurred" in error

    def test_stt_with_invalid_input_type(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test STT execution with invalid input type (non-string)."""
        # Pass a dict instead of a string path
        invalid_input = {"invalid": "data"}

        result, error = provider.execute(stt_config, query_params, invalid_input)

        assert result is None
        assert error is not None
        assert "STT requires file path as string" in error

    def test_stt_with_valid_file_path(
        self, provider, mock_client, stt_config, query_params
    ):
        """Test STT execution with valid file path string."""
        mock_response = mock_google_response(text="Valid transcription")
        mock_client.models.generate_content.return_value = mock_response

        result, error = provider.execute(stt_config, query_params, "/path/to/audio.wav")

        assert error is None
        assert result is not None
        assert result.response.output.text == "Valid transcription"

⚠️ Potential issue | 🟡 Minor

Add type hints to test method parameters and return types.

All test methods should declare parameter and return types (-> None).

✅ Example (apply to all test methods)
-    def test_stt_success_with_auto_language(
-        self, provider, mock_client, stt_config, query_params
-    ):
+    def test_stt_success_with_auto_language(
+        self,
+        provider: GoogleAIProvider,
+        mock_client: MagicMock,
+        stt_config: NativeCompletionConfig,
+        query_params: QueryParams,
+    ) -> None:

As per coding guidelines, “**/*.py: Always add type hints to all function parameters and return values in Python code”.
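For these annotations to resolve, the module also needs the corresponding imports. A sketch; the app-side paths below are assumptions inferred from the test file's location, not verified against the repo:

    from typing import Any, Callable
    from unittest.mock import MagicMock

    # Hypothetical import paths -- adjust to where these actually live under backend/app
    from app.services.llm.providers.google import GoogleAIProvider
    from app.services.llm.schemas import NativeCompletionConfig, QueryParams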

🤖 Prompt for AI Agents
In `@backend/app/tests/services/llm/providers/test_gai.py` around lines 83-249:
the test functions (e.g., test_stt_success_with_auto_language,
test_stt_with_specific_input_language, test_stt_with_translation,
test_stt_with_custom_instructions, test_stt_with_include_provider_raw_response,
test_stt_missing_model_parameter, test_stt_with_type_error,
test_stt_with_generic_exception, test_stt_with_invalid_input_type,
test_stt_with_valid_file_path) lack explicit type hints; update each function
signature to annotate all parameters (e.g., provider, mock_client, stt_config,
query_params, invalid_input) with appropriate types or typing.Any and add a
return type -> None for each test method to comply with the project's typing
guideline. Ensure imports for typing.Any or specific test fixture types are
added if needed and keep names unchanged so pytest discovery still works.
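Combining this with the factory-fixture suggestion above, the stt_config parameter would be typed as a callable rather than a config instance. A sketch under the same assumptions:

    def test_stt_success_with_auto_language(
        self,
        provider: GoogleAIProvider,
        mock_client: MagicMock,
        stt_config: Callable[..., NativeCompletionConfig],  # factory, not an instance
        query_params: QueryParams,
    ) -> None:
        config = stt_config()  # build the default STT config on demand
        ...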


Labels

enhancement (New feature or request) · ready-for-review

Projects

Status: In Progress
