This module implements the AI engine abstraction layer — the core of the multi-AI system. Here's a thorough review.
Architecture
AIEngine (ABC)
├── GeminiEngine — Google genai SDK
├── OpenAIEngine — OpenAI-compatible (GPT, Grok, Local/Ollama)
└── ClaudeEngine — Anthropic SDK
All three share a common pattern:
call() → trim history → build messages → loop with auto-continue → update history → return
Key Design Pattern: Auto-Continue
Every engine implements a truncation recovery loop — if the model hits its output token limit, the engine automatically re-prompts to continue. This is a genuinely useful feature, but the implementation has issues.
Critical Issues
1. Duplicate initialize_engines() — config.py vs engines.py
There are two initialize_engines() functions: one in config.py and one in engines.py. They do the same thing differently:
| Aspect | config.py version | engines.py version |
|---|---|---|
| Gemini SDK | google.generativeai (old) | google.genai (new) |
| GeminiEngine args | 2 args (no client) | 3 args (with client) |
| max_tokens_key | Not passed for grok/local | Passed per-engine |
This is a conflict. Whichever runs last wins, but the config.py version would crash because GeminiEngine.__init__ requires 3 arguments (name, model_name, client), yet config.py only passes 2.
2. Gemini system prompt sent as "role": "model"
```python
if self.system_prompt:
    contents.append({"role": "model", "parts": [{"text": self.system_prompt}]})
```
This pretends the system prompt is a model response, which is semantically wrong. The Gemini API supports a dedicated system_instruction parameter in the generation config. This hack can cause the model to behave unpredictably.
Fix:
```python
response = self.client.models.generate_content(
    model=self.model_name,
    contents=contents,  # without system prompt
    config={
        "max_output_tokens": self.max_output_tokens,
        "system_instruction": self.system_prompt or None,
    },
)
```
3. GeminiEngine error handling doesn't wrap in AIError
```python
# GeminiEngine.call()
except Exception as e:
    logger.error(f"Gemini Error: {e}")
    raise  # ← raw exception
```
```python
# OpenAIEngine.call()
except Exception as e:
    logger.error(f"{self.name} API Error: {e}")
    raise AIError(f"{self.name} error: {e}")  # ← wrapped
```
This inconsistency means callers can't uniformly catch AIError for all engines.
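A small wrapper makes Gemini consistent with the other engines. A minimal sketch; AIError is restated here so the example is self-contained (the real class lives in the project), and the helper name is illustrative:

```python
# Sketch: wrap raw Gemini exceptions in AIError so callers can catch one
# exception type for every engine.
class AIError(Exception):
    """Uniform error type raised by all engines."""

def wrapped_call(raw_call, engine_name: str = "Gemini"):
    try:
        return raw_call()
    except AIError:
        raise  # already wrapped by a lower layer
    except Exception as e:
        # Chain the original exception so the traceback is preserved.
        raise AIError(f"{engine_name} error: {e}") from e
```

Re-raising AIError untouched avoids double-wrapping when a lower layer already produced one.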
4. Gemini auto-continue heuristic is fragile
````python
if answer_chunk.count("```") % 2 == 1:
    return True
if answer_chunk.rstrip().endswith((",", ":", "(", "[", "{")):
    return True
````
These heuristics check answer_chunk (the latest chunk), not full_answer. On round 2+, this only examines the continuation fragment, not the accumulated text. An odd number of triple-backticks in a chunk doesn't necessarily mean truncation — the opening backtick could be in a previous chunk.
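A hedged fix sketch: run the same two heuristics over the accumulated answer instead of the latest chunk (the helper name looks_truncated is illustrative, not from the codebase):

````python
# Sketch: evaluate truncation signals on full_answer, so backtick parity
# is counted across all auto-continue rounds rather than per chunk.
def looks_truncated(full_answer: str) -> bool:
    # An odd number of ``` fences in the whole answer means an unclosed
    # code block, which is only a reliable signal when counted globally.
    if full_answer.count("```") % 2 == 1:
        return True
    # Text that stops mid-expression is the other signal from the review.
    if full_answer.rstrip().endswith((",", ":", "(", "[", "{")):
        return True
    return False
````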
5. _trim_history is called twice per call()
```python
def call(self, prompt: str) -> str:
    self._trim_history()  # ← before building messages
    # ... API call ...
    self.history.append(...)
    self.history.append(...)
    self._trim_history()  # ← after appending
```
The first trim is redundant since history is always trimmed at the end of each call. Even when scrub() or load_persona() are interleaved, the first trim is still unnecessary, since those methods reset history entirely.
Moderate Issues
6. Decaying tail_chars only in Gemini
```python
# GeminiEngine only:
eff_tail_chars = max(300, int(tail_chars * (0.8 ** (round_idx - 1))))
```
OpenAI and Claude use a fixed tail_chars across rounds. Either document this inconsistency as intentional or unify it.
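One way to unify it, as a sketch: a single shared helper that all three engines call. The helper name and keyword defaults are illustrative; the 0.8 factor and 300-character floor are the values from the Gemini snippet above.

```python
# Sketch: one shared decay rule so every engine treats tail_chars the
# same way across auto-continue rounds.
def effective_tail_chars(tail_chars: int, round_idx: int,
                         decay: float = 0.8, floor: int = 300) -> int:
    # Round 1 uses tail_chars unchanged; later rounds shrink
    # geometrically but never drop below the floor.
    return max(floor, int(tail_chars * decay ** (round_idx - 1)))
```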
7. No timeout configuration
None of the API calls specify timeouts. A hung API call will block indefinitely. All three SDKs support timeout parameters.
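All three clients accept a client-level timeout at construction time (openai.OpenAI(timeout=...), anthropic.Anthropic(timeout=...), and google-genai's HttpOptions(timeout=...); verify the exact parameter against the SDK versions you pin). As an SDK-agnostic fallback, any blocking call can also be bounded with a stdlib thread pool; a sketch:

```python
# Sketch: bound a blocking SDK call with a thread-based timeout. The
# hung worker is not killed; this only unblocks the caller so it can
# fail fast or retry.
import concurrent.futures

# One shared pool: shutting a pool down per call would wait on the hung
# worker and defeat the timeout.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def call_with_timeout(fn, *args, timeout_s: float = 60.0, **kwargs):
    """Run fn in a worker thread; raise TimeoutError if it overruns."""
    future = _pool.submit(fn, *args, **kwargs)
    return future.result(timeout=timeout_s)
```

Client-level timeouts remain preferable where available, since the SDK can actually abort the underlying HTTP request.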
8. Claude passes empty string for system when no persona is set
```python
system=self.system_prompt if self.system_prompt else "",
```
The Anthropic API accepts system as optional. Passing an empty string is wasteful at best and could subtly influence behavior. Better:
```python
kwargs = {"model": self.model_name, "max_tokens": self.max_tokens, "messages": messages}
if self.system_prompt:
    kwargs["system"] = self.system_prompt
response = self.client.messages.create(**kwargs)
```
Suggested Refactoring: Extract the Auto-Continue Loop
All three engines duplicate the same loop structure. This can be extracted:
```python
class AIEngine(ABC):
    def call(self, prompt: str) -> str:
        self._trim_history()
        messages = self._build_messages(prompt)
        full_answer = self._auto_continue_loop(messages)
        self.history.append({"role": "user", "content": prompt})
        self.history.append({"role": "assistant", "content": full_answer})
        self._trim_history()
        return full_answer

    def _auto_continue_loop(self, messages) -> str:
        full_answer = ""
        for round_idx in range(1, self._max_rounds + 1):
            chunk, truncated = self._single_call(messages)
            full_answer += chunk
            if not truncated:
                break
            messages = self._append_continue(messages, chunk, full_answer, round_idx)
        else:
            full_answer += "\n\n[TRUNCATED: auto-continue limit reached]\n"
        return full_answer

    @abstractmethod
    def _build_messages(self, prompt: str) -> list: ...

    @abstractmethod
    def _single_call(self, messages) -> tuple[str, bool]: ...

    @abstractmethod
    def _append_continue(self, messages, chunk, full, round_idx) -> list: ...
```
This would eliminate roughly 60% of the duplicated code across the three engine classes.
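To illustrate, a minimal sketch of a subclass driving the extracted loop. FakeEngine and its canned chunks are invented for the demo; the loop body is restated from the proposal so the example is self-contained.

```python
from abc import ABC, abstractmethod

class AIEngine(ABC):
    _max_rounds = 5  # illustrative cap on auto-continue rounds

    def _auto_continue_loop(self, messages) -> str:
        full_answer = ""
        for round_idx in range(1, self._max_rounds + 1):
            chunk, truncated = self._single_call(messages)
            full_answer += chunk
            if not truncated:
                break
            messages = self._append_continue(messages, chunk, full_answer, round_idx)
        else:
            full_answer += "\n\n[TRUNCATED: auto-continue limit reached]\n"
        return full_answer

    @abstractmethod
    def _single_call(self, messages) -> tuple[str, bool]: ...

    @abstractmethod
    def _append_continue(self, messages, chunk, full, round_idx) -> list: ...

class FakeEngine(AIEngine):
    """Replies in two chunks to simulate one truncated round."""

    def __init__(self):
        self._chunks = [("Hello, ", True), ("world!", False)]

    def _single_call(self, messages):
        return self._chunks.pop(0)

    def _append_continue(self, messages, chunk, full, round_idx):
        return messages + [{"role": "user", "content": "Continue."}]
```

A subclass only supplies the transport-specific hooks; the loop, history handling, and truncation banner live once in the base class.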
Summary
| Severity | Issue | Location |
|---|---|---|
| 🔴 Critical | Duplicate & conflicting initialize_engines() | config.py vs engines.py |
| 🔴 Critical | System prompt as model role in Gemini | GeminiEngine.call() |
| 🟠 High | Inconsistent error wrapping (AIError) | GeminiEngine.call() |
| 🟠 High | Fragile truncation heuristics on chunk, not full | _hit_output_limit() |
| 🟡 Medium | No API timeouts | All engines |
| 🟡 Medium | Duplicated auto-continue loop | All three call() methods |
| 🟢 Low | Double _trim_history() calls | All engines |
| 🟢 Low | Empty string system prompt to Claude | ClaudeEngine.call() |