Merged
17 changes: 17 additions & 0 deletions .github/workflows/ci.yml
@@ -43,3 +43,20 @@ jobs:
      - name: Test
        run: go test ./...

  bootstrap-integration:
    runs-on: ubuntu-latest
Comment on lines +46 to +47
Add `timeout-minutes` to prevent CI hangs on stuck tests

The `bootstrap-integration` job runs 50+ tests with filesystem and SQLite operations but has no job-level `timeout-minutes` guard. If any test deadlocks or blocks indefinitely, the job will consume runner time until GitHub's 6-hour timeout, blocking other CI runs and wasting minutes.

Suggested change
-  bootstrap-integration:
-    runs-on: ubuntu-latest
+  bootstrap-integration:
+    runs-on: ubuntu-latest
+    timeout-minutes: 10

The `-failfast` flag in the test command (line 58) handles fast failure within the binary, but cannot interrupt a process that has already hung.

Owner Author

Fixed in 8255ebe — added timeout-minutes: 10.

    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-go@v5
        with:
          go-version: '1.24.x'
          cache: true

      - name: Bootstrap regression tests
        run: |
          go test -v -count=1 -failfast \
            -run "TestBootstrapRepro|TestClaudeDriftDetection|TestKnowledgeLinking_NoFK|TestSubrepoMetadata|TestDocIngestion|TestRootPathResolution|TestIsMonorepoDetection|TestWorkspaceRootSelection|TestGate3_Enforcement|TestParseJSONResponse_Hallucination" \
            ./...

7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -28,6 +28,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Fixed

- **RootPath resolution**: Reject `MarkerNone` contexts in `GetMemoryBasePath` to prevent accidental writes to `~/.taskwing/memory.db`. Also reject `.taskwing` markers above multi-repo workspaces during detection walk-up. (`TestRootPathResolution`, `TestBootstrapRepro_RootPathResolvesToHome`)
- **FK constraint failures**: `LinkNodes` now pre-checks node existence before INSERT to avoid SQLite error 787. Duplicate edges handled gracefully. (`TestKnowledgeLinking_NoFK`)
- **IsMonorepo misclassification**: `Detect()` now checks `hasNestedProjects()` in the `MarkerNone` fallback, so multi-repo workspaces are correctly classified. Resolves disagreement between `Detect()` and `DetectWorkspace()`. (`TestIsMonorepoDetection`, `TestBootstrapRepro_IsMonorepoMisclassification`)
- **Zero docs loaded**: Added `LoadForServices` to `DocLoader` for multi-repo workspaces. Wired into `RunDeterministicBootstrap` via workspace auto-detection. (`TestDocIngestion`, `TestSubrepoMetadataExtraction`)
- **Sub-repo metadata**: Verified per-repo workspace context in node storage with proper isolation and cross-workspace linking. (`TestSubrepoMetadataPresent`)
- **Claude MCP drift**: Added filesystem-based drift detection tests with evidence traceability and Gate 3 consent enforcement for global mutations. (`TestClaudeDriftDetection`)
- **Hallucinated findings**: Gate 3 enforcement in `NewFindingWithEvidence` — findings without evidence start as "skipped". Added `HasEvidence()` and `NeedsHumanVerification()` to `Finding`. (`TestGate3_Enforcement`, `TestParseJSONResponse_Hallucination`)
- Priority scheduling semantics corrected (lower numeric priority executes first).
- Unknown slash subcommands now fail explicitly instead of silently falling back.
- MCP plan action descriptions aligned with implemented behavior.
26 changes: 12 additions & 14 deletions README.md
@@ -99,14 +99,14 @@ taskwing goal "Add Stripe billing"
 <!-- TASKWING_TOOLS_END -->

 <!-- TASKWING_LEGAL_START -->
-<sub>Brand names and logos are trademarks of their respective owners; usage here indicates compatibility, not endorsement.</sub>
+Brand names and logos are trademarks of their respective owners; usage here indicates compatibility, not endorsement.
 <!-- TASKWING_LEGAL_END -->

 ## MCP Tools

 <!-- TASKWING_MCP_TOOLS_START -->
 | Tool | Description |
-|:-----|:------------|
+|------|-------------|
 | `ask` | Search project knowledge (decisions, patterns, constraints) |
 | `task` | Unified task lifecycle (`next`, `current`, `start`, `complete`) |
 | `plan` | Plan management (`clarify`, `decompose`, `expand`, `generate`, `finalize`, `audit`) |
@@ -149,18 +149,16 @@ Once connected, use these slash commands from your AI assistant:
 ## Core Commands

 <!-- TASKWING_COMMANDS_START -->
-| Command | Description |
-|:--------|:------------|
-| `taskwing bootstrap` | Extract architecture from your codebase |
-| `taskwing goal "<goal>"` | Create and activate a plan from a goal |
-| `taskwing ask "<query>"` | Query project knowledge |
-| `taskwing task` | Manage execution tasks |
-| `taskwing plan status` | View current plan progress |
-| `taskwing slash` | Output slash command prompts for AI tools |
-| `taskwing mcp` | Start the MCP server |
-| `taskwing doctor` | Health check for project memory |
-| `taskwing config` | Configure LLM provider and settings |
-| `taskwing start` | Start API/watch/dashboard services |
+- `taskwing bootstrap`
+- `taskwing goal "<goal>"`
+- `taskwing ask "<query>"`
+- `taskwing task`
+- `taskwing plan status`
+- `taskwing slash`
+- `taskwing mcp`
+- `taskwing doctor`
+- `taskwing config`
+- `taskwing start`
 <!-- TASKWING_COMMANDS_END -->

 ## Autonomous Task Execution (Hooks)
4 changes: 3 additions & 1 deletion cmd/bootstrap.go
@@ -721,7 +721,9 @@ func installMCPServers(basePath string, selectedAIs []string) {
 		case "claude":
 			installClaude(binPath, basePath)
 		case "gemini":
-			installGeminiCLI(binPath, basePath)
+			if err := installGeminiCLI(binPath, basePath); err != nil {
+				fmt.Printf("⚠️ Gemini MCP install failed: %v\n", err)
+			}
 		case "codex":
 			installCodexGlobal(binPath, basePath)
 		case "cursor":
44 changes: 28 additions & 16 deletions cmd/mcp_install.go
@@ -107,8 +107,7 @@ func installMCPForTarget(target, binPath, cwd string) error {
 		installCodexGlobal(binPath, cwd)
 		return nil
 	case "gemini":
-		installGeminiCLI(binPath, cwd)
-		return nil
+		return installGeminiCLI(binPath, cwd)
 	case "copilot":
 		installCopilot(binPath, cwd)
 		return nil
@@ -445,14 +444,20 @@ func installCopilot(binPath, projectDir string) {
 	fmt.Println(" (Reload VS Code window to activate)")
 }

-func installGeminiCLI(binPath, projectDir string) {
+func installGeminiCLI(binPath, projectDir string) error {
 	// Check if gemini CLI is available
-	_, err := exec.LookPath("gemini")
+	geminiPath, err := exec.LookPath("gemini")
 	if err != nil {
-		fmt.Println("❌ 'gemini' CLI not found in PATH.")
-		fmt.Println(" Please install the Gemini CLI first to use this integration.")
-		fmt.Println(" See: https://geminicli.com/docs/getting-started")
-		return
+		return fmt.Errorf("'gemini' CLI not found in PATH: install from https://geminicli.com/docs/getting-started")
 	}
+
+	// Check gemini version to detect compatibility issues
+	versionCmd := exec.Command(geminiPath, "--version")
+	versionOut, versionErr := versionCmd.Output()
+	if versionErr != nil {
+		fmt.Printf("⚠️ Could not determine gemini version: %v\n", versionErr)
+	} else if viper.GetBool("verbose") {
+		fmt.Printf(" gemini version: %s\n", strings.TrimSpace(string(versionOut)))
+	}

 	serverName := mcpServerName(projectDir)
@@ -461,35 +466,42 @@ func installGeminiCLI(binPath, projectDir string) {

 	if viper.GetBool("preview") {
 		fmt.Printf("[PREVIEW] Would run: gemini mcp remove -s project %s && gemini mcp add -s project %s %s mcp\n", legacyName, serverName, binPath)
-		return
+		return nil
 	}

 	// Remove legacy server name (migration cleanup)
-	legacyRemoveCmd := exec.Command("gemini", "mcp", "remove", "-s", "project", legacyName)
+	legacyRemoveCmd := exec.Command(geminiPath, "mcp", "remove", "-s", "project", legacyName)
 	legacyRemoveCmd.Dir = projectDir
 	_ = legacyRemoveCmd.Run() // Ignore error - server may not exist

 	// Remove current server name (idempotent reinstall)
-	removeCmd := exec.Command("gemini", "mcp", "remove", "-s", "project", serverName)
+	removeCmd := exec.Command(geminiPath, "mcp", "remove", "-s", "project", serverName)
 	removeCmd.Dir = projectDir
 	_ = removeCmd.Run() // Ignore error - server may not exist

 	// Run: gemini mcp add -s project <name> <command> [args...]
 	// Uses -s project for project-level config (stored in .gemini/settings.json)
-	cmd := exec.Command("gemini", "mcp", "add", "-s", "project", serverName, binPath, "mcp")
+	cmd := exec.Command(geminiPath, "mcp", "add", "-s", "project", serverName, binPath, "mcp")
 	cmd.Dir = projectDir

 	// Capture output to suppress noise, unless verbose
+	var stderrBuf strings.Builder
 	if viper.GetBool("verbose") {
 		cmd.Stdout = os.Stdout
 		cmd.Stderr = os.Stderr
+	} else {
+		cmd.Stderr = &stderrBuf
 	}

 	if err := cmd.Run(); err != nil {
-		fmt.Printf("⚠️ Failed to run 'gemini mcp add': %v\n", err)
-	} else {
-		fmt.Printf("✅ Installed for Gemini as '%s'\n", serverName)
+		stderrMsg := strings.TrimSpace(stderrBuf.String())
+		if stderrMsg != "" {
+			return fmt.Errorf("'gemini mcp add' failed (exit %v): %s", err, stderrMsg)
+		}
+		return fmt.Errorf("'gemini mcp add' failed: %w", err)
 	}
+
+	fmt.Printf("✅ Installed for Gemini as '%s'\n", serverName)
+	return nil
 }

 func installCodexGlobal(binPath, projectDir string) {
15 changes: 13 additions & 2 deletions internal/agents/core/parsers.go
@@ -59,6 +59,17 @@ func NewFindingWithEvidence(
 	metadata map[string]any,
 ) Finding {
 	confidenceScore, confidenceLabel := ParseConfidence(confidence)
+	convertedEvidence := ConvertEvidence(evidence)
+
+	// Gate 3: Findings without verifiable evidence (non-empty FilePath) start as
+	// "skipped" rather than "pending" to prevent hallucinated findings from being
+	// auto-linked into the knowledge graph.
+	verificationStatus := VerificationStatusPending
+	tempFinding := Finding{Evidence: convertedEvidence}
+	if !tempFinding.HasEvidence() {
+		verificationStatus = VerificationStatusSkipped
+	}
Comment on lines +62 to +71
Gate 3 bypass: empty-FilePath evidence slips through as Pending

The gate check at line 67 uses `len(convertedEvidence) == 0` as the sole criterion for flagging as `Skipped`. However, `ConvertEvidence()` (line 26–34) performs a direct struct copy without filtering:

    evidence[i] = Evidence(e) // empty FilePath preserved as-is

This creates a divergence with `HasEvidence()` in types.go (line 65–71), which explicitly requires `FilePath != ""`. An LLM response with `[{"file_path": "", "snippet": "..."}]` will:

  • Pass the gate: `len(convertedEvidence) == 1` → `VerificationStatus = Pending`
  • Fail `HasEvidence()`: empty `FilePath` → returns `false`

This allows hallucinated findings with empty file paths to be auto-linked into the knowledge graph, defeating the Gate 3 enforcement intended to prevent exactly this.

Fix: align the gate with `HasEvidence()` by checking that evidence entries have non-empty FilePath:

Suggested change
-	convertedEvidence := ConvertEvidence(evidence)
-	// Gate 3: Findings without evidence start as "skipped" rather than "pending"
-	// to prevent them from being auto-linked into the knowledge graph.
-	verificationStatus := VerificationStatusPending
-	if len(convertedEvidence) == 0 {
-		verificationStatus = VerificationStatusSkipped
-	}
+	convertedEvidence := ConvertEvidence(evidence)
+	// Gate 3: Findings without verifiable evidence (non-empty FilePath) start as
+	// "skipped" rather than "pending" to prevent hallucinated findings from being
+	// auto-linked into the knowledge graph.
+	verificationStatus := VerificationStatusPending
+	tempFinding := Finding{Evidence: convertedEvidence}
+	if !tempFinding.HasEvidence() {
+		verificationStatus = VerificationStatusSkipped
+	}

Owner Author

Fixed in 8255ebe — gate now uses HasEvidence() which checks for non-empty FilePath.
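
The resolved gate can be sketched on its own. The types below are trimmed-down stand-ins (only `FilePath`, `Snippet`, `Evidence`, and `VerificationStatus` come from the discussion; `newFinding` and the status constants are hypothetical), and `HasEvidence` is assumed to return true when at least one entry has a non-empty `FilePath`:

```go
package main

import "fmt"

// Evidence and Finding are simplified, hypothetical versions of the types
// discussed in the review; only the fields the gate needs are kept.
type Evidence struct {
	FilePath string
	Snippet  string
}

type Finding struct {
	Evidence           []Evidence
	VerificationStatus string
}

const (
	statusPending = "pending"
	statusSkipped = "skipped"
)

// HasEvidence reports whether at least one entry names a file
// (assumed semantics; the real method lives in types.go).
func (f Finding) HasEvidence() bool {
	for _, e := range f.Evidence {
		if e.FilePath != "" {
			return true
		}
	}
	return false
}

// newFinding applies the gate: no verifiable evidence means "skipped".
func newFinding(ev []Evidence) Finding {
	f := Finding{Evidence: ev, VerificationStatus: statusPending}
	if !f.HasEvidence() {
		f.VerificationStatus = statusSkipped
	}
	return f
}

func main() {
	empty := newFinding([]Evidence{{FilePath: "", Snippet: "made up"}})
	real := newFinding([]Evidence{{FilePath: "cmd/bootstrap.go", Snippet: "case \"gemini\":"}})
	fmt.Println(empty.VerificationStatus, real.VerificationStatus)
}
```

Routing the decision through `HasEvidence()` keeps a single definition of "has evidence", so the constructor and later checks cannot drift apart the way the `len(...) == 0` test did.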


 	return Finding{
 		Type: findingType,
 		Title: title,
@@ -67,8 +78,8 @@ func NewFindingWithEvidence(
 		Tradeoffs: tradeoffs,
 		ConfidenceScore: confidenceScore,
 		Confidence: confidenceLabel,
-		Evidence: ConvertEvidence(evidence),
-		VerificationStatus: VerificationStatusPending,
+		Evidence: convertedEvidence,
+		VerificationStatus: verificationStatus,
 		SourceAgent: sourceAgent,
 		Metadata: metadata,
 	}