[FEATURE]: Added an interactive dashboard terminal for testing commands against the safety engine. #118
📝 Walkthrough
Adds a backend POST /api/v1/safety/check endpoint with Pydantic schemas and tests, TypeScript client types and api.checkSafety, and a Next.js Safety Lab page implementing command runs, demo mode, threat-path visualization, and run history.
Changes
Safety Lab Feature Implementation
Sequence Diagram
sequenceDiagram
participant User
participant SafetyLabUI
participant api.checkSafety
participant SafetyCheckEndpoint
participant SafetyEngine
participant RiskScorer
User->>SafetyLabUI: Enter command & click Run
SafetyLabUI->>api.checkSafety: POST command/tool/args
api.checkSafety->>SafetyCheckEndpoint: POST /api/v1/safety/check
SafetyCheckEndpoint->>SafetyEngine: Simulate check with current policy (ToolCallData)
SafetyEngine-->>SafetyCheckEndpoint: Risk level & blocked decision
SafetyCheckEndpoint->>RiskScorer: Score configured patterns (pattern, regex)
RiskScorer-->>SafetyCheckEndpoint: Pattern match results (matched, score)
SafetyCheckEndpoint-->>api.checkSafety: SafetyCheckResponse (decision, threat_path)
api.checkSafety-->>SafetyLabUI: Typed response
SafetyLabUI->>SafetyLabUI: Update history & render results
SafetyLabUI->>User: Display block explanation & threat path
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agentwatch/api/server.py`:
- Around line 589-595: merged_args currently allows request.arguments to
overwrite the normalized command (cmd), causing raw_command vs parsed command
desync; change the merge so that "command": cmd takes precedence (e.g., build
merged_args from request.arguments then set merged_args["command"]=cmd) before
constructing ToolCallData, ensuring merged_args, raw_command, and tool_call
arguments remain consistent; update the code around merged_args, ToolCallData,
request.arguments, and cmd accordingly.
In `@frontend/pages/safety-lab.tsx`:
- Around line 72-82: The interval in the useEffect for demoMode captures a stale
runCommand closure and can start overlapping runs; fix by making the execution
guard stable: create an in-flight ref (e.g., inFlightRef) or memoize runCommand
with useCallback including its deps (such as running) so the interval always
uses the current state, then inside the setInterval handler check
inFlightRef.current (or the stabilized running) and skip starting a new run if
one is in progress, and ensure you set/clear the ref around the async runCommand
call (set true before await and false after) so PRESET_COMMANDS and demoMode
behavior remains unchanged.
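The in-flight guard described in the second finding can be sketched without React. This is a minimal illustration, not the code in safety-lab.tsx: `createInFlightGuard` and `AsyncTask` are hypothetical names, and the local `inFlight` flag stands in for the `inFlightRef.current` the review suggests.

```typescript
// Hypothetical helper (not from the PR): wraps an async task so that a new
// invocation is skipped while a previous one is still running.
type AsyncTask = () => Promise<void>;

function createInFlightGuard(task: AsyncTask): () => boolean {
  // Plays the role of inFlightRef.current in a React component.
  let inFlight = false;

  // Returns true if a run was started, false if it was skipped.
  return function tryRun(): boolean {
    if (inFlight) return false; // a run is still in progress: skip this tick
    inFlight = true; // set before awaiting, as the review recommends
    void (async () => {
      try {
        await task();
      } catch {
        // Error handling is left to the task itself in this sketch.
      } finally {
        inFlight = false; // always clear, even if the task failed
      }
    })();
    return true;
  };
}
```

In a component, the flag would presumably live in a `useRef` so it survives re-renders, and the interval callback would call the guarded function instead of calling `runCommand` directly.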
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 63135c61-a7ba-4d3e-81d9-342960b79671
📒 Files selected for processing (4)
- agentwatch/api/server.py
- frontend/lib/api.ts
- frontend/pages/safety-lab.tsx
- tests/test_api_safety_check.py
@Moresh-Mishra Thanks for the work — the UI looks great and the overflow issue appears resolved. Before merge, could you please address the two CodeRabbit findings:
Once those are updated and CI passes, I'll take another look. 🚀

Ok, will resolve the issues @sreerevanth

@sreerevanth I have made the changes CodeRabbit asked for. You can merge them safely. Could you also add labels to this PR? Thanks!
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
frontend/pages/safety-lab.tsx (1)
240-260: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Clicking a history entry only updates `selectedNode`, not the displayed run.
The Block Explanation and Threat Path panels are bound to `latest = history[0]?.result` (Line 45), so selecting an older run mutates `selectedNode` while the chips and explanation still reflect the most recent run. The resulting node detail (Lines 222-230) can reference a `policy_id` that isn't among the rendered chips, which is confusing.
Track the active run explicitly and derive the panels from it.
♻️ Suggested wiring
- const [selectedNode, setSelectedNode] = useState<ThreatPathNode | null>(null)
+ const [selectedNode, setSelectedNode] = useState<ThreatPathNode | null>(null)
+ const [selectedRunId, setSelectedRunId] = useState<string | null>(null)

- const latest = history[0]?.result
+ const activeRun = history.find((r) => r.id === selectedRunId) ?? history[0]
+ const latest = activeRun?.result

  <button
    key={item.id}
    onClick={() => {
+     setSelectedRunId(item.id)
      setSelectedNode(item.result.threat_path.find((node) => node.matched) ?? null)
    }}
Also reset `selectedRunId` to the new run (or `null`) inside `runCommand` so a fresh run takes focus.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/pages/safety-lab.tsx` around lines 240 - 260, The UI only updates selectedNode when clicking history entries but the Block Explanation and Threat Path panels are always derived from latest = history[0]?.result, causing mismatch; introduce and use a tracked active run id (e.g., selectedRunId) and derive the active result (activeResult = history.find(h => h.id === selectedRunId)?.result ?? history[0]?.result) for the Block Explanation, Threat Path chips and node details instead of latest, update the history button onClick to set selectedRunId = item.id (and set selectedNode as you already do), and ensure runCommand resets selectedRunId to null or the new run id so new runs take focus.
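The derivation this comment asks for can be isolated as two pure functions, which makes the fallback to the newest run easy to check. This is a sketch under assumptions: the `RunRecord` and `SafetyCheckResult` shapes are simplified guesses based on the identifiers quoted in the review (`id`, `result`, `threat_path`, `matched`, `policy_id`), and `resolveActiveRun` and `firstMatchedNode` are hypothetical names.

```typescript
// Assumed, simplified shapes; the real types in the PR may carry more fields.
interface ThreatPathNode {
  policy_id: string;
  matched: boolean;
}

interface SafetyCheckResult {
  threat_path: ThreatPathNode[];
}

interface RunRecord {
  id: string;
  result: SafetyCheckResult;
}

// Hypothetical helper: picks the run the panels should display.
// Falls back to the newest run (history[0]) when nothing is selected
// or when the selected id is no longer in the history.
function resolveActiveRun(
  history: RunRecord[],
  selectedRunId: string | null
): RunRecord | undefined {
  return history.find((run) => run.id === selectedRunId) ?? history[0];
}

// Hypothetical helper: the node to highlight for a given run, if any matched.
function firstMatchedNode(run: RunRecord | undefined): ThreatPathNode | null {
  return run?.result.threat_path.find((node) => node.matched) ?? null;
}
```

Deriving both the panels and the selected node from the same resolved run keeps the node detail and the rendered chips in agreement, which is the mismatch the comment describes.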
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 5ef1c915-f5e6-4527-9abd-4468cf092f0d
📒 Files selected for processing (3)
- agentwatch/api/server.py
- frontend/pages/safety-lab.tsx
- tests/test_api_safety_check.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/test_api_safety_check.py
- agentwatch/api/server.py
Summary
This change fixes an issue in the Safety Lab where the threat-path chips could push the page width and produce a horizontal scrollbar. The threat-path now wraps and each chip has a max width so long policy IDs/reasons do not force overflow.
Closes #19
Changes
Minor: Adjusted the threat-path chip CSS classes to keep visual emphasis for matched nodes while allowing multi-line layout.
Validation
Ran tests: pytest tests/test_api_safety_check.py -q — 2 passed.
Tested locally: started backend + Next dev server, exercised Safety Lab UI, and verified the threat-path now wraps across multiple lines and the page no longer shows a horizontal scrollbar for long/large threat paths.
Manual visual check: clicked several preset commands and verified selected node details still render correctly after the layout change.
Tests
Tested locally from the repo root using:
Backend (venv active):
python -m uvicorn agentwatch.api.server:app --host 0.0.0.0 --port 8000
Frontend (in another terminal):
cd frontend
NEXT_PUBLIC_API_URL="http://localhost:8000/api/v1" npm run dev
Run tests:
pytest tests/test_api_safety_check.py -q
Screenshot
Frontend:
