Description:
An AI is only as good as its prompt. This issue focuses on optimizing the instructions passed to OpenClaw so that it behaves as an impartial, strict, and logical arbitrator. We must engineer a prompt template that explicitly instructs the agent to act as a judge in a freelancer dispute, evaluating how well the final submitted work aligns with the initial requirements. The prompt must force the agent to return a rigidly structured JSON response containing three fields: Verdict Summary (the reasoning), Liability (who was at fault), and Payout Split (the exact percentages to return to the client vs. the freelancer). Achieving deterministic output from the LLM requires extensive iterative testing against edge cases and mock dispute scenarios.
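A minimal sketch of how such a prompt could be embedded in the backend context builder. The wording, the snake_case JSON keys, and the `build_judge_context` helper are all illustrative assumptions, not the final Agent_Judge_System_Prompt:

```rust
/// Illustrative system prompt (assumed wording; the real Agent_Judge_System_Prompt
/// will be refined through iterative testing against mock dispute scenarios).
const AGENT_JUDGE_SYSTEM_PROMPT: &str = r#"You are an impartial, strict, and logical
arbitrator for a freelancer dispute. Compare the initial requirements against the
final submitted work. Respond with ONLY a JSON object with exactly these fields:
  "verdict_summary": string  (your reasoning),
  "liability": string        ("client", "freelancer", or "shared"),
  "payout_split": { "client": number, "freelancer": number }  (must sum to 100).
Do not include any text outside the JSON object."#;

/// Hypothetical context-builder helper: appends the dispute facts to the prompt.
fn build_judge_context(requirements: &str, submission: &str) -> String {
    format!(
        "{AGENT_JUDGE_SYSTEM_PROMPT}\n\nInitial requirements:\n{requirements}\n\nSubmitted work:\n{submission}"
    )
}
```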
Requirements:
- Design and embed the Agent_Judge_System_Prompt inside the backend context builder.
- Ensure the prompt demands adherence to JSON schema output.
- Implement an evaluation pipeline in Rust that validates the JSON payload returned by OpenClaw.
- Handle scenarios where the LLM produces improperly formatted text, issuing follow-up correction requests.
Acceptance Criteria:
- The prompt demonstrably outputs clean, parseable JSON verdicts in 99% of test runs.
- The Payout Split mathematically equals exactly 100% across all generated scenarios.
- The agent outputs distinct logical reasoning for its payout distribution in the Verdict Summary.
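The payout-split criterion can be checked mechanically over every generated scenario. A stdlib-only sketch (the mock scenario pairs and the float tolerance are assumptions):

```rust
/// Returns true when every (client, freelancer) payout pair sums to exactly 100.
/// A small tolerance absorbs floating-point noise from JSON parsing.
fn splits_sum_to_100(splits: &[(f64, f64)]) -> bool {
    splits
        .iter()
        .all(|(client, freelancer)| ((client + freelancer) - 100.0).abs() < 1e-9)
}
```

Running this over the full set of mock dispute scenarios gives a pass/fail signal for the acceptance criterion in CI.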