fix(rca): reduce prompt rigidity and let LLM decide when to load integration skills #436
isiddharthsingh wants to merge 4 commits into
…ontradictory MUSTs in agent prompts
…onnected-integrations index
…y covered by parametrized case in test_force_tool_choice.py



Summary
The RCA agent prompt had three classes of rigidity that pushed the model to satisfy floors instead of judging when it had the answer, and to call integrations the alert didn't need just because they were connected:
- `behavioral_rules.md` said the model MUST call `load_skill` before any integration tool, while the background path explicitly said "do NOT call load_skill — skills are pre-loaded". Both fired in the same prompt.

The change set replaces these with completion criteria and situational guidance, and swaps RCA's eager skill pre-loading for on-demand `load_skill` via a compact connected-integrations index, mirroring how foreground chat already works.

What changed
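To make the on-demand idea from the summary concrete, here is a minimal sketch of a registry that puts a short index of connected integrations into the prompt and returns a full skill body only when asked. The `Skill` and `SkillRegistry` classes, their fields, and the index wording are hypothetical and written for illustration; they are not the project's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    summary: str  # one-line description shown in the index
    body: str     # full instructions, returned only on demand


@dataclass
class SkillRegistry:
    """Hypothetical registry for illustration; not the project's class."""
    skills: dict[str, Skill] = field(default_factory=dict)
    connected: dict[str, set[str]] = field(default_factory=dict)  # user_id -> skill names

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def connect(self, user_id: str, name: str) -> None:
        self.connected.setdefault(user_id, set()).add(name)

    def build_index(self, user_id: str) -> str:
        """Compact listing: one line per connected integration."""
        names = sorted(self.connected.get(user_id, set()))
        lines = [f"- {n}: {self.skills[n].summary}" for n in names]
        header = "Connected integrations (call load_skill(name) before using one you need):"
        return "\n".join([header, *lines])

    def load_skill(self, user_id: str, name: str) -> str:
        """Return the full skill body, but only for a connected integration."""
        if name not in self.connected.get(user_id, set()):
            return f"Integration '{name}' is not connected."
        return self.skills[name].body


registry = SkillRegistry()
registry.register(Skill("github", "repos, pull requests, workflow runs",
                        "Full GitHub skill instructions. " * 100))
registry.register(Skill("jira", "tickets and incident history",
                        "Full Jira skill instructions. " * 100))
registry.connect("user-1", "github")

index = registry.build_index("user-1")          # small: placed in the prompt up front
body = registry.load_skill("user-1", "github")  # large: fetched only when needed
```

In this sketch the prompt cost scales with the number of connected integrations, not with the size of their skill bodies, and the model pays for a full body only when it decides the alert needs that integration.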
Commit 1 - prompt rigidity (`c84328bc`):
- … `persistence_and_immediate_action.md`, `background_source_general.md`, `investigation.md`, `behavioral_rules.md` with evidence-backed stopping criteria.
- … `jira/SKILL.md`, `confluence/SKILL.md`, `notion/SKILL.md`, `knowledge_base.md` to situational guidance.
- … `load_skill` MUST from `behavioral_rules.md`; added a new `interactive_load_skill.md` segment scoped to interactive chat and RCA.
- … `rca` ephemeral in `provider_rules.py` so foreground RCA gets read-only language instead of the agent "you CAN and SHOULD create/modify/delete" block.
- … `[RCA INVESTIGATION REQUESTED]` HumanMessage prefix in `main_chatbot.py`; the `ForceToolChoice` middleware is the single enforcer.

Commit 2 - on-demand RCA skills (`3c1a64af`):
- `background.py` RCA path now emits `registry.build_index(user_id)` (~300 tokens) instead of `load_skills_for_rca(...)` (5–15k tokens of full skill bodies).
- … `interactive_load_skill` so the model gets the on-demand instruction in RCA too.
- … `load_skill` tool description (dropped "MANDATORY" / "you MUST call this first").

Behavioral verification
Two real RCAs run against the same alert (High CPU on aurora-dast cluster-staging):
[Comparison table not recovered; surviving column identifiers: `a96d6d16`, `b86c3b43`]

The post-change run went metrics-first for a CPU alert, deferred human-context tools to the end, and called `load_skill('github')` once when it actively needed GitHub workflow, which is exactly the behavior the change was designed to produce.

Test plan
- … `MINIMUM` / `AT LEAST [0-9]+` floor mandates in `skills/`
- … `FIRST tool call MUST` / `MUST.*BEFORE any` / `ALWAYS search.*START` patterns
- … `MUST call load_skill` line in core; "do NOT call load_skill" still present in action mode
- … `composer.py`, `provider_rules.py`, `background.py`, `cloud_tools.py`, `main_chatbot.py`)
- … `test_trigger_rca_middleware.py`)
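The pattern checks in the test plan could be automated along these lines. This is a sketch only: the PR does not show how the checks were actually run, the helper name `find_rigid_lines` is hypothetical, and treating `MINIMUM` and `AT LEAST [0-9]+` as two separate patterns is an assumption about how the original list was split.

```python
import re

# Patterns copied from the test plan. Splitting "MINIMUM/AT LEAST [0-9]+"
# into two separate patterns is an assumption.
RIGID_PATTERNS = [
    r"MINIMUM",
    r"AT LEAST [0-9]+",
    r"FIRST tool call MUST",
    r"MUST.*BEFORE any",
    r"ALWAYS search.*START",
]


def find_rigid_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern) for every line matching a rigid pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in RIGID_PATTERNS:
            if re.search(pattern, line):  # case-sensitive, like a plain grep
                hits.append((lineno, pattern))
    return hits


sample = (
    "You MUST call search BEFORE any other tool.\n"
    "Stop when the evidence supports a root cause.\n"
    "Make AT LEAST 5 tool calls.\n"
)
hits = find_rigid_lines(sample)
```

Applied to each prompt and skill file (for example by reading every `*.md` file under the prompt directory with `pathlib.Path.rglob`), an empty result would indicate that no floor mandates or ordering MUSTs remain.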