Run type: public-demo rehearsal, not design-partner validation
Reviewer: @pengfei-threemoonslab
Review slot: 2026-05-20 15:48:14 PDT / immediate review
Shipgate ref: 5ba36d9
Target SHA: 9514473c234c8419b812b658157a5c3d4341713f
Target
- Repo: https://github.com/openai/openai-agents-python
- Primary target: examples/customer_service
- Fallback target: another OpenAI Agents SDK example only if the primary target is evidence-empty
- Framing: public-demo rehearsal only; not upstream owner validation and not design-partner validation.
Run Notes
Filled during execution:
- Exact target SHA: 9514473c234c8419b812b658157a5c3d4341713f
- Setup timing, cold cache: 15 seconds for clone + venv + pinned Shipgate install
- Warm-cache under 10 minutes: yes on this machine; first setup completed under 10 minutes
- First-run failures: none at command/runtime level; advisory scan exited 0. Release decision is insufficient_evidence due to low-confidence extracted tools.
- Manifest edits: generated examples/customer_service/shipgate.yaml, replaced placeholders with airline customer service agent name and purpose; did not run --agent-instructions.
- Scan decision: insufficient_evidence
- Blockers / review items: 0 blockers, 3 review items
- Privacy audit enabled: true
- Privacy redacted occurrence count: 0
- Privacy output surfaces: json, packet_html, packet_json, packet_md
- False positives: pending reviewer review
- Missing checks: pending reviewer review
- Confusing wording: packet §1 says insufficient_evidence due to 2 low-confidence tools, but packet §10 residuals says low-confidence tool extractions: none.
- Install friction: v0.10.0 tag absent; used pinned commit SHA. Pinned git install worked.
- Reviewer feedback: pending @pengfei-threemoonslab review
- Would advisory CI be acceptable: pending reviewer review; local advisory-only workflow draft prepared.
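The contradiction noted under "Confusing wording" (the summary attributes the decision to 2 low-confidence tools while the residuals list none) is the kind of mismatch a small consistency check could flag. The following is a minimal sketch only: the function name and its three parameters are hypothetical and do not reflect Shipgate's actual packet schema, which these notes do not describe.

```python
from typing import List


def find_low_confidence_mismatch(
    decision: str,
    summary_low_conf_count: int,
    residual_low_conf_tools: List[str],
) -> List[str]:
    """Return human-readable notes where the summary and residuals disagree.

    All three inputs are hypothetical stand-ins for values a reviewer
    would read out of the packet by hand.
    """
    notes: List[str] = []
    residual_count = len(residual_low_conf_tools)

    # The summary and the residuals section should report the same count.
    if summary_low_conf_count != residual_count:
        notes.append(
            f"summary reports {summary_low_conf_count} low-confidence tools, "
            f"residuals list {residual_count}"
        )

    # A decision attributed to low-confidence tools needs at least one listed.
    if decision == "insufficient_evidence" and residual_count == 0:
        notes.append("decision is insufficient_evidence but residuals list none")

    return notes


# Values taken from this run's notes: 2 in the summary, none in the residuals.
mismatches = find_low_confidence_mismatch("insufficient_evidence", 2, [])
```

For this run the check would return two notes, matching the wording problem recorded above; a packet whose sections agree would return an empty list.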
Reviewer Brief
This is a public-demo rehearsal, not owner validation. A first-time advisory run with no baseline will look noisier than a maintained, baseline-backed workflow. Advisory exit 0 means findings did not fail CI; config, parse, missing-source, and runtime errors are first-run failures and should be categorized separately.
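One way to keep those two categories apart when logging a run is sketched below. It is an illustration under stated assumptions, not Shipgate behavior: the enum, the function, and the `error_kind` strings are hypothetical names chosen to mirror the error types listed in the brief, and treating an unexplained non-zero exit as a first-run failure is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class RunCategory(Enum):
    CLEAN = "clean"
    ADVISORY_FINDINGS = "advisory_findings"
    FIRST_RUN_FAILURE = "first_run_failure"


# Error kinds named in the reviewer brief; the string labels are hypothetical.
FIRST_RUN_ERROR_KINDS = {"config", "parse", "missing-source", "runtime"}


def categorize_run(
    exit_code: int, error_kind: Optional[str], review_items: int
) -> RunCategory:
    """Separate first-run failures from advisory findings."""
    # Any of the listed error kinds is a first-run failure, whatever the exit code.
    if error_kind in FIRST_RUN_ERROR_KINDS:
        return RunCategory.FIRST_RUN_FAILURE
    # Assumption: a non-zero exit with no recognized error kind is also a failure.
    if exit_code != 0:
        return RunCategory.FIRST_RUN_FAILURE
    # Exit 0 with findings: the findings did not fail CI but still need review.
    if review_items > 0:
        return RunCategory.ADVISORY_FINDINGS
    return RunCategory.CLEAN


# This run: advisory scan exited 0, no command/runtime errors, 3 review items.
this_run = categorize_run(exit_code=0, error_kind=None, review_items=3)
```

Under these assumptions this run lands in `ADVISORY_FINDINGS`, which keeps its 3 review items out of the first-run failure count.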
Existing Benchmark Cross-Reference
This run coexists with benchmark/repos/openai-agents-sdk. It does not replace that fixture. If this rehearsal exposes repeated friction, create a follow-up to add or adjust a public-demo OpenAI SDK archetype.