Phase 2: Add Code Interpreter for Data Analysis #2

Merged: san360 merged 11 commits into main from feature/phase2-code-interpreter on May 15, 2026

@san360
Owner

@san360 san360 commented May 15, 2026

Summary

  • Adds code_interpreter tool alongside existing bing_grounding
  • Extends system prompt with ## Data Analysis section
  • Evaluation now runs all 8 queries (Phase 1 + Phase 2)

Changes

| File | Change |
| --- | --- |
| agents/tech-trends-agent.json | Added code_interpreter to tools, phase → "2" |
| prompts/tech-trends-agent.md | Added ## Data Analysis (Phase 2) section |
| evals/eval-config.json | phase_filter → null (run all cases) |
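
The effect of setting phase_filter to null can be pictured with a minimal sketch. It assumes each evaluation case carries a phase field and that a null (None) filter selects every case; the case layout, the 4/4 split between phases, and the name select_cases are all hypothetical and not taken from the repository.

```python
from typing import Optional

# Hypothetical eval cases: the PR mentions 8 queries across Phase 1 and Phase 2,
# but the 4/4 split and the field names are assumptions for illustration.
CASES = [{"id": i, "phase": 1 if i <= 4 else 2} for i in range(1, 9)]


def select_cases(cases: list[dict], phase_filter: Optional[int]) -> list[dict]:
    """Return the cases to run; a filter of None means run every case."""
    if phase_filter is None:
        return list(cases)
    return [case for case in cases if case["phase"] == phase_filter]


# With no filter, all 8 cases are selected; with a filter, only that phase.
all_cases = select_cases(CASES, None)
phase2_only = select_cases(CASES, 2)
```

Under these assumptions, a null filter is what lets a single evaluation run cover both the Phase 1 regression queries and the new Phase 2 queries.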

What to check

  • Phase 1 queries still score at or above threshold (no regression)
  • Phase 2 data analysis queries score acceptably
  • Agent correctly uses code interpreter for calculation queries
  • After merge, deploy-prod.yml commits updated artifact
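
The no-regression check in the first bullet could be sketched as follows. This is an illustration only: the score dictionary, the threshold value, and the function name find_regressions are assumptions, and the numbers are not results from this PR.

```python
def find_regressions(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the IDs of cases whose score fell below the threshold."""
    return sorted(case_id for case_id, score in scores.items() if score < threshold)


# Illustrative values only, not results from this PR.
phase1_scores = {"q1": 4.5, "q2": 3.0, "q3": 4.0}
print(find_regressions(phase1_scores, threshold=4.0))  # prints ['q2']
```

A score exactly at the threshold passes here, matching the "at or above threshold" wording.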

Phase

Phase 2 of 3: web search + code interpreter. Phase 3 is the model upgrade.

@github-actions
Contributor

Agent Evaluation Results

Phase: 2
Model: gpt-4o-2024-11-20
Commit: 171d434

Full results are in the Actions summary.

@github-actions
Contributor

Agent Evaluation Results

Phase: 2
Model: gpt-4o-2024-11-20
Commit: e3f7874

Full results are in the Actions summary.

@github-actions
Contributor

Agent Evaluation Results

Phase: 2
Model: gpt-4o-2024-11-20
Commit: c10c09e

Full results are in the Actions summary.

- agents/tech-trends-agent.json: replace bing_grounding with code_interpreter
  (Phase 2 uses only code_interpreter, not both tools)
- evaluate.yml: add smoke test step that invokes the agent and validates a
  response before running the full evaluation suite; remove BING_CONNECTION_NAME
- bootstrap.sh: add model availability check (validates current + upgrade target
  gpt-4.1); remove Bing Grounding references and connection variable
- lifecycle/02-phase2-code-interpreter.sh: fix to deploy code_interpreter only
- README: document model upgrade (gpt-4o-2024-11-20 → gpt-4.1), note eval
  naming limitation (action creates new "Agent Evaluation" each run)
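
The smoke-test gate described for evaluate.yml can be sketched in a few lines. This is a sketch of the idea, not the workflow's actual step: run_pipeline, smoke_test, and full_eval are hypothetical names, and the "non-empty reply" rule is an assumed stand-in for whatever validation the workflow performs.

```python
from typing import Callable


def run_pipeline(smoke_test: Callable[[], str], full_eval: Callable[[], dict]) -> dict:
    """Invoke the agent once; run the full suite only if the reply is non-empty."""
    reply = smoke_test()
    if not reply or not reply.strip():
        # Fail fast so the slower evaluation suite never starts.
        raise RuntimeError("Smoke test failed: agent returned an empty response")
    return full_eval()
```

Failing before the evaluation starts keeps a broken deployment from consuming a full evaluation run.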
@github-actions
Contributor

Agent Evaluation Results

Agent: tech-trends-agent:8
Phase: 2
Model: gpt-4o-2024-11-20
Commit: 1d2b294

Smoke Test:

Full results are in the Actions summary.

- get_agent() → agents.get(agent_name=...)
- Use OpenAI responses API via get_openai_client() for smoke test invocation
- Fix monitor.yml to use agent.latest_version
@github-actions
Contributor

Agent Evaluation Results

Agent: tech-trends-agent:9
Phase: 2
Model: gpt-4o-2024-11-20
Commit: 9e7089c

Smoke Test:

Full results are in the Actions summary.

- Smoke test: just print agent name (version not needed for invocation)
- Monitor: use list_versions() to find the latest version number
@github-actions
Contributor

Agent Evaluation Results

Agent: tech-trends-agent:10
Phase: 2
Model: gpt-4o-2024-11-20
Commit: c762af7

Smoke Test:

Full results are in the Actions summary.

The responses.create() call was missing the model parameter, causing
a 404 DeploymentNotFound error in Azure OpenAI.
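
A minimal sketch of the corrected call, assuming client is the OpenAI-compatible client the PR obtains from get_openai_client(). The helper name invoke_smoke_test and its parameters are hypothetical, and how the agent itself is referenced in the request is not shown in the PR, so it is omitted here.

```python
def invoke_smoke_test(client, deployment_name: str, prompt: str) -> str:
    """Send one prompt through the Responses API and return the text reply.

    `client` is assumed to be an OpenAI-compatible client object. Passing
    `model` explicitly is the fix the commit describes: without it, the
    commit reports a 404 DeploymentNotFound error from Azure OpenAI.
    """
    response = client.responses.create(model=deployment_name, input=prompt)
    return response.output_text
```

On Azure OpenAI, the model argument is typically the deployment name rather than the base model name, which is why the parameter is named deployment_name in this sketch.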
@github-actions
Contributor

Agent Evaluation Results

Agent: tech-trends-agent:11
Phase: 2
Model: gpt-4o-2024-11-20
Commit: c4b6798

Smoke Test:

Full results are in the Actions summary.

@github-actions
Contributor

github-actions Bot commented May 15, 2026

✅ Agent Deployment & Evaluation Report

🤖 Agent Details

| Property | Value |
| --- | --- |
| Agent | tech-trends-agent |
| Version | 15 |
| Semver | 0.0.0-pr.2 |
| Phase | 2 |
| Model | gpt-4o |
| Commit | b3ee88f |
| Timestamp | 2026-05-15 19:39:10 UTC |

📊 Pipeline Results

| Step | Status | Details |
| --- | --- | --- |
| Deploy to TEST | PASSED | Agent version 15 deployed |
| Smoke Test | PASSED | Invoked agent via Responses API |
| Foundry Evaluation | PASSED | Evaluated with golden dataset |

🛠️ Tools Configuration

Tool Enabled
web_search
code_interpreter


🤖 Updated automatically by the CI pipeline · 2026-05-15 19:39:10 UTC

san360 added 3 commits May 15, 2026 21:29
AgentDetails object returned by project.agents.get() has 'versions' (list),
not 'version' (scalar). Use versions[-1] for the latest version.
SDK model objects don't support negative indexing directly.
…sions is AgentObjectVersions (not a list).
The correct path is agent.versions.latest.version.
Verified locally with SDK model objects.
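
The access path these three commits converge on can be illustrated with stand-in classes. The dataclasses below only mimic the shape described in the commit messages and are not the SDK's real types; the string type of version is an assumption.

```python
from dataclasses import dataclass


# Stand-ins that mimic only the shape described in the commit messages;
# they are not the SDK's real classes.
@dataclass
class AgentVersion:
    version: str  # assumed to be a string; the real type is not shown in the PR


@dataclass
class AgentObjectVersions:
    latest: AgentVersion


@dataclass
class AgentDetails:
    name: str
    versions: AgentObjectVersions


def latest_version(agent: AgentDetails) -> str:
    """Follow the path the final commit settles on: versions.latest.version."""
    return agent.versions.latest.version


agent = AgentDetails("tech-trends-agent", AgentObjectVersions(AgentVersion("15")))
print(f"{agent.name}:{latest_version(agent)}")  # prints tech-trends-agent:15
```

Because versions is an object with a latest attribute rather than a list, neither agent.version nor agent.versions[-1] applies, which matches the sequence of fixes in the commits.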
@san360 san360 merged commit b6b3281 into main May 15, 2026
1 check passed
@san360 san360 deleted the feature/phase2-code-interpreter branch May 18, 2026 14:21