
fix: normalize agent IDs and remove bootstrap files for benchmark #37

Open
zhuanghaoz wants to merge 2 commits into pinchbench:main from zhuanghaoz:fix/agent-id-normalization

Conversation

@zhuanghaoz

  • Fix agent ID normalization to handle lowercase transformation
  • Remove BOOTSTRAP.md, SOUL.md, USER.md, IDENTITY.md before running tasks
  • Fix model ID normalization to preserve provider-qualified models (e.g., minimax-cn/)

These fixes ensure benchmark tasks work correctly with OpenClaw agents.
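
The two normalization fixes listed above could look roughly like the following. This is a minimal sketch, not the code in `scripts/lib_agent.py`: the function names, the `default_provider` parameter, and the `example-provider` value are hypothetical, and the real rules may differ.

```python
def normalize_agent_id(agent_id: str) -> str:
    """Trim and lowercase an agent ID so comparisons are case-insensitive.

    Hypothetical helper: assumes the agent runtime stores IDs in lowercase.
    """
    return agent_id.strip().lower()


def normalize_model_id(model_id: str, default_provider: str = "example-provider") -> str:
    """Keep an existing provider prefix; add a default only when none is present.

    `default_provider` is an illustrative assumption, not a value from the PR.
    """
    model_id = model_id.strip()
    if "/" in model_id:
        # Already provider-qualified (for example "minimax-cn/..."), leave as-is.
        return model_id
    return f"{default_provider}/{model_id}"
```

With this shape, a model ID that already carries a provider prefix passes through unchanged, while a bare model name gets the default prefix.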

@kilo-code-bot
Contributor

kilo-code-bot bot commented Mar 9, 2026

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 1 |
| SUGGESTION | 0 |

Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| scripts/benchmark.py | 21 | Duplicate `import time`: the new line adds `import time`, but it already exists at line 23, resulting in a redundant import |

Other Observations (not in diff)

No additional issues found outside the diff.

Files Reviewed (2 files)
  • scripts/benchmark.py - 1 issue (duplicate import)
  • scripts/lib_agent.py - 0 issues (case-normalization fixes, bootstrap cleanup, and skill copying logic look correct)


@zhuanghaoz force-pushed the fix/agent-id-normalization branch from 0c5418b to 0cccb85 on March 9, 2026 14:46
- Copy skills from main workspace to benchmark workspace so agents can use nano-pdf
- Add 2-second delay before grading to ensure files are flushed to disk
- Fix model ID normalization to preserve provider-qualified models
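
The skill-copying and pre-grading delay described in these commit notes might be sketched as follows. This is illustrative only: the `skills` directory name, the function names, and their signatures are assumptions, not taken from the diff.

```python
import shutil
import time
from pathlib import Path


def copy_skills(main_workspace: Path, bench_workspace: Path) -> bool:
    """Copy the skills directory into the benchmark workspace.

    Assumes skills live in a `skills/` folder at the workspace root
    (hypothetical layout). Returns False when there is nothing to copy.
    """
    src = main_workspace / "skills"
    if not src.is_dir():
        return False
    # dirs_exist_ok=True (Python 3.8+) lets repeated runs overwrite in place.
    shutil.copytree(src, bench_workspace / "skills", dirs_exist_ok=True)
    return True


def wait_before_grading(delay_seconds: float = 2.0) -> None:
    """Pause so that files written by the agent are visible before grading."""
    time.sleep(delay_seconds)
```

A fixed sleep is a simple heuristic; it does not guarantee that writes have reached disk, so the delay may need tuning per environment.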
```python
import os
import statistics
import subprocess
import time
```

WARNING: Duplicate import time — this line adds import time but it already exists at line 23. The duplicate should be removed.

Suggested change
import time

@olearycrew
Member

@zhuanghaoz thanks for this contribution

I am wondering if "Remove BOOTSTRAP.md, SOUL.md, USER.md, IDENTITY.md before running tasks" is a good idea. Early in this project I had problems with OpenClaw not liking the absence of those files and getting lost on actual task completion.
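
For reference, the cleanup step under discussion could be gated behind a switch so both behaviors can be compared. A minimal sketch, assuming the four files sit at the workspace root; the function name and the `enabled` flag are hypothetical and not part of the PR.

```python
from pathlib import Path
from typing import List

# File names come from the PR description; their location is an assumption.
BOOTSTRAP_FILES = ("BOOTSTRAP.md", "SOUL.md", "USER.md", "IDENTITY.md")


def remove_bootstrap_files(workspace: Path, enabled: bool = True) -> List[str]:
    """Delete bootstrap files from the workspace root and report what was removed.

    When `enabled` is False, nothing is touched and an empty list is returned.
    """
    if not enabled:
        return []
    removed = []
    for name in BOOTSTRAP_FILES:
        path = workspace / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```

Returning the list of removed names makes it easy to log exactly what changed before a task runs.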
