feat: stabilize Cloud Run deployment and polish standalone CLI UX #389

Merged
IsmailMehdi merged 15 commits into main from feat/add-codex-filter
May 12, 2026

Conversation

@IsmailMehdi
Collaborator

@IsmailMehdi IsmailMehdi commented May 12, 2026

Description

This PR contains stability fixes for the Cloud Run deployment, needed to support the recently merged package renaming and dependency
decoupling, and polishes the user experience of the standalone CLI tool when it is run via launchers like uvx.

What Changed

☁️ Cloud Run Deployment Stability

  • Dockerfile Update: Configured uv sync to use the --all-packages flag, so that the viewer UI dependencies are still resolved
    and installed in the container's environment even though they are decoupled from the core package.
  • Double-Sync: uv sync now runs twice, once before the source copy (for dependency caching) and once after it (for packaging),
    so the workspace is fully built and installed during the build phase and nothing has to be downloaded at runtime.

🖥️ Standalone CLI UX Polishing (absl Help Fixes)

  • Key Flags Promoted: Declared --experiment_config as a key flag so it displays directly in the basic --help output instead of being
    hidden inside --helpfull.
  • Bypassed Launcher Bug: Added runtime logic in the run() entrypoint to fix absl-py help rendering when wrapped by uv launcher
    scripts:
    • Overrides the launcher's polyglot Bash docstring with the actual clean EvalBench description.
    • Explicitly registers key flags under sys.argv[0] (the launcher's filename) to bypass a lookup discrepancy in absl-py.
    • Overrides sys.argv[0] with its basename to hide the ugly, long temporary path (e.g., /tmp/.../google-evalbench) in the help output
      header.
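The last and first sub-bullets above can be sketched with the standard library alone. This is a minimal illustration, not the code in run(): it assumes, as the bullets imply, that the help header is derived from the `__main__` module docstring and from `sys.argv[0]`. The helper name `tidy_help_context`, the example path, and the description string are all hypothetical.

```python
import os
import types


def tidy_help_context(argv, main_module, description):
    """Return a copy of argv with a short program name and set the help docstring.

    argv[0] is replaced by its basename, and main_module.__doc__ is replaced
    by `description`. The input list itself is not mutated.
    """
    main_module.__doc__ = description
    return [os.path.basename(argv[0])] + list(argv[1:])


# Demonstrate on stand-in objects rather than the real interpreter state.
fake_main = types.ModuleType("__main__")
fake_main.__doc__ = "launcher stub text"  # stands in for the launcher's docstring

cleaned = tidy_help_context(
    ["/tmp/example-cache/bin/google-evalbench", "--help"],  # hypothetical path
    fake_main,
    "Placeholder description text.",  # hypothetical; not the real description
)
print(cleaned)
```

In a real entrypoint the same helper would presumably be called as `sys.argv = tidy_help_context(sys.argv, sys.modules["__main__"], DESCRIPTION)` before the help text is rendered.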

Verification

  • Verified that uv run evalbench/evalbench.py --help and uvx --no-cache --from . google-evalbench --help both output clean, professional
    help screens with the correct key flags.
  • Verified that make container builds cleanly and all local services start up successfully.

Ismail Mehdi and others added 14 commits May 11, 2026 19:42
- Add 'Codex' option to agent toggle buttons in status and list views.
- Implement filtering for 'codex_cli' generator in both views.
- Add missing 'mesop' and 'gunicorn' dependencies to pyproject.toml.

TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
…nstalled.

- Updated pyproject.toml to include viewer as a workspace dependency.
- Updated Dockerfile to copy viewer/pyproject.toml before running uv sync.
- Updated supervisord configs to use uv run to ensure correct environment is used.
- Updated viewer/run_frontend.sh to use uv run gunicorn.
- Added .dockerignore to prevent copying local .venv.

TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
…lly built.

- Added `RUN uv sync --frozen` after `COPY . .` in the Dockerfile.
  This ensures that the `viewer` workspace member is fully built and installed
  during the Docker build phase (with internet access), preventing `uv run`
  from trying to download `setuptools` at runtime in the restricted
  Cloud Run environment.

TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
This ensures that all workspace packages (including the viewer UI) are fully built and installed during the Docker build phase, supporting clean decoupling of viewer dependencies in the core package.

TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
This ensures that --experiment_config shows up in the basic --help output (instead of being hidden in --helpfull), making the standalone CLI interface much more user-friendly.

TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
- Updated run() in evalbench.py to register key flags under both `__main__` and `sys.argv[0]`.
  This works around the absl-py lookup discrepancy where key flags are looked up under `sys.argv[0]` instead of `__main__` when rendering help for the entrypoint script.
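
A minimal sketch of the double registration described in this commit, assuming absl-py is installed. It uses a private `FlagValues` registry so the global `FLAGS` is untouched. The helper name `register_key_flags`, the help text, and the stand-in launcher name are hypothetical; `register_key_flag_for_module` and `get_key_flags_for_module` are `FlagValues` methods in absl-py. The actual code in run() may differ.

```python
from absl import flags

# A private registry keeps this sketch from touching absl's global FLAGS.
FLAG_VALUES = flags.FlagValues()
flags.DEFINE_string(
    "experiment_config",
    None,
    "Path to the experiment configuration.",  # hypothetical help text
    flag_values=FLAG_VALUES,
)


def register_key_flags(flag_names, module_names, flag_values):
    """Register each named flag as a key flag under every given module name."""
    for module_name in module_names:
        for flag_name in flag_names:
            flag_values.register_key_flag_for_module(
                module_name, flag_values[flag_name]
            )


# Stand-in for the value of sys.argv[0] when started through a launcher.
launcher_name = "google-evalbench"
register_key_flags(["experiment_config"], ["__main__", launcher_name], FLAG_VALUES)

# The flag is now reported as a key flag when looked up by the launcher name.
key_flag_names = [
    flag.name for flag in FLAG_VALUES.get_key_flags_for_module(launcher_name)
]
print(key_flag_names)
```

Registering under both names means the flag is found whether the help renderer looks it up by `__main__` or by the launcher's filename.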

TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
@IsmailMehdi IsmailMehdi requested a review from wangauone May 12, 2026 19:41
@IsmailMehdi
Collaborator Author

/gcbrun

@IsmailMehdi IsmailMehdi changed the title from "Feat/add codex filter" to "feat: stabilize Cloud Run deployment and polish standalone CLI UX" May 12, 2026
This prevents baking local evaluation results and Jetski agent state into the built Docker image, keeping it clean and reducing image size.

TAG=agy
CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
@IsmailMehdi
Collaborator Author

/gcbrun

@IsmailMehdi IsmailMehdi merged commit 4720eef into main May 12, 2026
6 checks passed

2 participants