feat: stabilize Cloud Run deployment and polish standalone CLI UX #389
Merged
- Add 'Codex' option to agent toggle buttons in status and list views. - Implement filtering for 'codex_cli' generator in both views. - Add missing 'mesop' and 'gunicorn' dependencies to pyproject.toml. TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
…nstalled. - Updated pyproject.toml to include viewer as a workspace dependency. - Updated Dockerfile to copy viewer/pyproject.toml before running uv sync. - Updated supervisord configs to use uv run to ensure correct environment is used. - Updated viewer/run_frontend.sh to use uv run gunicorn. - Added .dockerignore to prevent copying local .venv. TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
…lly built. - Added `RUN uv sync --frozen` after `COPY . .` in the Dockerfile. This ensures that the `viewer` workspace member is fully built and installed during the Docker build phase (with internet access), preventing `uv run` from trying to download `setuptools` at runtime in the restricted Cloud Run environment. TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
This ensures that all workspace packages (including the viewer UI) are fully built and installed during the Docker build phase, supporting clean decoupling of viewer dependencies in the core package. TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
This ensures that --experiment_config shows up in the basic --help output (instead of being hidden in --helpfull), making the standalone CLI interface much more user-friendly. TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
- Updated run() in evalbench.py to register key flags under both '__main__' and `sys.argv[0]`. This bypasses the absl-py translation bug where it looks up key flags for `sys.argv[0]` instead of '__main__' when rendering help for the entrypoint script. TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
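A minimal sketch of the dual registration this commit describes, assuming `FlagValues.register_key_flag_for_module` from absl-py is the mechanism used. The actual `run()` in evalbench.py is not shown in this conversation and may differ. The private `FlagValues` instance, the helper name `register_key_flag_everywhere`, and the help text are hypothetical.

```python
import sys

from absl import flags

# A private FlagValues keeps this sketch from touching the global flags.FLAGS.
flag_values = flags.FlagValues()
flags.DEFINE_string(
    "experiment_config",
    None,
    "Path to the experiment config.",  # placeholder help text (assumption)
    flag_values=flag_values,
)


def register_key_flag_everywhere(flag_name, fv=flag_values):
    """Register flag_name as a key flag under '__main__' and sys.argv[0]."""
    flag = fv[flag_name]  # indexing a FlagValues returns the Flag object
    for module_name in ("__main__", sys.argv[0]):
        fv.register_key_flag_for_module(module_name, flag)


register_key_flag_everywhere("experiment_config")
```

With both names registered, the flag is found whether the help renderer looks up `'__main__'` or the launcher path in `sys.argv[0]`.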
TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
/gcbrun
This prevents baking local evaluation results and Jetski agent state into the built Docker image, keeping it clean and reducing image size. TAG=agy CONV=5c0ca3b4-cd35-4f4b-aa14-bc902aaaf0c7
/gcbrun
wangauone approved these changes on May 12, 2026
Description
This PR contains critical stability fixes for the Cloud Run deployment (supporting the recently merged package renaming and dependency decoupling) and significantly polishes the user experience of the standalone CLI tool when run via launchers like `uvx`.
What Changed
☁️ Cloud Run Deployment Stability
- `uv sync` now uses the `--all-packages` flag. This ensures that even though `viewer` UI dependencies are decoupled from the core package, they are still resolved and installed in the container's environment.
- `uv sync` runs twice (pre-copy for dependency caching, post-copy for packaging) to fully compile the workspace during the build phase, preventing runtime attempts to download packages.
🖥️ Standalone CLI UX Polishing (`absl` Help Fixes)
- `--experiment_config` is now declared as a key flag so it displays directly in the basic `--help` output instead of being hidden inside `--helpfull`.
- The `run()` entrypoint now fixes `absl-py` help rendering when wrapped by `uv` launcher scripts:
  - Key flags are also registered under `sys.argv[0]` (the launcher's filename) to bypass a lookup discrepancy in `absl-py`.
  - `sys.argv[0]` is replaced with its basename to hide the long temporary path (e.g., `/tmp/.../google-evalbench`) in the help output header.
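The basename replacement described above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in `run()`: the helper name `shorten_program_name` and the example path are hypothetical.

```python
import os


def shorten_program_name(argv):
    """Return a copy of argv with argv[0] reduced to its basename."""
    if not argv:
        return []
    return [os.path.basename(argv[0]), *argv[1:]]


# Hypothetical launcher path, standing in for the temporary uvx location.
example_argv = ["/tmp/example-cache/bin/google-evalbench", "--help"]
print(shorten_program_name(example_argv))
# prints ['google-evalbench', '--help']
```

An entrypoint could apply this with `sys.argv = shorten_program_name(sys.argv)` before flag parsing, so the help header shows only the short program name.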
Verification
- `uv run evalbench/evalbench.py --help` and `uvx --no-cache --from . google-evalbench --help` both output clean, professional help screens with the correct key flags.
- `make container` builds cleanly and all local services start up successfully.