ci: add GitHub Actions workflow to run tests on PRs#99
Conversation
7a61a37 to
9cd77ab
Compare
Previously no CI workflow ran pytest on pull requests, so regressions could land undetected. This adds a test.yml workflow triggered on PRs and pushes to main. - Split into three parallel jobs: core, inspect, helm - Core job runs tests that need no optional dependencies - Inspect and HELM jobs install their respective extras - Pin action versions to match regenerate_types.yml - Add timeout-minutes: 10 to guard against hung jobs - Update uv.lock Closes evaleval#88
9cd77ab to
7f66665
Compare
|
@nelaturuharsha not sure if this is what you had in mind -- happy to update it in any way you'd prefer! |
|
Would it make sense to install all requirements and then just run all tests? I worry that having lines like: will not be updated if more files with relevant inspect tests get added. It would feel safer to me if the test command was just |
Erotemic
left a comment
There was a problem hiding this comment.
I think the current PR could be accepted as is just to get something running, but I think we should avoid the globs if possible, and I think that adding locked/loose and core/full runs gives the best bang for the buck in terms of running different flavors of CI.
| name: Tests | ||
|
|
||
| on: | ||
| pull_request: | ||
| push: | ||
| branches: [main] | ||
|
|
||
| jobs: | ||
| core: | ||
| name: Core tests | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 10 | ||
| steps: | ||
| - uses: actions/checkout@v6.0.2 | ||
| - uses: astral-sh/setup-uv@v7.6.0 | ||
|
|
||
| - name: Install core dependencies | ||
| run: uv sync --locked | ||
|
|
||
| - name: Run core tests | ||
| run: | | ||
| uv run pytest \ | ||
| tests/test_validate.py \ | ||
| tests/test_check_duplicate_entries.py \ | ||
| tests/test_inspect_uuid_utils.py \ | ||
| tests/test_cli_inspect_uuid.py \ | ||
| tests/test_lm_eval_adapter.py \ | ||
| -v | ||
|
|
||
| inspect: | ||
| name: Inspect converter tests | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 10 | ||
| steps: | ||
| - uses: actions/checkout@v6.0.2 | ||
| - uses: astral-sh/setup-uv@v7.6.0 | ||
|
|
||
| - name: Install dependencies with inspect extra | ||
| run: uv sync --locked --extra inspect | ||
|
|
||
| - name: Run inspect tests | ||
| run: | | ||
| uv run pytest \ | ||
| tests/test_inspect_adapter.py \ | ||
| tests/test_inspect_instance_level_adapter.py \ | ||
| -v | ||
|
|
||
| helm: | ||
| name: HELM converter tests | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 10 | ||
| steps: | ||
| - uses: actions/checkout@v6.0.2 | ||
| - uses: astral-sh/setup-uv@v7.6.0 | ||
|
|
||
| - name: Install dependencies with helm extra | ||
| run: uv sync --locked --extra helm | ||
|
|
||
| - name: Run HELM tests | ||
| run: | | ||
| uv run pytest \ | ||
| tests/test_helm_adapter.py \ | ||
| tests/test_helm_instance_level_adapter.py \ | ||
| -v |
There was a problem hiding this comment.
| name: Tests | |
| on: | |
| pull_request: | |
| push: | |
| branches: [main] | |
| jobs: | |
| core: | |
| name: Core tests | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install core dependencies | |
| run: uv sync --locked | |
| - name: Run core tests | |
| run: | | |
| uv run pytest \ | |
| tests/test_validate.py \ | |
| tests/test_check_duplicate_entries.py \ | |
| tests/test_inspect_uuid_utils.py \ | |
| tests/test_cli_inspect_uuid.py \ | |
| tests/test_lm_eval_adapter.py \ | |
| -v | |
| inspect: | |
| name: Inspect converter tests | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install dependencies with inspect extra | |
| run: uv sync --locked --extra inspect | |
| - name: Run inspect tests | |
| run: | | |
| uv run pytest \ | |
| tests/test_inspect_adapter.py \ | |
| tests/test_inspect_instance_level_adapter.py \ | |
| -v | |
| helm: | |
| name: HELM converter tests | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install dependencies with helm extra | |
| run: uv sync --locked --extra helm | |
| - name: Run HELM tests | |
| run: | | |
| uv run pytest \ | |
| tests/test_helm_adapter.py \ | |
| tests/test_helm_instance_level_adapter.py \ | |
| -v | |
| name: Tests | |
| on: | |
| pull_request: | |
| push: | |
| branches: [main] | |
| permissions: | |
| contents: read | |
| jobs: | |
| tests: | |
| name: ${{ matrix.deps }} / ${{ matrix.resolution }} | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| strategy: | |
| fail-fast: false | |
| matrix: | |
| include: | |
| - deps: core | |
| resolution: locked | |
| sync: uv sync --locked | |
| pytest_args: >- | |
| tests | |
| --ignore-glob=tests/test_inspect*.py | |
| --ignore-glob=tests/test_helm*.py | |
| -v | |
| - deps: core | |
| resolution: loose | |
| sync: uv sync --upgrade | |
| pytest_args: >- | |
| tests | |
| --ignore-glob=tests/test_inspect*.py | |
| --ignore-glob=tests/test_helm*.py | |
| -v | |
| - deps: full | |
| resolution: locked | |
| sync: uv sync --locked --all-extras | |
| pytest_args: tests -v | |
| - deps: full | |
| resolution: loose | |
| sync: uv sync --upgrade --all-extras | |
| pytest_args: tests -v | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install dependencies | |
| run: ${{ matrix.sync }} | |
| - name: Run tests | |
| run: uv run pytest ${{ matrix.pytest_args }} |
There was a problem hiding this comment.
This is the style of CI that I like to use. I also like to use a "core" or "minimal" test of tests that just target the plain library without any extras. When dealing with extras instead of worrying about diffrent combinations I like to use a "full" version where all the extras are installed. We could split out the inspect / helm versions but I think it's not worth the effort.
What I do think is worth the effort is a locked / loose install check. I find it extremely valuable to test the locked version of the deps as well as the deps that would be installed if a user pip installed today. This gives a strong signal when an upstream package causes a break in this package.
For now I think it's fine to use the --ignore-glob, but really we should just use pytest to mark which tests should / shouldn't run based on the installed packages, but the glob method is good enough for now.
|
Closed by #135 |
Previously no CI workflow ran pytest on pull requests, so regressions could land undetected. This adds a test.yml workflow triggered on PRs and pushes to main.
Closes #88