QUICKSTART — KernelCI pull-lab integration

This guide gets pullab_cloud polling KernelCI for pull-lab jobs and pushing results to KCIDB. The flow is:

pull_labs_poller polls kernelci-api (/events)
pull_labs_poller translates the job definition
pull_labs_poller calls the existing AWS pipeline with the translated config
pull_labs_poller submits tests-only revisions to kcidb-restd-rs

For the underlying AWS setup (IAM roles, S3 buckets, ECS cluster, ECR image, etc.) see the main README.md. This file only covers the KernelCI/KCIDB wiring on top of an already-working AWS pipeline.

1. Set up the AWS pipeline first

The KernelCI poller drives the existing AWS pipeline — it does not provision AWS resources on its own. Before continuing, make sure the pipeline can run a job end-to-end. The full walkthrough lives in README.md; the minimum steps are:

Install the package in a venv — Python 3.11 required, see README → Installation:

python3.11 -m venv .venv && source .venv/bin/activate
python3.11 -m pip install -e .

Configure AWS credentials — see README 1. Configure AWS Credentials. Either aws configure / an IAM role, or drop examples/aws/credentials.json in place.
Generate the pipeline config — see README 2. Configure the project:
```
kernel-ci-cloud-runner aws setup configure \
  --prefix kernel-ci-$USER- --region us-west-2
```
This populates examples/aws/config.json with unique S3/IAM/ECS/ECR names. Use --output my-config.json to write to a different path (then pass PULLAB_BASE_CONFIG=my-config.json to the poller).
Verify the pipeline works — see README 3. Run integration test to verify setup:
```
pytest tests/integration/ -v -m integration
```
Look for VMs: 2/2 spawned, 2 successful, 0 failed. If this passes, the AWS side is ready and you can proceed below.

For a fast pre-flight check (no VMs spawned) of AWS permissions, the results bucket, and KernelCI/KCIDB tokens, run:
```
kernel-ci-cloud-runner aws setup validate \
  --bucket kernel-ci-$USER-results --role kernel-ci-$USER-vm-role
```
See README → Validate setup. The KERNELCI_API_TOKEN / KCIDB_JWT / UNIFIED_TOKEN env vars set in section 3 below are picked up automatically.

If your jobs install custom kernels, also follow README 4. Upload kernel RPMs.

To tear everything down afterwards, see README 7. Clean up resources.

Prerequisites

A working AWS pipeline — kernel-ci-cloud-runner aws setup configure has been run and examples/aws/config.json is populated (see section 1 above).
A reachable kernelci-api with at least one pull-lab job scheduled to pull-labs-aws-ec2 (or whatever runtime name you use).
A reachable kcidb-restd-rs /submit endpoint.
A JWT signed with the kcidb-restd-rs unified_secret carrying the origin you'll use for these rows.

2. Configure the kernelci section

Open examples/aws/config.json and edit the kernelci block that was added alongside test_config:

"kernelci": {
  "api_base_uri":    "https://staging.kernelci.org:9000/latest",
  "api_token":       null,
  "runtime_name":    "pull-labs-aws-ec2",
  "poll_interval_sec": 30,
  "cursor_file":     "/tmp/pullab_cloud_cursor.json",
  "kcidb_submit_url":"https://db.kernelci.org/submit",
  "kcidb_origin":    "pullab_cloud_aws",
  "kcidb_jwt":       null
}

Secrets (api_token, kcidb_jwt) are normally injected via environment variables, not committed to the file:

| Env var | Falls back to | Purpose |
|---|---|---|
| `KERNELCI_API_BASE_URI` | `kernelci.api_base_uri` | API URL |
| `KERNELCI_API_TOKEN` | `kernelci.api_token` | Bearer token for the API (optional for public endpoints) |
| `KERNELCI_RUNTIME_NAME` | `kernelci.runtime_name` | Lab/runtime to consume jobs for |
| `KERNELCI_PLATFORMS` | `kernelci.platforms` | Comma-separated allowlist of `node.data.platform` values. Optional; used to split one runtime label across multiple pollers (e.g. arm64 vs x86_64). Omit to accept any platform. |
| `KCIDB_SUBMIT_URL` | `kernelci.kcidb_submit_url` | kcidb-restd-rs submit URL |
| `KCIDB_JWT` | `kernelci.kcidb_jwt` | JWT bearer token |
| `KCIDB_REST` | (alternative to the two above) | `https://<token>@host[/path]`: kci-dev-compatible single URL carrying both endpoint and token |
| `UNIFIED_TOKEN` | (fallback for `KERNELCI_API_TOKEN` and `KCIDB_JWT`) | Single token used for both when the dedicated env vars aren't set. Lower priority than the specific vars, higher than config-file values. |
| `KCIDB_ORIGIN` | `kernelci.kcidb_origin` | Origin string in submitted rows |
| `PULLAB_CURSOR_FILE` | `kernelci.cursor_file` | Where to persist the poll cursor |
| `PULLAB_POLL_INTERVAL_SEC` | `kernelci.poll_interval_sec` | Sleep between empty polls |
| `PULLAB_BASE_CONFIG` | `examples/aws/config.json` | Path to base config |
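
The precedence rules in the table can be sketched in a few lines of Python. This is a minimal illustration, not the poller's actual code: `resolve_token` and `split_kcidb_rest` are hypothetical helper names, and treating an empty string as unset is an assumption. Only the ordering (dedicated env var, then `UNIFIED_TOKEN`, then the config-file value) and the `https://<token>@host[/path]` shape come from the table.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import urlsplit


def resolve_token(env: Mapping[str, str], specific_var: str,
                  config_value: Optional[str]) -> Optional[str]:
    """Pick a token: dedicated env var, then UNIFIED_TOKEN, then config file."""
    # Assumption: empty strings count as "not set".
    return env.get(specific_var) or env.get("UNIFIED_TOKEN") or config_value


def split_kcidb_rest(url: str) -> Tuple[str, Optional[str]]:
    """Split https://<token>@host[/path] into (submit_url, token)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # The token travels in the userinfo part of the URL; strip it from the endpoint.
    return f"{parts.scheme}://{host}{parts.path}", parts.username
```

With `UNIFIED_TOKEN` set and `KCIDB_JWT` unset, `resolve_token(os.environ, "KCIDB_JWT", None)` would return the unified token; setting `KCIDB_JWT` as well would make the dedicated value win.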

3. Run a single poll cycle (dry test)

export KCIDB_JWT="eyJ...your.token..."
make poller-once
# or:
python -m kernel_ci_cloud_labs.pull_labs_poller --config examples/aws/config.json --once

What happens:

Fetches /events?state=available&kind=job&recursive=true&from=<cursor>.
Skips events whose node.data.data.runtime ≠ runtime_name.
For each matching event, downloads node.artifacts.job_definition JSON.
Walks node.parent to find the kbuild ancestor and builds build_id = "<kcidb_origin>:<kbuild_node_id>".
Translates the job → test_config.vms[*] and calls run_pipeline().
Submits one tests-only KCIDB revision per job.
Persists the latest event timestamp to the cursor file.
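
The build_id step in the list above can be sketched as a bounded walk up the parent chain. This is an illustration under stated assumptions: `fetch_node` is a hypothetical callable standing in for a GET on `/node/{id}`, nodes are assumed to be dicts with `id`, `kind`, and `parent` keys, and the 8-hop limit is the one mentioned in the Troubleshooting section.

```python
from typing import Callable, Dict, Optional

Node = Dict[str, Optional[str]]
MAX_HOPS = 8  # hop limit mentioned in the Troubleshooting section


def resolve_build_id(node: Node, fetch_node: Callable[[str], Optional[Node]],
                     kcidb_origin: str) -> Optional[str]:
    """Return "<kcidb_origin>:<kbuild_node_id>", or None if no kbuild ancestor is found."""
    current: Optional[Node] = node
    for _ in range(MAX_HOPS):
        parent_id = current.get("parent") if current else None
        if not parent_id:
            return None  # ran out of ancestors without seeing a kbuild node
        current = fetch_node(parent_id)
        if current is not None and current.get("kind") == "kbuild":
            return f"{kcidb_origin}:{current['id']}"
    return None  # gave up after MAX_HOPS lookups


# Tiny in-memory stand-in for the API, for illustration only.
NODES = {
    "k1": {"id": "k1", "kind": "kbuild", "parent": "c1"},
    "j0": {"id": "j0", "kind": "job", "parent": "k1"},
}
job = {"id": "j1", "kind": "job", "parent": "j0"}
print(resolve_build_id(job, NODES.get, "pullab_cloud_aws"))  # pullab_cloud_aws:k1
```

Returning `None` rather than raising mirrors the `Could not resolve build_id` warning: the caller decides whether a missing build is fatal.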

4. Run as a long-lived service

make poller
# or:
python -m kernel_ci_cloud_labs.pull_labs_poller --config examples/aws/config.json

When a poll returns nothing to do, the poller sleeps for PULLAB_POLL_INTERVAL_SEC seconds (default 30) before polling again.

5. Run in AWS Lambda

The same module exposes lambda_handler(event, context) that runs one poll cycle per invocation. Wire it to an EventBridge schedule (e.g. every minute) and set the env vars on the Lambda function. The cursor file lives on /tmp by default — fine for steady polling, but configure PULLAB_CURSOR_FILE to a persistent path (or write a custom CursorStore backed by S3/DynamoDB) if you need true cross-cold-start deduplication.

Lambda handler entry point: kernel_ci_cloud_labs.pull_labs_poller.lambda_handler
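
Whatever backs the cursor, the essential behavior is to load the last position and save the new one atomically. The sketch below is illustrative only: the `FileCursorStore` name, its `load()`/`save()` methods, and the `{"timestamp": ...}` file layout are assumptions, not the interface of the project's CursorStore.

```python
import json
import os
import tempfile
from typing import Optional


class FileCursorStore:
    """Hypothetical cursor store that keeps the last event timestamp in a JSON file."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> Optional[str]:
        """Return the saved timestamp, or None when the file is missing or unreadable."""
        try:
            with open(self.path, encoding="utf-8") as fh:
                return json.load(fh).get("timestamp")
        except (OSError, ValueError, AttributeError):
            return None

    def save(self, timestamp: str) -> None:
        """Write to a temp file in the same directory, then rename over the target."""
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"timestamp": timestamp}, fh)
        os.replace(tmp_path, self.path)  # atomic on POSIX within one filesystem
```

The same two methods could be backed by S3 or DynamoDB instead of a local path; the write-then-rename step is what keeps a crash mid-write from leaving a corrupt cursor.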

6. Run in a container

The poller has no AWS-specific imports at the top level (other than the default executor which calls into the existing AWS pipeline). For a custom executor, instantiate PullLabsPoller directly:

from kernel_ci_cloud_labs.pull_labs_poller import PullLabsPoller
poller = PullLabsPoller(config, job_executor=my_executor)
poller.run_forever()

my_executor(run_config) -> (test_rows, log_url) is called once per job; test_rows is a list of {"name": str, "status": str, "duration_ms": Optional[int]} dicts.
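
As a starting point, an executor can be as small as the stub below. It is a sketch, not part of the package: the `test_config.vms` lookup and per-VM `name` key are assumptions about the shape of `run_config`, and the log URL is a placeholder. Only the return shape comes from the description above.

```python
from typing import Any, Dict, List, Tuple

TestRow = Dict[str, Any]  # {"name": str, "status": str, "duration_ms": Optional[int]}


def my_executor(run_config: Dict[str, Any]) -> Tuple[List[TestRow], str]:
    """Stub executor: reports every configured VM as skipped without running anything."""
    # Assumption: the translated config lists VMs under test_config.vms, each with a name.
    vms = run_config.get("test_config", {}).get("vms", [])
    test_rows = [
        {"name": vm.get("name", f"vm-{index}"), "status": "SKIP", "duration_ms": None}
        for index, vm in enumerate(vms)
    ]
    # Placeholder URL; a real executor would return where it uploaded its logs.
    log_url = "https://example.invalid/logs/stub"
    return test_rows, log_url
```

A stub like this is handy for exercising the poll, translate, and submit path in a container before wiring in a real backend.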

7. Verify

Events reach the poller: run --once with --log-level DEBUG and confirm the poll URL and event count are logged.
Cursor advances: run cat /tmp/pullab_cloud_cursor.json after a cycle and confirm the stored position has moved forward.
KCIDB receives rows: if kcidb-restd-rs is local, check its spool_directory for a submission-*.json. Open one and confirm tests[*] rows have your origin, a build_id of the form <origin>:<kbuild_node_id>, and statuses in {PASS, FAIL, SKIP, ERROR, MISS, DONE}.
AWS run actually ran: the existing pipeline logs land under logs/run_*/ and s3://<results-bucket>/run_pulllab-*/.
Payload shape (optional): if you have kci-dev installed, you can sanity-check our submission shape by capturing one payload (with logging) and piping it through:
```
kci-dev submit build --from-json <captured.json> --origin <kcidb_origin> --dry-run
```
Our poller speaks the same KCIDB v5.3 schema, so this should round-trip cleanly.
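
The row checks in the list above can be automated with a short script. This is a sketch under assumptions: `check_submission` is a hypothetical helper, and it assumes each entry in `tests` carries `origin`, `build_id`, and `status` keys as the checklist describes.

```python
from typing import Any, Dict, List

ALLOWED_STATUSES = {"PASS", "FAIL", "SKIP", "ERROR", "MISS", "DONE"}


def check_submission(payload: Dict[str, Any], origin: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the rows look fine."""
    problems: List[str] = []
    tests = payload.get("tests", [])
    if not tests:
        problems.append("payload has no tests rows")
    for index, row in enumerate(tests):
        if row.get("origin") != origin:
            problems.append(f"tests[{index}]: origin is {row.get('origin')!r}")
        build_id = str(row.get("build_id", ""))
        # Expect "<origin>:<kbuild_node_id>" with a non-empty node id.
        if not build_id.startswith(f"{origin}:") or build_id == f"{origin}:":
            problems.append(f"tests[{index}]: build_id is {build_id!r}")
        if row.get("status") not in ALLOWED_STATUSES:
            problems.append(f"tests[{index}]: status is {row.get('status')!r}")
    return problems
```

To use it, load one submission-*.json with `json.load` and pass the result along with your `kcidb_origin`; any returned strings point at the rows to inspect.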

Troubleshooting

| Symptom | Likely cause |
|---|---|
| `Missing required configuration: kernelci.kcidb_jwt` | env var not set and config value is null |
| Events come back but none are processed | `runtime_name` mismatch with what the scheduler set; check `node.data.data.runtime` in a raw event |
| `Could not resolve build_id` warning | Job node has no `kbuild` ancestor reachable within 8 hops, or `api_token` is missing for a protected `/node/{id}` endpoint |
| `Translation failed … missing required artifacts.kernel` | The KernelCI build that produced this job did not upload a kernel image |
| HTTP 401 from KCIDB submit | JWT not signed by the kcidb-restd-rs `unified_secret`, expired, or origin claim mismatch |
| Same job processed repeatedly | Cursor file path not persistent across restarts (Lambda `/tmp` is ephemeral across cold starts) |
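
For the HTTP 401 case, it can help to look inside the token before resending it. The sketch below decodes the JWT payload without verifying the signature, so it shows only what the token claims, not whether kcidb-restd-rs will accept it. The `origin` claim name is an assumption based on the table above; `exp` is the standard JWT expiry claim. Both function names are hypothetical.

```python
import base64
import json
import time
from typing import Any, Dict


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Decode the middle (payload) segment of a JWT. Does NOT verify the signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = segments[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def describe_token(token: str, expected_origin: str) -> str:
    """Summarize the two claims most likely to cause a 401."""
    claims = decode_jwt_payload(token)
    exp = claims.get("exp")
    expired = exp is not None and exp < time.time()
    origin_ok = claims.get("origin") == expected_origin  # claim name is assumed
    return f"expired={expired} origin_matches={origin_ok}"
```

If both checks look fine, the remaining cause from the table is a signature made with the wrong secret, which can only be confirmed on the server side.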

TODO:

The poller does not yet change the job state from available. Deduplication relies on the cursor alone, so if the poller restarts it may re-process some events.
