This guide gets pullab_cloud polling KernelCI for pull-lab jobs and pushing results to KCIDB. The flow is:
- pull_labs_poller polls kernelci-api (/events)
- pull_labs_poller translates the job definition
- pull_labs_poller calls the existing AWS pipeline with the translated config
- pull_labs_poller submits tests-only revisions to kcidb-restd-rs
For the underlying AWS setup (IAM roles, S3 buckets, ECS cluster, ECR
image, etc.) see the main README.md. This file only covers the
KernelCI/KCIDB wiring on top of an already-working AWS pipeline.
The KernelCI poller drives the existing AWS pipeline — it does not
provision AWS resources on its own. Before continuing, make sure the
pipeline can run a job end-to-end. The full walkthrough lives in
README.md; the minimum steps are:
- Install the package in a venv (Python 3.11 required; see README → Installation):

      python3.11 -m venv .venv && source .venv/bin/activate
      python3.11 -m pip install -e .

- Configure AWS credentials (see README 1. Configure AWS Credentials). Either use `aws configure` / an IAM role, or drop `examples/aws/credentials.json` in place.

- Generate the pipeline config (see README 2. Configure the project):

      kernel-ci-cloud-runner aws setup configure \
        --prefix kernel-ci-$USER- --region us-west-2

  This populates `examples/aws/config.json` with unique S3/IAM/ECS/ECR names. Use `--output my-config.json` to write to a different path (then pass `PULLAB_BASE_CONFIG=my-config.json` to the poller).

- Verify the pipeline works (see README 3. Run integration test to verify setup):

      pytest tests/integration/ -v -m integration

  Look for `VMs: 2/2 spawned, 2 successful, 0 failed`. If this passes, the AWS side is ready and you can proceed below.

For a fast pre-flight check (no VMs spawned) of AWS permissions, the results bucket, and KernelCI/KCIDB tokens, run:

    kernel-ci-cloud-runner aws setup validate \
      --bucket kernel-ci-$USER-results --role kernel-ci-$USER-vm-role

See README → Validate setup. The `KERNELCI_API_TOKEN` / `KCIDB_JWT` / `UNIFIED_TOKEN` env vars set in section 3 below are picked up automatically.

If your jobs install custom kernels, also follow README 4. Upload kernel RPMs.
To tear everything down afterwards, see README 7. Clean up resources.

- A working AWS pipeline: `kernel-ci-cloud-runner aws setup configure` has been run and `examples/aws/config.json` is populated (see section 1 above).
- A reachable kernelci-api with at least one pull-lab job scheduled to `pull-labs-aws-ec2` (or whatever runtime name you use).
- A reachable kcidb-restd-rs `/submit` endpoint.
- A JWT signed with the kcidb-restd-rs `unified_secret`, carrying the origin you'll use for these rows.

Open `examples/aws/config.json` and edit the `kernelci` block that was
added alongside `test_config`:

    "kernelci": {
      "api_base_uri": "https://staging.kernelci.org:9000/latest",
      "api_token": null,
      "runtime_name": "pull-labs-aws-ec2",
      "poll_interval_sec": 30,
      "cursor_file": "/tmp/pullab_cloud_cursor.json",
      "kcidb_submit_url": "https://db.kernelci.org/submit",
      "kcidb_origin": "pullab_cloud_aws",
      "kcidb_jwt": null
    }

Secrets (`api_token`, `kcidb_jwt`) are normally injected via environment
variables, not committed to the file:

| Env var | Falls back to | Purpose |
|---|---|---|
| `KERNELCI_API_BASE_URI` | `kernelci.api_base_uri` | API URL |
| `KERNELCI_API_TOKEN` | `kernelci.api_token` | Bearer token for the API (optional for public endpoints) |
| `KERNELCI_RUNTIME_NAME` | `kernelci.runtime_name` | Lab/runtime to consume jobs for |
| `KERNELCI_PLATFORMS` | `kernelci.platforms` | Comma-separated allowlist of `node.data.platform` values. Optional; used to split one runtime label across multiple pollers (e.g. arm64 vs x86_64). Omit to accept any platform. |
| `KCIDB_SUBMIT_URL` | `kernelci.kcidb_submit_url` | kcidb-restd-rs submit URL |
| `KCIDB_JWT` | `kernelci.kcidb_jwt` | JWT bearer token |
| `KCIDB_REST` | (alternative to the two above) | `https://<token>@host[/path]`: kci-dev-compatible single URL carrying both endpoint and token |
| `UNIFIED_TOKEN` | (fallback for `KERNELCI_API_TOKEN` and `KCIDB_JWT`) | Single token used for both when the dedicated env vars aren't set. Lower priority than the specific vars, higher than config-file values. |
| `KCIDB_ORIGIN` | `kernelci.kcidb_origin` | Origin string in submitted rows |
| `PULLAB_CURSOR_FILE` | `kernelci.cursor_file` | Where to persist the poll cursor |
| `PULLAB_POLL_INTERVAL_SEC` | `kernelci.poll_interval_sec` | Sleep between empty polls |
| `PULLAB_BASE_CONFIG` | `examples/aws/config.json` | Path to base config |

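
The precedence described in the table (dedicated env var, then `UNIFIED_TOKEN`, then the config-file value) can be sketched roughly as follows. This is an illustration only, not the poller's actual code: `resolve_token` and `split_kcidb_rest` are hypothetical helper names, and the `KCIDB_REST` handling assumes the token sits in the userinfo part of the URL, as the `https://<token>@host[/path]` form suggests.

```python
import os
from typing import Mapping, Optional, Tuple
from urllib.parse import urlsplit, urlunsplit


def resolve_token(env_name: str, config_value: Optional[str],
                  env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Hypothetical helper: dedicated env var > UNIFIED_TOKEN > config file."""
    return env.get(env_name) or env.get("UNIFIED_TOKEN") or config_value


def split_kcidb_rest(url: str) -> Tuple[str, str]:
    """Hypothetical helper: split https://<token>@host[/path] into (submit_url, token)."""
    parts = urlsplit(url)
    token = parts.username or ""
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    # Rebuild the URL without the userinfo so the token is not sent in the URL.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, "")), token
```

For example, `resolve_token("KCIDB_JWT", config["kernelci"]["kcidb_jwt"])` would return the env var if set, otherwise `UNIFIED_TOKEN`, otherwise the value from the file.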

    export KCIDB_JWT="eyJ...your.token..."
    make poller-once
    # or:
    python -m kernel_ci_cloud_labs.pull_labs_poller --config examples/aws/config.json --once

What happens:

- Fetches `/events?state=available&kind=job&recursive=true&from=<cursor>`.
- Skips events whose `node.data.data.runtime` ≠ `runtime_name`.
- For each matching event, downloads the `node.artifacts.job_definition` JSON.
- Walks `node.parent` to find the `kbuild` ancestor and builds `build_id = "<kcidb_origin>:<kbuild_node_id>"`.
- Translates the job into `test_config.vms[*]` and calls `run_pipeline()`.
- Submits one tests-only KCIDB revision per job.
- Persists the latest event `timestamp` to the cursor file.

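
The runtime filter and the `build_id` lookup from the steps above might look roughly like this. It is a sketch under stated assumptions, not the poller's implementation: events are treated as plain dicts laid out exactly as the dotted paths read, nodes are assumed to carry `id`, `kind`, and `parent` fields, and `get_node` is a hypothetical callable standing in for a `/node/{id}` request. The 8-hop limit is the one mentioned in the troubleshooting table.

```python
from typing import Callable, Iterable, List, Optional


def matching_events(events: Iterable[dict], runtime_name: str) -> List[dict]:
    """Keep events whose node.data.data.runtime equals runtime_name."""
    kept = []
    for event in events:
        data = event.get("node", {}).get("data", {}).get("data", {})
        if data.get("runtime") == runtime_name:
            kept.append(event)
    return kept


def resolve_build_id(node: dict, get_node: Callable[[str], Optional[dict]],
                     origin: str, max_hops: int = 8) -> Optional[str]:
    """Walk node.parent until a kbuild node is found, giving up after max_hops."""
    current = node
    for _ in range(max_hops):
        parent_id = current.get("parent")
        if not parent_id:
            return None  # reached the root without seeing a kbuild node
        current = get_node(parent_id)  # assumption: returns None if unavailable
        if current is None:
            return None
        if current.get("kind") == "kbuild":
            return f"{origin}:{current['id']}"
    return None
```

Returning `None` rather than raising mirrors the fact that an unresolved `build_id` is reported as a warning, not a fatal error.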

    make poller
    # or:
    python -m kernel_ci_cloud_labs.pull_labs_poller --config examples/aws/config.json

When there is nothing to do, the poller sleeps for `PULLAB_POLL_INTERVAL_SEC` seconds between polls (default 30).

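
As a rough picture of that behaviour, the loop below sleeps only after a cycle that handled nothing. It is a simplified sketch: `run_loop`, `poll_once` (assumed to return the number of jobs handled), and `max_cycles` are hypothetical names introduced here, not part of the poller's API.

```python
import time
from typing import Callable


def run_loop(poll_once: Callable[[], int], interval_sec: float = 30.0,
             sleep: Callable[[float], None] = time.sleep,
             max_cycles: int = 0) -> int:
    """Call poll_once repeatedly; sleep only after a cycle that handled no jobs.

    max_cycles=0 means run until interrupted; a positive value bounds the loop.
    """
    cycles = 0
    while max_cycles == 0 or cycles < max_cycles:
        handled = poll_once()
        cycles += 1
        if handled == 0:
            sleep(interval_sec)  # nothing to do: back off before the next poll
    return cycles
```

Injecting `sleep` keeps the loop easy to test without actually waiting.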
The same module exposes lambda_handler(event, context) that runs one
poll cycle per invocation. Wire it to an EventBridge schedule (e.g.
every minute) and set the env vars on the Lambda function. The cursor
file lives on /tmp by default — fine for steady polling, but configure
PULLAB_CURSOR_FILE to a persistent path (or write a custom
CursorStore backed by S3/DynamoDB) if you need true cross-cold-start
deduplication.
Lambda handler entry point:
kernel_ci_cloud_labs.pull_labs_poller.lambda_handler
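
To illustrate what a replacement cursor store could look like, here is a minimal file-backed sketch. The real `CursorStore` interface is not shown in this guide, so the `load()`/`save()` method names and the `{"timestamp": ...}` file layout are assumptions; an S3- or DynamoDB-backed store would presumably follow the same two-method shape with the file I/O swapped for the corresponding client calls.

```python
import json
import os
import tempfile
from typing import Optional


class FileCursorStore:
    """Hypothetical file-backed cursor store; the real CursorStore interface may differ."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> Optional[str]:
        """Return the saved event timestamp, or None if missing or unreadable."""
        try:
            with open(self.path, encoding="utf-8") as fh:
                return json.load(fh).get("timestamp")
        except (OSError, ValueError, AttributeError):
            return None

    def save(self, timestamp: str) -> None:
        """Write atomically: temp file in the same directory, then os.replace."""
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"timestamp": timestamp}, fh)
        os.replace(tmp_path, self.path)
```

The atomic replace means an interrupted invocation leaves either the old cursor or the new one, never a half-written file.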

The poller has no AWS-specific imports at the top level (other than the
default executor, which calls into the existing AWS pipeline). For a
custom executor, instantiate `PullLabsPoller` directly:

    from kernel_ci_cloud_labs.pull_labs_poller import PullLabsPoller

    poller = PullLabsPoller(config, job_executor=my_executor)
    poller.run_forever()

`my_executor(run_config) -> (test_rows, log_url)` is called once per
job; `test_rows` is a list of `{"name": str, "status": str, "duration_ms": Optional[int]}` dicts.

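
A stub executor matching that signature could look like the sketch below. It runs nothing and reports one `SKIP` row per VM, which can be handy for exercising the poll and submit path without cloud resources. The assumptions here are that `run_config` carries the translated `test_config.vms` list, that each VM entry may have a `name` key, and that `None` is an acceptable `log_url`; none of these are confirmed by this guide.

```python
from typing import List, Optional, Tuple


def my_executor(run_config: dict) -> Tuple[List[dict], Optional[str]]:
    """Stub executor: reports one SKIP row per VM instead of running anything."""
    # Assumption: the translated job exposes its VMs under test_config.vms.
    vms = run_config.get("test_config", {}).get("vms", [])
    test_rows = [
        {"name": vm.get("name", f"vm-{index}"), "status": "SKIP", "duration_ms": None}
        for index, vm in enumerate(vms)
    ]
    log_url = None  # a real executor would return where its logs were uploaded
    return test_rows, log_url
```

Passing this function as `job_executor=my_executor` keeps the rest of the poller unchanged.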

- Events reach the poller: run `--once` with `--log-level DEBUG` and confirm the poll URL and event count are logged.
- Cursor advances: run `cat /tmp/pullab_cloud_cursor.json` after a cycle and check that the value has changed.
- KCIDB receives rows: if `kcidb-restd-rs` is local, check its `spool_directory` for a `submission-*.json`. Open one and confirm `tests[*]` rows have your `origin`, a `build_id` of the form `<origin>:<kbuild_node_id>`, and statuses in `{PASS, FAIL, SKIP, ERROR, MISS, DONE}`.
- AWS run actually ran: the existing pipeline logs land under `logs/run_*/` and `s3://<results-bucket>/run_pulllab-*/`.
- Payload shape (optional): if you have `kci-dev` installed, you can sanity-check our submission shape by capturing one payload (with logging) and piping it through:

      kci-dev submit build --from-json <captured.json> --origin <kcidb_origin> --dry-run

  Our poller speaks the same KCIDB v5.3 schema, so this should round-trip cleanly.

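
The manual check of a spooled `submission-*.json` can also be scripted. The sketch below only verifies the three properties listed above (origin, `build_id` prefix, status value) and is not a full schema validation; `check_submission` and `check_file` are hypothetical helper names.

```python
import json
from typing import List

VALID_STATUSES = {"PASS", "FAIL", "SKIP", "ERROR", "MISS", "DONE"}


def check_submission(payload: dict, origin: str) -> List[str]:
    """Return a list of human-readable problems; empty means the rows look right."""
    problems = []
    for index, row in enumerate(payload.get("tests", [])):
        if row.get("origin") != origin:
            problems.append(f"tests[{index}]: origin is {row.get('origin')!r}")
        if not str(row.get("build_id", "")).startswith(f"{origin}:"):
            problems.append(f"tests[{index}]: build_id is {row.get('build_id')!r}")
        if row.get("status") not in VALID_STATUSES:
            problems.append(f"tests[{index}]: status is {row.get('status')!r}")
    return problems


def check_file(path: str, origin: str) -> List[str]:
    """Load one spooled submission file and check its tests rows."""
    with open(path, encoding="utf-8") as fh:
        return check_submission(json.load(fh), origin)
```

An empty list from `check_file(path, "pullab_cloud_aws")` means every row passed the three checks.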

| Symptom | Likely cause |
|---|---|
| `Missing required configuration: kernelci.kcidb_jwt` | env var not set and config value is null |
| Events come back but none are processed | `runtime_name` mismatch with what the scheduler set; check `node.data.data.runtime` in a raw event |
| `Could not resolve build_id` warning | Job node has no `kbuild` ancestor reachable within 8 hops, or `api_token` is missing for a protected `/node/{id}` endpoint |
| `Translation failed … missing required artifacts.kernel` | The KernelCI build that produced this job did not upload a kernel image |
| HTTP 401 from KCIDB submit | JWT not signed by the kcidb-restd-rs `unified_secret`, expired, or origin claim mismatch |
| Same job processed repeatedly | Cursor file path not persistent across restarts (Lambda `/tmp` is ephemeral across cold starts) |

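
When chasing an HTTP 401, it can help to look inside the JWT before blaming the server. The sketch below decodes the payload segment without verifying the signature, so it is suitable for inspection only. `jwt_claims` and `is_expired` are hypothetical helpers, `exp` is the standard JWT expiry claim, and the name of the claim that carries the origin is an assumption to check against your kcidb-restd-rs setup.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature (inspection only)."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def is_expired(claims: dict, now: Optional[float] = None) -> bool:
    """True if the standard 'exp' claim is present and in the past."""
    exp = claims.get("exp")
    return exp is not None and exp <= (time.time() if now is None else now)
```

Printing `jwt_claims(os.environ["KCIDB_JWT"])` shows the claims the server will see, which makes an origin mismatch or an expired token easy to spot; a bad signature still has to be diagnosed on the server side.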
- We do not yet change the job state from available. Deduplication relies on the cursor, so if the poller restarts it may re-process some events.