Flaky E2E: paginated exporter listing fails with etcd request timeout #596

@raballew

Summary

The E2E test "paginated exporter listing returns all exporters" (e2e/test/e2e_test.go:343) is flaky. It fails intermittently with an etcd timeout when creating many exporters for pagination testing.

Failure Details

  • Test: Core E2E Tests > Lease operations > paginated exporter listing returns all exporters
  • File: e2e/test/e2e_test.go:343
  • Duration before failure: 2m52s (likely hit a timeout)
  • Root cause: Kubernetes API returns HTTP 500 while creating exporter pagination-exp-93:
    ApiException: (500)
    Reason: Internal Server Error
    HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"etcdserver: request timed out","code":500}
    

Observed In

Analysis

The test creates ~100 exporters sequentially to exercise pagination. Under CI resource constraints, etcd in the kind cluster can time out partway through; in the failure above, the timeout hit while creating exporter 93 of 100. This points to infrastructure/resource pressure rather than a code bug.

Possible Mitigations

  • Reduce the number of exporters created (e.g., from 100 to 25, with a smaller page size)
  • Add retry logic around exporter creation in the test
  • Add a brief sleep between creation batches to reduce etcd pressure
  • Create exporters in parallel with a semaphore to control concurrency

Labels

bug (Something isn't working)
