Reduce Github GetTektonDir API requests from 3 requests to 1 request#2721
Reduce Github GetTektonDir API requests from 3 requests to 1 request#2721aThorp96 wants to merge 2 commits into
Conversation
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2721 +/- ##
==========================================
+ Coverage 59.25% 59.31% +0.05%
==========================================
Files 208 208
Lines 20573 20507 -66
==========================================
- Hits 12191 12163 -28
+ Misses 7610 7582 -28
+ Partials 772 762 -10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review
This pull request refactors the GitHub provider to fetch the .tekton directory using a single GraphQL query, reducing the number of API calls. Review feedback identifies several critical issues: the YAML concatenation logic is non-deterministic due to map iteration, a potential nil pointer dereference exists in the GraphQL client, and the new implementation lacks support for recursive directory structures. Additionally, the mock implementation for tests was found to be inconsistent with the actual GitHub API regarding path handling.
| query = `query($owner: String!, $name: String!, $tektonExpr: String!) { | ||
| repository(owner: $owner, name: $name) { | ||
| %s | ||
| tektonTree: object(expression: $tektonExpr) { | ||
| ... on Tree { | ||
| entries { | ||
| name | ||
| type | ||
| path | ||
| oid | ||
| object { | ||
| ... on Blob { | ||
| text | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| }`, strings.Join(aliases, "\n")) | ||
| }` |
There was a problem hiding this comment.
This change introduces a regression in functionality by losing support for recursive directory structures. The new GraphQL query only fetches the immediate entries of the .tekton directory. The previous REST-based implementation used recursive=true, allowing users to organize their Tekton resources in subdirectories (e.g., .tekton/pull-requests/). With this change, any YAML files located in subfolders will be silently ignored.
References
- Repository pipeline files are conventionally located in the .tekton/ subdirectory, and the system should support the hierarchical structure within this directory.
There was a problem hiding this comment.
This is a bug and not something I caught during testing. Marking this PR as a Draft to prevent merging this regression
| pathInTekton := strings.TrimPrefix(relPath, tektonPath+"/") | ||
| if pathInTekton == "" { | ||
| continue | ||
| } | ||
|
|
||
| // Read file content | ||
| content, err := os.ReadFile(fileInfo.name) | ||
| if err != nil { | ||
| continue | ||
| } | ||
|
|
||
| // Only include .yaml and .yml files (matching implementation behavior) | ||
| if !strings.HasSuffix(relPath, ".yaml") && !strings.HasSuffix(relPath, ".yml") { | ||
| continue | ||
| } | ||
|
|
||
| entry := map[string]any{ | ||
| "name": filepath.Base(pathInTekton), | ||
| "type": "blob", | ||
| "path": pathInTekton, |
There was a problem hiding this comment.
The mock implementation of handleTektonTreeQuery is inconsistent with the real GitHub GraphQL API. The mock returns paths relative to the .tekton directory (e.g., pipeline.yaml), whereas the real GitHub API returns the full path from the repository root (e.g., .tekton/pipeline.yaml) for TreeEntry.path. This discrepancy can cause tests to pass with incorrect assumptions about path formats, which may lead to failures in production or when running against real GitHub instances, especially in error reporting or path matching logic.
References
- Repository pipeline files are identified by their location in a subdirectory (e.g., .tekton/), and path handling should consistently use the full repository path to maintain this convention.
Fetch .tekton directory tree with inline blob contents in single GraphQL query instead of 3 sequential API calls. Since processing any supported event requires fetching the tekton directory, GetTektonDir is a hot path and API calls can add up before without any PipelineRuns executing. Reducing the load in this hot path has an outsized impact. In some cases I saw, this change would have eliminated over 1000 API calls in an hour for a single repository. Assisted-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Delete 7 test cases that duplicate coverage already provided by SetupGitTree integration tests. The SetupGitTree helper automatically constructs GraphQL responses from testdata directories, making manual GraphQL response construction redundant for happy path testing. Deleted from TestFetchTektonDirGraphQL (6 cases): - successful fetch with yaml files - no tekton directory - tekton path is file not directory - mixed yaml and yml extensions - subdirectories with yaml files - empty tekton directory Deleted from TestGetTektonDirGraphQL (1 case): - invalid yaml validation failure (duplicate of existing test) Kept only error cases that cannot be represented via filesystem: - http error, graphql errors, null blob content This reduces test code by 213 lines (34% reduction) while maintaining full coverage through integration tests. Assisted-By: Claude Sonnet 4.5 <noreply@anthropic.com>
b754b3d to
20b9c5d
Compare
|
/hold |
theakshaypant
left a comment
There was a problem hiding this comment.
Haven’t gone through the changes properly yet, just sharing thoughts based on the description.
IMO, the title might be a bit misleading for the actual problem this PR is trying to solve (per my understanding: rate limiting). While this reduces the number of requests to 1, it’s a GraphQL request, and GraphQL rate limits are calculated differently from REST requests. Ref
Even this single reduced GraphQL request could end up consuming more points than the REST equivalent. Would love for that last statement to be wrong, but I think it would be worth comparing how optimised this single request is from a rate-limit consumption perspective across a few scenarios:
- no
.tektondir - 3–4
PipelineRunsin.tekton - > 100 manifests in the dir
WDYT @aThorp96? Unless I’ve completely misunderstood the description and “API requests” was referring specifically to request count limits when you mentioned:
In some cases I saw, this change would have eliminated over 1000 API calls in an hour for a single repository.
Adding another reference for REST vs GraphQL rate limits here
📝 Description of the Change
This PR updates Github's GetTektonDir method to use a single GraphQL call to fetch the tekton directory provenance and content, reducing the API request volume from 3 requests to 1 request.
Since processing any supported event requires fetching the tekton directory, GetTektonDir is a hot path and API calls can add up without any PipelineRuns executing. Reducing the load in this hot path has an outsized impact. In some cases I saw, this change would have eliminated over 1000 API calls in an hour for a single repository.
Before this change, fetching the Tekton directory made the following API requests:
After this change, fetching the tekton directory makes the following API request(s):
🔗 Linked GitHub Issue
Fixes #
🧪 Testing Strategy
🤖 AI Assistance
AI assistance can be used for various tasks, such as code generation,
documentation, or testing.
Please indicate whether you have used AI assistance
for this PR and provide details if applicable.
Important
Slop will be simply rejected, if you are using AI assistance you need to make sure you
understand the code generated and that it meets the project's standards. you
need at least know how to run the code and deploy it (if needed). See
startpaac to make it easy
to deploy and test your code changes.
If the majority of the code in this PR was generated by an AI, please add a
Co-authored-bytrailer to your commit message.For example:
Co-authored-by: Claude noreply@anthropic.com
✅ Submitter Checklist
fix:,feat:) matches the "Type of Change" I selected above.make testandmake lintlocally to check for and fix anyissues. For an efficient workflow, I have considered installing
pre-commit and running
pre-commit installtoautomate these checks.