OCPBUGS-93863: E2E: Fix tuned active profile check for wrapped performance profiles#1549
OCPBUGS-93863: E2E: Fix tuned active profile check for wrapped performance profiles#1549mrniranjan wants to merge 2 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThe performanceprofile e2e test now reads the active Tuned profile from node status, parses Tuned ChangesTuned Profile Verification Updates
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error)
✅ Passed checks (14 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/performanceprofile/functests/1_performance/performance.go`:
- Around line 1430-1439: The loop that returns the first profile containing
include=<performanceProfileName> does not account for Tuned recommendation
semantics and can return an incorrect profile when multiple Tuned CRs include
the performance profile. Instead of returning on the first include= match found
via strings.Contains, resolve the wrapper profile by consulting Tuned's
recommendation selection logic (priority and match criteria) to identify the
active/effective recommended profile. Alternatively, detect when multiple
candidate profiles exist that contain the include match and fail fast with a
clear error rather than returning an ambiguous result.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ee8fdbda-b6f7-416d-a49b-3d1cfb261e56
📒 Files selected for processing (1)
test/e2e/performanceprofile/functests/1_performance/performance.go
|
/retest-required |
1 similar comment
|
/retest-required |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
test/e2e/performanceprofile/functests/1_performance/performance.go (1)
1488-1504: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winInclude-chain walk only follows the first branch; misses targets reachable via other includes.
The direct-include check at Line 1480-1486 inspects all
includedProfiles, but the next-hop selection at Line 1488-1495 picks only the first include that exists inincludesMapand commits to that single path. If a profile includes multiple parents (e.g.wrapperincludes[other, mid]and onlymidtransitively includes the performance profile), the walk descends intoother, hits a dead end, and returnsfalse— a false negative that fails the test even though the active profile does trace back to the performance profile. This should explore all branches (BFS/DFS) rather than a single linear chain.♻️ Proposed traversal over all branches
- // Walk the include chain, starting from startProfile - current := startProfile - visited := make(map[string]bool) // Prevent infinite loops - maxHops := len(includesMap) - chain := []string{startProfile} // Track the chain for logging - - for hop := 0; hop < maxHops; hop++ { - // Found it? - if current == targetProfile { - testlog.Infof("Profile chain validated: %s", strings.Join(chain, " → ")) - return true - } - - // Already visited? (cycle detection) - if visited[current] { - testlog.Warningf("Cycle detected in profile chain at %q", current) - return false - } - visited[current] = true - - // What does the current profile include? - includedProfiles, exists := includesMap[current] - if !exists { - testlog.Warningf("Profile %q not found in cluster", current) - return false // Profile not found - } - - // Check if it directly includes the target - for _, included := range includedProfiles { - if included == targetProfile { - chain = append(chain, targetProfile) - testlog.Infof("Profile chain validated: %s", strings.Join(chain, " → ")) - return true - } - } - - // Find the next profile in the chain (first include that exists in our map) - nextProfile := "" - for _, included := range includedProfiles { - if _, exists := includesMap[included]; exists { - nextProfile = included - break - } - } - - if nextProfile == "" { - testlog.Warningf("Dead end in profile chain at %q (no valid includes found)", current) - return false // Dead end - no more includes - } - - // Move to the next profile in the chain - chain = append(chain, nextProfile) - current = nextProfile - } - - // Shouldn't reach here unless chain is unreasonably deep - testlog.Warningf("Profile chain exceeded maximum depth of %d hops", maxHops) - return false + // DFS over the whole include graph (all branches), with cycle detection. + visited := make(map[string]bool) + var reachable func(current string) bool + reachable = func(current string) bool { + if current == targetProfile { + return true + } + if visited[current] { + return false + } + visited[current] = true + for _, included := range includesMap[current] { + if reachable(included) { + return true + } + } + return false + } + return reachable(startProfile)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/performanceprofile/functests/1_performance/performance.go` around lines 1488 - 1504, The include-chain traversal in the performance profile check is too greedy: the next-hop logic in the chain walk only follows the first matching entry from includedProfiles, which can miss valid ancestors reachable through other includes. Update the walk in the profile-chain helper to explore all branches from each profile (for example with DFS or BFS and a visited set) instead of committing to a single nextProfile, so profiles like wrapper can still resolve through mid even if another include is a dead end.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@test/e2e/performanceprofile/functests/1_performance/performance.go`:
- Around line 1488-1504: The include-chain traversal in the performance profile
check is too greedy: the next-hop logic in the chain walk only follows the first
matching entry from includedProfiles, which can miss valid ancestors reachable
through other includes. Update the walk in the profile-chain helper to explore
all branches from each profile (for example with DFS or BFS and a visited set)
instead of committing to a single nextProfile, so profiles like wrapper can
still resolve through mid even if another include is a dead end.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b3645597-1666-42ee-85ed-6805f360cd7f
📒 Files selected for processing (1)
test/e2e/performanceprofile/functests/1_performance/performance.go
1d3bed3 to
b578ae8
Compare
Test 37127 read /etc/tuned/active_profile on the node and expected the performance profile name. That fails when a higher-priority Tuned CR wraps the PP via include=<pp-profile> (e.g. ran-du-performance), because the active profile name is the wrapper, not the PP name. Detect wrapper profiles by scanning Tuned CRs for include=<pp-profile> and assert Profile.status.tunedProfile via the API instead of pod exec. Traverse the tuned profile chain In this commit we are traversing the Tuned profile that contains includes= and verifying if we can traverse back to Performance Profile AI Attribution: AIA Human-AI blend, New content, Human-initiated, Reviewed, Cursor 3.0.16 v1.0 Signed-off-by: Niranjan M.R <mniranja@redhat.com>
b578ae8 to
0a32b7d
Compare
|
/retest-required |
|
/test e2e-aws-ovn-techpreview |
1 similar comment
|
/test e2e-aws-ovn-techpreview |
| func extractIncludedProfiles(profileData string) []string { | ||
| cfg, err := ini.Load([]byte(profileData)) | ||
| if err != nil || cfg == nil { | ||
| // Invalid INI data - return empty list |
There was a problem hiding this comment.
Add log here to make it more clear at what part in the flow the code failed
There was a problem hiding this comment.
fixed in latest commit
|
|
||
| // Check if the [main] section has an "include" key | ||
| if !cfg.Section("main").HasKey("include") { | ||
| return []string{} |
There was a problem hiding this comment.
fixed in latest commit
| // Get the include value | ||
| includeValue := cfg.Section("main").Key("include").String() | ||
| if includeValue == "" { | ||
| return []string{} |
There was a problem hiding this comment.
fixed in latest commit
| // Check if this node's active profile traces back to the performance profile | ||
| tracesToPerformanceProfile := profileIncludesTarget(activeProfile, performanceProfileName, tunedList) | ||
|
|
||
| g.Expect(tracesToPerformanceProfile).To(BeTrue(), |
There was a problem hiding this comment.
true or false gives us minimum information about the progress of profileIncludesTarget function.
IOW, in case that it return false we have no idea if the node's active profile is not tracing back to a performance profile or something failed down the stack during the execution.
I would say it's is better to also return an error that elaborate in case of failure.
There was a problem hiding this comment.
fixed in latest commit
|
|
||
| // Get all Tuned profiles once (for lookup) | ||
| tunedList := &tunedv1.TunedList{} | ||
| Expect(testclient.ControlPlaneClient.List(ctx, tunedList)).To(Succeed(), |
There was a problem hiding this comment.
tuned objects are on the DataPlaneClient on Hypershift.
There was a problem hiding this comment.
fixed in latest commit
- Fix Hypershift compatibility: Use DataPlaneClient for listing Tuned CRs On Hypershift clusters, tuned objects are on the data plane, not control plane. - Add error handling: Change function signatures to return (bool, error) Return errors for unexpected conditions (cycles, missing profiles, parsing failures) while returning (false, nil) for valid "not found" cases. Error cases: * Cycle detected in profile chain * Profile referenced but not found in cluster * Chain exceeded maximum depth * INI parsing failed Valid "not found" case: * Profile chain ends without reaching target (false, nil) - Add logging for better observability: * Log when INI parsing fails with the error details * Log when profiles have no 'include' key * Log when 'include' value is empty * Changed dead-end from Warning to Info (it's a normal case)
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mrniranjan, Tal-or The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@mrniranjan: This pull request references Jira Issue OCPBUGS-93863, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/verified by @mrniranjan |
|
@mrniranjan: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@mrniranjan: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Test 37127 read /etc/tuned/active_profile on the node and expected the performance profile name. That fails when a higher-priority Tuned CR wraps the PP via include= (e.g. ran-du-performance), because the active profile name is the wrapper, not the PP name.
Detect wrapper profiles by scanning Tuned CRs for include= and assert Profile.status.tunedProfile via the API instead of pod exec.
Summary by CodeRabbit
includechains to confirm the active profile reaches the expected performance profile, including cycle detection and bounded traversal.includetargets by parsing profile content as INI and ignoring non-effective entries.