net: STD for dual-stream RHCOS 9/10 node migration network tests#4569
net: STD for dual-stream RHCOS 9/10 node migration network tests#4569azhivovk wants to merge 1 commit intoRedHatQE:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughFour new pytest modules are added to define connectivity regression test skeletons for live migration scenarios between RHCOS 9 and RHCOS 10 worker nodes across four network types (L2 bridge, localnet, primary network, and User Defined Network). Each module contains two Polarion-marked test functions with descriptive docstrings but no executable test logic. ChangesNetwork Connectivity Regression Test Skeletons
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Reasoning: Four nearly identical modules with high pattern repetition. Review involves verifying Polarion marker IDs are distinct and correct, confirming docstrings accurately describe each network type's migration scenario, checking 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Container Operations
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
AI Features
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tests/network/l2_bridge/mixed_nodes/test_primary_network_connectivity_mixed_nodes.py`:
- Around line 19-45: The Polarion marker value is duplicated (CNV-0) in the test
function
test_connectivity_over_primary_network_preserved_during_source_migration (and
across other mixed_nodes test modules), causing ambiguous test attribution;
update the pytest.mark.polarion decorator on this function to a unique
placeholder ID (e.g., CNV-0A) and ensure the other mixed_nodes modules
(functions with the same decorator such as in
test_l2_bridge_connectivity_mixed_nodes.py,
test_localnet_connectivity_mixed_nodes.py, test_udn_connectivity_mixed_nodes.py)
each get distinct placeholders (CNV-0B/C/D) or their final real Polarion IDs so
each test function has a unique polarion marker.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 695c7508-c3b6-4e04-aa6d-4fd9fd030b7e
📒 Files selected for processing (7)
tests/network/l2_bridge/mixed_nodes/__init__.pytests/network/l2_bridge/mixed_nodes/test_l2_bridge_connectivity_mixed_nodes.pytests/network/l2_bridge/mixed_nodes/test_primary_network_connectivity_mixed_nodes.pytests/network/localnet/mixed_nodes/__init__.pytests/network/localnet/mixed_nodes/test_localnet_connectivity_mixed_nodes.pytests/network/user_defined_network/mixed_nodes/__init__.pytests/network/user_defined_network/mixed_nodes/test_udn_connectivity_mixed_nodes.py
|
D/S test |
|
Change: remove client migration scenarios |
|
|
Change: rename tests without mentioning node OS |
|
Change: apply bot's suggestion - primary interface test |
|
Change: rename modules |
nirdothan
left a comment
There was a problem hiding this comment.
IMHO it would have been beneficial to indicate that migration between OSs would be accomplished using node selectors targeting the os label, as is not exactly a standard procedure.
I guess it can be inferred from the preconditions "... VM connected to a Linux bridge network, running on an RHCOS 9 worker node via node selector".
Thus if anyone asks why the implementation did not use an anti-affinity rule you can reference the decision to here.
|
/verified |
Test execution plan request detailsCRITICAL: You MUST post an inline review comment on the first changed line of the first file. As an expert software testing engineer, analyze all modified files in this PR and create a targeted test execution plan. Analysis Requirements:
Your deliverable: Test Execution Plan
Guidelines:
Hardware-Related Checks (SR-IOV, GPU, DPDK): When PR modifies fixtures for hardware-specific resources:
CRITICAL WORKFLOW COMPLETION RULES: When responding to this test execution plan request, you MUST follow these rules EXACTLY:
ABSOLUTE PROHIBITIONS (violating these creates empty/meaningless reviews):
Remember: The pull request review is visible to users. Additional comments are redundant noise. CRITICAL: Before setting "Run smoke tests: True", you MUST use shell scripts to: |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 1785 🏁 Script executed: Length of output: 517 🏁 Script executed: Length of output: 138 🏁 Script executed: Length of output: 110 🏁 Script executed: Length of output: 1735 🧠 Learnings used |
EdDev
left a comment
There was a problem hiding this comment.
I commented only on the first test, but it relates to all the others.
|
|
||
|
|
||
| @pytest.mark.polarion("CNV-15949") | ||
| def test_connectivity_preserved_during_server_migration(): |
There was a problem hiding this comment.
I do not understand if it is the primary or secondary checked here.
Please add details about the binding used.
There was a problem hiding this comment.
I was asked not to mention the setup twice if we can understand it from the package name.
Binding is implied by the test package - l2_bridge, and in primary network tests - primary_network.
If I add Linux bridge/primary, the name becomes too long #4569 (comment)
I edited it but I get conflicting reviews regarding the name.
There was a problem hiding this comment.
it lives under l2_bridge, how it can be primary?
There was a problem hiding this comment.
I guess there is a problem with the definitions, and the current package name does not help.
I suggest not to assume specific combinations, as these are changing all the time.
We have several classifications:
- Primary & Secondary.
- Binding: The wiring between the pod and the VM. Masquerade, Bridge, Passt, l2bridge plugin, etc.
- CNI: The wiring between the node and the pod: Linux Bridge, OVN-K Localnet, OVN-K layer 2 overlay, SR-IOV, etc.
We have only specific supported permutations of the above.
However, we also have exceptions, e.g. hypershift is using primary with bridge binding (and not masquerade).
So tell me, what l2_bridge refers to?
I edited it but I get conflicting reviews regarding the name.
You need to mitigate it and assist reach agreement between reviewers.
I am fine if it is clarified in text at the module head what this is focusing on and removing it from the individual tests. But the real solution is to rename and re-organize the test hirarchy.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@tests/network/user_defined_network/rhel9_rhel10_cluster/test_connectivity.py`:
- Around line 17-51: Rename the two test functions that collide with
primary-network tests to include UDN context: change
test_primary_connectivity_preserved_after_server_migration_to_rhcos10 and
test_primary_connectivity_preserved_after_server_migration_to_rhcos9 to unique
names such as
test_udn_primary_connectivity_preserved_after_server_migration_to_rhcos10 and
test_udn_primary_connectivity_preserved_after_server_migration_to_rhcos9 (or
similar including "udn" or "user_defined_network") in this file so pytest -k
filtering and reports no longer collide with the primary-network variants;
update any local references or markers (e.g., the `@pytest.mark.polarion`
decorators) accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e65123af-e2bb-41b9-a9e7-3df92d1abe20
📒 Files selected for processing (9)
tests/network/l2_bridge/rhel9_rhel10_cluster/__init__.pytests/network/l2_bridge/rhel9_rhel10_cluster/test_connectivity.pytests/network/localnet/rhel9_rhel10_cluster/__init__.pytests/network/localnet/rhel9_rhel10_cluster/test_connectivity.pytests/network/primary_network/__init__.pytests/network/primary_network/rhel9_rhel10_cluster/__init__.pytests/network/primary_network/rhel9_rhel10_cluster/test_connectivity.pytests/network/user_defined_network/rhel9_rhel10_cluster/__init__.pytests/network/user_defined_network/rhel9_rhel10_cluster/test_connectivity.py
EdDev
left a comment
There was a problem hiding this comment.
TCP connectivity over * is preserved
"Connectivity" may mean that it is just possible to re-establish a new connection.
Please make sure you are differentiate between that and the seamless TCP connection. The later means the TCP connection is not disconnected.
Makes sense to check that for the secondary interfaces and the UDN ones use seamless while the primary one that uses masquerade is just re-connection.
It is not clear from the tests this point, please clarify it.
Regarding discussion about the test naming with its details, please mitigate to an agreement. I just want it to be clear what is used, even if it is just in the text.
|
@azhivovk , what about the "negative" tests? You dropped them intentionally? |
|
| https://github.com/RedHatQE/openshift-virtualization-tests-design-docs/blob/main/stps/sig-virt/dual-stream-cluster-rhcos9-rhcos10/network.md | ||
|
|
||
| Markers: | ||
| - rhcos9_rhcos10_nodes |
There was a problem hiding this comment.
keep in mind that this is very specific and should be more generic for the next time we need to test a dual strem cluster
how about
mixed_os_nodes / dual_rhcos_nodes / mixed_os_nodes ?
There was a problem hiding this comment.
What next time? This has been a temp thing that is expected to be removed in two versions.
I am not familiar with any requirements to support different OS nodes in the future.
There was a problem hiding this comment.
What next time? This has been a temp thing that is expected to be removed in two versions. I am not familiar with any requirements to support different OS nodes in the future.
it will come in the next os bump (like we had with 8 and 9)
There was a problem hiding this comment.
I expect it to happen once in a decade, but if you want to be on the safe side, fine.
mixed_os_nodes is fine by me.
There was a problem hiding this comment.
Renamed to mixed_os_nodes
EdDev
left a comment
There was a problem hiding this comment.
Looks good.
There are some small fixes to the test naming, see inline.
There is one point that I am unsure about, I would like @servolkov to share his thoughts on it as well:
There are now 4 pair of tests, 8 in total, each raising 2 VMs.
I think we can easily make the pairs share the same VMs and save a few running time.
WDYT?
STD for network connectivity tests on clusters with mixed RHCOS 9 and RHCOS 10 worker nodes. Tests cover four network types (Linux bridge, localnet, primary UDN, primary network) verifying connectivity is preserved during live migration of the server VM. Server VM migration is considered the more impactful scenario as it tests connectivity preservation from the endpoint holding the listening socket. Both migration directions are covered within other existing tests. Each network type has its own module with shared setup (NNCP, NAD) in the module-level preconditions. Signed-off-by: Asia Khromov <azhivovk@redhat.com> Assisted-by: Claude Sonnet 4.6
|
EdDev
left a comment
There was a problem hiding this comment.
Thank you!
You can resolve all my comment threads.
/approve
| __test__ = False | ||
|
|
||
|
|
||
| @pytest.mark.incremental |
There was a problem hiding this comment.
any reason to use incremental here?
the 2 flows are unrelated, the only reason iiuc that incremental is used here is to use the VM from the 1st test in the 2nd test and this is a misuse of incremental.
it would have made sense if the 2nd scenario was about double migration 9- > 10 -> 9.
if you have a bug in scneario 1, you are not testing scenario 2 and will miss bugs in that scenario.
what about a scenario where both VMs run on 10 and one is migrated to 9?
There was a problem hiding this comment.
You are right about the misuse of incremental - since the scenarios are not related but they share class scoped VM which migrates.
The main purpose of using incremental is to use the module scoped VM which has migrated to RHCOS10 node and we won't need to create a new one in the second test. This could save us a lot of time for each test (~5 minutes to create each VM).
Or we can use parametrize so each test is creating a VM on the under test node and migrates to the other. But again this will increment test time.
There was a problem hiding this comment.
It was agreed with @servolkov and me and in order to avoid starting new VMs, the two scenarios are left dependent to utilize the same VMs.
As I see it, once one direction of the migration fails, the other direction (if it comes after) is less important. We need both direction to succeed in order to declare the user-story a pass.
There was a problem hiding this comment.
tying the 2, unrelated tests just for the sake of creating another VM is wrong in my opinion.
i will not approve this but it is under network so please get an lgtm from sergei on this and once ready to merge, i will merge
STD for network connectivity tests on clusters with mixed RHCOS 9 and RHCOS 10 worker nodes.
Tests cover four network types (Linux bridge, localnet, primary UDN, primary network) verifying connectivity is preserved during live migration of the server VM.
Server VM migration is considered the more impactful scenario as it tests connectivity preservation from the endpoint holding the listening socket.
Both migration directions are covered within other existing tests.
Each network type has its own module with shared setup (NNCP, NAD) in the module-level preconditions.
jira-ticket: https://redhat.atlassian.net/browse/CNV-81981
Summary by CodeRabbit