Log errors when incorrect commands are used in batch mode by emmuhamm · Pull Request #858 · DUNE-DAQ/drunc

emmuhamm · 2026-03-23T17:11:47Z

Description

Fixes issue #753

Will give feedback in case there is a bad command when inputting as a drunc batch/semibatch mode.

This change adds in a validation check to ensure that the commands parsed by via the batch mode work.

It does it in a few steps. Note the following example with unknown_cmd

Generates a click parser and use that to parse the argv (['boot', 'unknown_cmd', 'start-run', '--run-number', '1', 'wait', '20'])
Using click to obtain the list of available commands that are registered to it, iterate over the supplied list from the command line.
1. If it identifies a command, mark it as a command and move on to the next one
2. If it doesn't identify a command:
  1. but the previous iteration was a command, mark it as an argument
  2. Else, raise error
3. This splits the arguments into a list of command, argument pairs ([('boot', ['unknown_cmd']), ('start-run', ['--run-number', '1']), ('wait', ['20'])])
Using click's make_context, go through each command and argument pair and see if its possible to form a context
If it fails, raise an error (eg. boot unknown_cmd) is not valid command.

This should be robust as it will go through the definitive list of commands as defined from click. It shoulnd't need maintenance as long as people properly tell click of their new commands.

Type of change

New feature / enhancement
Bug fix

List of required branches from other repositories

None

Suggested manual testing checklist

Use the latest nightly and check out this branch
Dont forget to rebase if necessary
Run drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config test
1. This should drop you in the unified shell
Run drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config test boot wait 5 conf
1. This should run the batch mode just fine
Run drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config test unknown_cmd
1. This should throw an exception (original command unknown)
Run drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config test boot sleep 5 conf
1. This should throw an exception (sleep is unknown, so treated as an argument to boot boot sleep. Making this context will fail.)
Run drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config test boot wait 5 conf logs --name ru-controller --user emmuhamm
1. This should run just fine (testing argument parsing)
Run drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config test boot wait 5 conf logs --namEEe ru-controller --user emmuhamm
1. This should raise an exception, unknown optoin --namEEe
Run kurt's command drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config ${USER}-local-test boot wait 5 conf wait 3 start --run-number 142 enable-triggers wait 10 disable-triggers sleep 3 drain-dataflow stop-trigger-sources stop scrap terminate
1. This should throw an exception (you should be able to tell what it is)
Try fixing the above yourself, and see it pass ;)

Developer checklist

Prior to marking this as "Ready for Review"

Tests ran on: np04-srv-028 from release NFD_DEV_260323_A9

Unit tests - some tests can't be ran on the CI. This is documented. If this PR checks a feature that can't be tested with CI, this has been marked appropriately.

Integration tests - the daqsystemtest_integtest_bundle requires a lot of resources, and connections to the EHN1 infrastructure. Check the cross referenced list if you can't run these. The developer needs to run at least the .

Unit tests (pytest --marker) passed
- With relevant marker
- Without marker
Integration tests passed
- Only daqsystemtest_integtest_bundle.sh -k minimal_system_quick_test.py
- Full daqsystemtest_integtest_bundle.sh
Testing skipped as there are no core code changes in this PR, this only relates to documentation/CI workflows

Final checklist prior to marking this as "Ready for Review"

Code is clearly commented.
New unit tests have been added, or is documented in # ISSUE NUMBER
A suitable reviewer has been chosen from this list.

Reviewer checklist

This branch has been rebased with develop prior to testing.
Suggested manual tests show changes.
CI workflows fails documented (if present)
Integration tests passed
- Only concern yourself if failures related to drunc are in the log files
- If non-drunc failure appears:
  - Validate failure in fresh working area
  - Contact Pawel if unsure

Once the features are validated and both the unit and integration tests pass, the PRs is ready to be merged.

Prior to merging

Choose one of the following an complete all substeps

Changes only affect the Run Control, are in a single repository, and do not affect the end user.
- Changes are documented in docstrings and code comments
- Wiki has been updated if architectural or endpoint changes
Otherwise
- Workflow changes demonstrated in the Change Log (if necessary)
- Wiki has been updated (if necessary)
- #daq-sw-librarians Slack channel notified (see below)

Once completed, the reviewer can merge the PR.

Notification message for #daq-sw-librarians Slack channel

For an single merge that changes the user workflow

The CCM WG has an isolated PR ready to merge that affects user workflows. The PR is:

_URL_

I will leave time for any comments, otherwise will merge these at the end of the work day _Insert your time zone_.

For co-ordinated merge

The CCM WG has a set of co-ordinated merges ready to merge. The PRs are:

_URL_

_URL_


I will leave time for any comments, otherwise will merge these at the end of the day.

emmuhamm · 2026-03-24T11:24:24Z

Hey @PawelPlesniak, this is ready for review :)

PawelPlesniak · 2026-03-27T17:44:24Z

This is generally very nice. One small change I would make is in the comments. Otherwise nice work!

PawelPlesniak · 2026-03-27T17:45:07Z

src/drunc/unified_shell/shell.py

+        except click.UsageError as e:
+            raise DruncBatchShellArgError(e) from e
+
+    parser = ctx.command.make_parser(ctx)


This and the following 5 lines should be checkable simply by checking the UnifiedShellMode - if not interactive, this will suffice.

PawelPlesniak · 2026-03-27T17:45:47Z

This has been validated to work as expected. Running the full integration test suite now.

PawelPlesniak · 2026-03-27T18:48:19Z

on np04-srv-029

======================== 18 passed in 440.64s (0:07:20) ========================


+++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++ SUMMARY ++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++

Fri Mar 27 07:35:46 PM CET 2026
Log file is: /tmp/pytest-of-pplesnia/dunedaq_integtest_bundle_20260327183821.log

⮕ Running daqsystemtest/3ru_1df_multirun_test.py ⬅
=================== 2 failed ❌, 4 passed ✅ in 379.11s (0:06:19) ====================
⮕ Running daqsystemtest/3ru_3df_multirun_test.py ⬅
======================== 6 passed ✅ in 342.23s (0:05:42) =========================
⮕ Running daqsystemtest/example_system_test.py ⬅
======================== 12 passed ✅ in 365.43s (0:06:05) ========================
⮕ Running daqsystemtest/fake_data_producer_test.py ⬅
======================== 6 passed ✅ in 299.55s (0:04:59) =========================
⮕ Running daqsystemtest/long_window_readout_test.py ⬅
============================== 1 skipped 🟡 in 1.71s ==============================
⮕ Running daqsystemtest/minimal_system_quick_test.py ⬅
========================= 4 passed ✅ in 79.46s (0:01:19) =========================
⮕ Running daqsystemtest/readout_type_scan_test.py ⬅
======================== 33 passed ✅ in 883.55s (0:14:43) ========================
⮕ Running daqsystemtest/sample_ehn1_multihost_test.py ⬅
======================== 4 skipped 🟡 in 84.71s (0:01:24) =========================
⮕ Running daqsystemtest/small_footprint_quick_test.py ⬅
========================= 3 passed ✅ in 80.90s (0:01:20) =========================
⮕ Running daqsystemtest/tpg_state_collection_test.py ⬅
======================== 5 passed ✅ in 146.34s (0:02:26) =========================
⮕ Running daqsystemtest/tpreplay_test.py ⬅
======================== 6 passed ✅ in 176.35s (0:02:56) =========================
⮕ Running daqsystemtest/tpstream_writing_test.py ⬅
======================== 4 passed ✅ in 146.25s (0:02:26) =========================
⮕ Running daqsystemtest/trigger_bitwords_test.py ⬅
======================== 18 passed ✅ in 440.64s (0:07:20) ========================

PawelPlesniak · 2026-03-27T18:48:32Z

I will make the suggested changes later today and get this merged, nice work!

PawelPlesniak · 2026-03-29T12:14:31Z

I added a small change to update the logic that selects tthe UnifiedShellMode. I think this command needs a little more work - I ran the suggested manual tests and saw the following when running:

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config test boot sleep 5 conf
[2026/03/29 12:09:10 UTC] INFO       shell.py:180                             drunc.unified_shell                                Setting up to use the process manager with configuration ssh-standalone and configuration id "local-1x1-config" from oksconflibs:config/daqsystemtest/example-configs.data.xml
[2026/03/29 12:09:10 UTC] INFO       shell.py:202                             drunc.unified_shell                                Starting process manager
[2026/03/29 12:09:10 UTC] INFO       process_manager.py:109                   drunc.process_manager                              process_manager communicating through address 10.73.136.71:42475
Extracted batch commands: ['boot', 'sleep', '5', 'conf']
Split batch commands: [('boot', ['sleep', '5']), ('conf', [])]
[2026/03/29 12:09:10 UTC] ERROR      unified_shell.py:15                      drunc.unified_shell                                🔥🔥 Exception thrown 🔥🔥
[2026/03/29 12:09:10 UTC] ERROR      unified_shell.py:16                      drunc.unified_shell                                Caught exception on bad arguments for the batch shell: Got unexpected extra arguments (sleep 5)

This implementation is slightly unclear - when looking at the extra arguments, the way that you have filtered for the arguments (by looking at the chain_args and with the arg not in command_names, this fails to parse when an argument is not in the command list. This, however, is flawed, as it finds sleep as an argument of boot, which is not the case. A better implementation would be to look at the arguments of each command, and validate that the next passed batch mode arguent is neitherr a batch command nor an argument of the current command.

emmuhamm changed the title ~~[TO FIXUP] THIS WORKS~~ Log errors when incorrect commands are used in batch mode Mar 23, 2026

emmuhamm self-assigned this Mar 23, 2026

emmuhamm added the enhancement New feature or request label Mar 23, 2026

emmuhamm added 2 commits March 24, 2026 11:40

Add parser check for batch commands

9bee6e0

Extract validate to own function

c60cf99

emmuhamm force-pushed the emmuhamm/bad-commands-batch-fix branch from 93f48b0 to c60cf99 Compare March 24, 2026 10:40

Minor cleanup on excepts

51d49ac

emmuhamm linked an issue Mar 24, 2026 that may be closed by this pull request

[Feature]: it would be helpful to get more feedback about bad commands in drunc batch mode #753

Open

emmuhamm requested a review from PawelPlesniak March 24, 2026 11:23

emmuhamm marked this pull request as ready for review March 24, 2026 11:24

emmuhamm mentioned this pull request Mar 24, 2026

[Feature]: it would be helpful to get more feedback about bad commands in drunc batch mode #753

Open

PawelPlesniak reviewed Mar 27, 2026

View reviewed changes

updating for use with the unified_shell_mode

078e245

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Log errors when incorrect commands are used in batch mode#858

Log errors when incorrect commands are used in batch mode#858
emmuhamm wants to merge 4 commits intodevelopfrom
emmuhamm/bad-commands-batch-fix

emmuhamm commented Mar 23, 2026 •

edited by PawelPlesniak

Loading

Uh oh!

emmuhamm commented Mar 24, 2026

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak Mar 27, 2026

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak commented Mar 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

emmuhamm commented Mar 23, 2026 • edited by PawelPlesniak Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

List of required branches from other repositories

Suggested manual testing checklist

Developer checklist

Prior to marking this as "Ready for Review"

Final checklist prior to marking this as "Ready for Review"

Reviewer checklist

Prior to merging

Notification message for #daq-sw-librarians Slack channel

For an single merge that changes the user workflow

For co-ordinated merge

Uh oh!

emmuhamm commented Mar 24, 2026

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak commented Mar 27, 2026

Uh oh!

PawelPlesniak commented Mar 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

emmuhamm commented Mar 23, 2026 •

edited by PawelPlesniak

Loading