Skip to content

Change to use worst Test Step result as the Test Case result#317

Open
brasmusson wants to merge 3 commits intomainfrom
feature/runner-fixes
Open

Change to use worst Test Step result as the Test Case result#317
brasmusson wants to merge 3 commits intomainfrom
feature/runner-fixes

Conversation

@brasmusson
Copy link
Copy Markdown
Contributor

@brasmusson brasmusson commented Mar 19, 2026

Description

Change to use worst Test Step result as the Test Case result

When calculating the Test Case result Cucumber-Ruby-Core should conform with the GetWorstTestCaseResult function used in the Messages module.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds new behaviour)

Please add an entry to the relevant section of CHANGELOG.md as part of this pull request.

Note to other contributors

If your change may impact future contributors, explain it here, and remember to update README.md and CONTRIBUTING.md accordingly.

Checklist:

Your PR is ready for review once the following checklist is
complete. You can also add some checks if you want to.

  • Tests have been added for any changes to behaviour of the code
  • New and existing tests are passing locally and on CI
  • bundle exec rubocop reports no offenses
  • CHANGELOG.md has been updated

When calculating the Test Case result Cucumber-Ruby-Core should
conform with the GetWorstTestCaseResult function used in the
Messages module.
@luke-hill
Copy link
Copy Markdown
Contributor

Haven't got round to fully reviewing but we have this concept in messages
https://github.com/cucumber/messages/blob/main/ruby/lib/cucumber/messages/helpers/test_step_result_comparator.rb

Use test_step_result_rankings from
Cucumber::Messages::Helpers::TestStepResultComparator to compare
test results to find the worst one.
@brasmusson
Copy link
Copy Markdown
Contributor Author

@luke-hill Yes test_step_result_rankings from Messages can be used. It does raises the question about 'Cucumber::Core::Test::Result::TYPES`, its order control the order in the summary. Shouldn't that order conform to the ranking in Messages (which it does after the changes in this PR).

@luke-hill
Copy link
Copy Markdown
Contributor

@luke-hill Yes test_step_result_rankings from Messages can be used. It does raises the question about 'Cucumber::Core::Test::Result::TYPES`, its order control the order in the summary. Shouldn't that order conform to the ranking in Messages (which it does after the changes in this PR).

I think it does, plus the flaky bit. So I don't mind which way we do things tbh.

There's lots of refactors I want to do once cucumber-ruby v11 is released (Just waiting on fixing up last CCK bits).

Copy link
Copy Markdown
Contributor

@luke-hill luke-hill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask that without this change what is currently happening / going wrong.

Also should this be included as a minimum for cucumber v11? I'm soon about to cut it and want to get everything sewn up at the same time ideally

Comment thread lib/cucumber/core/test/result.rb
Comment thread lib/cucumber/core/test/runner.rb
step_result.with_duration(duration)
result = result.with_message(%(Undefined step: "#{test_step.text}")) if result.undefined?
result = result.with_appended_backtrace(test_step) unless test_step.hook?
result.describe_to(monitor, result)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the bit that does the magic stuff? I know these methods are a bit obscure for me

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The important here is that the only difference between NotPassing.execute and Base.execute is whether skip or execute is called on the test_step. Previously this method was written under the assumption that when the runner has entered "skip mode" the only possible results are skipped and undefined. This is not true. Hooks can result in passing or failing results. Ambiguous steps result in ambiguous results. pending results is very unlikely be who knows, someone might write a after hook the raises a Pending exception.

After calling skip on the test_step, NotPassing.execute shall be the same as Base.execute, that is:

  • Add a message with the step text to undefined results
  • Add any backtrace from the test_step to the result unless it is a hook
  • Pass is (let it describe itself to the RunningTestCase so it can update the result for the test case as a whole if applicable. The line result.describe_to(monitor, result) will in the end either run RunningTestCase.passed or RunningTestCase.not_passing (depending what type of result that have occurred so far).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this "repeated" rewriting / over assignment is possible because a lot of the describe and other methods are returning self

@brasmusson
Copy link
Copy Markdown
Contributor Author

Can I ask that without this change what is currently happening / going wrong.

Also should this be included as a minimum for cucumber v11? I'm soon about to cut it and want to get everything sewn up at the same time ideally

Without this if a scenario has only an undefined step and an after hook fails, the summary report will say

1 scenario (1 undefined)
1 step (1 undefined)

instead of the expected:

1 scenario (1 failed)
1 step (1 undefined)

Without this if a scenario has one undefined step (or pending) followed by an ambiguous step, the summary report will say

1 scenario (1 undefined)
2 steps (1 ambiguous, 1 undefined)

instead of the expected:

1 scenario (1 ambiguous)
2 steps (1 ambiguous, 1 undefined)

@luke-hill
Copy link
Copy Markdown
Contributor

Ooooh ok. This might actually help me then with some of the work I need to do for messages and @davidjgoss will hopefully be able to help with this.

We have 3 CCK examples currently failing, two of which are regarding hooks failing.

Also are these only specific to the summary report. Or any report for that matter (I'm thinking of messages predominantly)

Also what would happen if it were Before Hooks that failed. Or BeforeAll hooks

@brasmusson
Copy link
Copy Markdown
Contributor Author

It is the result field in the internal TestCaseFinished event that gets the wrong value. Any formatter or filter that reads and uses that field of the TestCaseFinished event will be wrong. It is easy to point out the summary, because there the result field of TestCaseFinished events are used for sure. I have not looked how may formatters actually use that field. Formatters using messages are not affected.

Copy link
Copy Markdown
Contributor

@luke-hill luke-hill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 and a bit files out of the 5. Got a few questions, but I get what is being done here.


def not_passing(step_result)
@status = Status::NotPassing.new(step_result) if test_step_result_rankings[step_result.to_message.status] > test_step_result_rankings[status.step_result_message.status]
self
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this self call is needed here, as the not_passing invocation returns self afterwards

private

def not_passing(step_result)
@status = Status::NotPassing.new(step_result) if test_step_result_rankings[step_result.to_message.status] > test_step_result_rankings[status.step_result_message.status]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see we have status defined as an iVar and I assume it's defined somewhere in here, but I can't see it.

step_result.with_duration(duration)
result = result.with_message(%(Undefined step: "#{test_step.text}")) if result.undefined?
result = result.with_appended_backtrace(test_step) unless test_step.hook?
result.describe_to(monitor, result)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this "repeated" rewriting / over assignment is possible because a lot of the describe and other methods are returning self

@@ -118,6 +128,10 @@ def execute(test_step, monitor, &)
def result
raise NoMethodError, 'Override me'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see where result (Which is defined in the two child classes - Passing and NotPassing will use the result defined there.

let(:test_steps) { [undefined_step, passing_step] }

it 'emits a test_step_finished event with an undefined result' do
expect(event_bus).to receive(:test_step_finished).with(undefined_step, anything) do |_reported_test_case, result|
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is anything defined.

Or are we simply saying here it's irrelevant what it receives. Wouldn't it be a passing step due to line 255?

end

it 'emits a test_step_finished event with a skipped result' do
expect(event_bus).to receive(:test_step_finished).with(passing_step, anything) do |_reported_test_case, result|
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't all these shadowing wrapper methods be allow just to permit it through. Won't / Shouldn't we be doing diff assertions based on this not appearing?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think this actually should be an ordered set of results. Maybe like here? https://github.com/site-prism/site_prism/blob/v2.15.1/spec/sections_spec.rb#L50-L55

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants