filebeat: add autodiscover takeover reingest integration test#49606
filebeat: add autodiscover takeover reingest integration test#49606github-actions[bot] wants to merge 3 commits intomainfrom
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
📝 WalkthroughWalkthroughThis change adds an integration test and corresponding configuration to verify that Filebeat's filestream input with 🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@filebeat/tests/integration/autodiscover_test.go`:
- Around line 157-188: The test uses a fixed cap of 8 lines in the
require.LessOrEqual assertion comparing newEvents (computed from initialEvents
and eventsAfterTakeOver) which makes the test flaky on slow CI; instead compute
a dynamic cap based on the wall-clock elapsed time between when initialEvents is
recorded and when eventsAfterTakeOver is recorded (use the same scope where
initialEvents, filebeat.Start()/WaitLogsContains and the Sleep happen), then
replace the static "8" in the require.LessOrEqual with a derived maxAllowed
value (e.g., maxAllowed = int(elapsed.Seconds())*estimatedLinesPerSecond +
grace) using conservative estimatedLinesPerSecond and a small grace margin to
tolerate legitimate lines produced while Filebeat was stopped/restarting; update
the assertion around newEvents (the require.LessOrEqual call) to use this
computed maxAllowed so the test is stable across slower hosts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: aa432a05-f384-49df-b908-e0d77ed02b3e
📒 Files selected for processing (2)
filebeat/tests/integration/autodiscover_test.gofilebeat/tests/integration/testdata/autodiscover/docker-take-over.yml
| filebeat.Stop() | ||
| initialEvents := filebeat.CountFileLines(outputFile) | ||
|
|
||
| cfgYAML := getConfig( | ||
| t, | ||
| map[string]any{ | ||
| "containerID": containerID, | ||
| "homePath": workDir, | ||
| }, | ||
| "autodiscover", | ||
| "docker-take-over.yml", | ||
| ) | ||
| filebeat.WriteConfigFile(cfgYAML) | ||
| filebeat.Start() | ||
|
|
||
| filebeat.WaitLogsContains( | ||
| fmt.Sprintf( | ||
| `"message":"Input 'filestream' starting","service.name":"filebeat","id":"%s-logs"`, | ||
| containerID, | ||
| ), | ||
| 30*time.Second, | ||
| "Filestream did not start for the test container using autodiscover") | ||
|
|
||
| time.Sleep(5 * time.Second) | ||
|
|
||
| eventsAfterTakeOver := filebeat.CountFileLines(outputFile) | ||
| newEvents := eventsAfterTakeOver - initialEvents | ||
| require.LessOrEqual( | ||
| t, | ||
| newEvents, | ||
| 8, | ||
| "autodiscover filestream takeover re-ingested old events", |
There was a problem hiding this comment.
Flaky threshold: fixed 8-line cap can fail without re-ingestion
Line 180 + Lines 184-188 use a fixed wait and static cap, but newEvents also includes legitimate lines produced while Filebeat is stopped/restarting. On slower CI hosts this can exceed 8 even when takeover is correct, making the regression test nondeterministic.
Proposed fix (derive cap from elapsed wall time)
func TestAutodiscoverFilestreamTakeOverDoesNotReingest(t *testing.T) {
@@
- filebeat.Stop()
+ filebeat.Stop()
+ stoppedAt := time.Now()
initialEvents := filebeat.CountFileLines(outputFile)
@@
- time.Sleep(5 * time.Second)
+ time.Sleep(5 * time.Second)
eventsAfterTakeOver := filebeat.CountFileLines(outputFile)
newEvents := eventsAfterTakeOver - initialEvents
+ // flog emits ~1 line/sec. Account for downtime + observation window + small slack.
+ maxExpectedNewEvents := int(time.Since(stoppedAt).Seconds()) + 3
require.LessOrEqual(
t,
newEvents,
- 8,
+ maxExpectedNewEvents,
"autodiscover filestream takeover re-ingested old events",
)
}As per coding guidelines, this is a merge-blocking reliability issue because it can cause false test failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@filebeat/tests/integration/autodiscover_test.go` around lines 157 - 188, The
test uses a fixed cap of 8 lines in the require.LessOrEqual assertion comparing
newEvents (computed from initialEvents and eventsAfterTakeOver) which makes the
test flaky on slow CI; instead compute a dynamic cap based on the wall-clock
elapsed time between when initialEvents is recorded and when eventsAfterTakeOver
is recorded (use the same scope where initialEvents,
filebeat.Start()/WaitLogsContains and the Sleep happen), then replace the static
"8" in the require.LessOrEqual with a derived maxAllowed value (e.g., maxAllowed
= int(elapsed.Seconds())*estimatedLinesPerSecond + grace) using conservative
estimatedLinesPerSecond and a small grace margin to tolerate legitimate lines
produced while Filebeat was stopped/restarting; update the assertion around
newEvents (the require.LessOrEqual call) to use this computed maxAllowed so the
test is stable across slower hosts.
GenAI-Assisted: Yes Human-Reviewed: Yes Tool: Cursor-CLI, Model: GPT-5.3 Codex High
- Use files instead of adding the config in the middle of the test - Remove time.Sleep - Fix duplication count
TL;DR
Remediation
Investigation detailsRoot CauseThe run failed consistently across all runners because the same three linter violations were introduced in the new integration test file. This is a source-level lint issue, not an environment-specific failure. Evidence
Validation
Follow-up
What is this? | From workflow: PR Actions Detective Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not. |
Summary
Adds a new integration regression test for autodiscover + filestream takeover to reproduce issue #49579.
Changes
TestAutodiscoverFilestreamTakeOverDoesNotReingestinfilebeat/tests/integration/autodiscover_test.go.filebeat/tests/integration/testdata/autodiscover/docker-take-over.yml.Test behavior
flogcontainer.id: container-logs) to create initial state/events.take_over.from_ids: [container-logs].Validation
go test ./filebeat/tests/integration -run TestAutodiscoverFilestreamTakeOverDoesNotReingest -tags integration -count=1Requested context: issue #49579.
What is this? | From workflow: Mention in Issue
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.