Harsen/tap 11387 develop#3181

Open
HarsenLin wants to merge 4 commits into develop from harsen/TAP-11387-develop

@HarsenLin
Collaborator

No description provided.

Add a source heartbeat no-incremental-event alarm that starts after heartbeat-enabled sources have no captured incremental event for 60 seconds.

Record incremental monitor and capture timestamps from observable samples, initialize task alarm settings and templates, and recover the alarm when downstream pending events indicate capture is no longer idle. Harden the alarm path against missing user details, DAG nodes, invalid timestamps, and recovered alarm ordering.

Refs: TAP-11387
…evelop

# Conflicts:
#	manager/tm/src/main/resources/init/idaas/version
@augmentcode

augmentcode Bot commented Apr 30, 2026

🤖 Augment PR Summary

Summary: Adds monitoring and alarming for “source heartbeat enabled but no incremental events captured”.

Changes:

  • Introduced new node metrics to track incremental-monitor start time, last captured/enqueued incremental timestamps, and a pending flag.
  • Extended stream-read enqueued handling to compute an EventTypeRecorder and feed it into node sampling logic.
  • Updated DataNodeSampleHandler to initialize/store the new metrics and toggle the pending flag on capture/enqueue.
  • Added a new alarm key TASK_SOURCE_NO_INCREMENTAL_EVENT plus webhook text and mail templates.
  • Enhanced MeasureAOP to raise/recover the new alarm (only for running tasks, and only when the source connection has heartbeat enabled).
  • Replaced Hutool DateUnit usage with TimeUnit-based minute calculations.
  • Added/updated tests covering the new alarm flow and some additional skip/edge-case paths.
  • Added iDaaS init data to upsert the new alarm setting and backfill it into tasks.
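The move from Hutool `DateUnit` to `TimeUnit`-based minute calculations listed above can be illustrated with a minimal sketch. The class and method names here are hypothetical and not taken from the PR; only `java.util.concurrent.TimeUnit` is a real JDK API, and the clamping of negative differences is an assumption.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not from the PR: whole minutes elapsed between two
// epoch-millisecond timestamps, computed with the JDK's TimeUnit.
final class MinuteMath {
    private MinuteMath() {}

    static long elapsedMinutes(long startMillis, long endMillis) {
        // Assumption: clamp to zero so swapped arguments or clock skew
        // never produce a negative duration.
        long diffMillis = Math.max(0L, endMillis - startMillis);
        // toMinutes truncates, so anything under a full minute is dropped.
        return TimeUnit.MILLISECONDS.toMinutes(diffMillis);
    }
}
```

Because `toMinutes` truncates, a difference of 119,999 ms counts as one minute, which matters when a threshold is compared in whole minutes.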

Technical Notes: The new alarm uses a 1-minute threshold and attempts to suppress alerts when downstream is blocked (pending captured events not yet enqueued).
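The threshold and suppression behaviour described in the technical notes could be modelled roughly as follows. This is a sketch under stated assumptions, not the logic in `MeasureAOP`: the class, enum, and parameter names are hypothetical, and the choice of baseline (the later of monitor start and last captured timestamp) is an assumption the summary does not confirm.

```java
// Hypothetical model of the alarm decision; names are illustrative only.
final class SourceIdleAlarmSketch {
    // 1-minute threshold, as stated in the summary.
    static final long THRESHOLD_MS = 60_000L;

    enum Action { NONE, START, RECOVER }

    private SourceIdleAlarmSketch() {}

    static Action evaluate(boolean taskRunning, boolean heartbeatEnabled,
                           Long monitorStartMs, Long lastCapturedMs,
                           boolean pending, boolean alarmActive, long nowMs) {
        // Only running tasks whose source connection has heartbeat enabled are considered.
        if (!taskRunning || !heartbeatEnabled) {
            return Action.NONE;
        }
        // Ignore missing or invalid timestamps instead of alarming on bad data.
        if (monitorStartMs == null || monitorStartMs <= 0 || monitorStartMs > nowMs) {
            return Action.NONE;
        }
        // Assumption: the baseline is the later of monitor start and last capture.
        long baseline = monitorStartMs;
        if (lastCapturedMs != null && lastCapturedMs > baseline && lastCapturedMs <= nowMs) {
            baseline = lastCapturedMs;
        }
        // Pending captured events mean downstream is blocked, so capture is not idle.
        boolean idle = !pending && (nowMs - baseline) >= THRESHOLD_MS;
        if (idle) {
            return alarmActive ? Action.NONE : Action.START;
        }
        return alarmActive ? Action.RECOVER : Action.NONE;
    }
}
```

In this model an active alarm recovers either when a fresh capture moves the baseline forward or when the pending flag shows that captured events are waiting downstream.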

@augmentcode augmentcode Bot left a comment

Review completed. 3 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread: manager/tm-api/src/main/java/com/tapdata/tm/base/aop/MeasureAOP.java (outdated)
Comment thread: manager/tm-api/src/main/java/com/tapdata/tm/base/aop/MeasureAOP.java (outdated)
Guard the stream enqueued metrics consumer before counting Tapdata events.

The enqueued path now matches read-complete handling by skipping countTapdataEvent when the event list is null or empty, avoiding dropped metrics and warning logs from null upstream input.

Refs: TAP-11387
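
The guard described in this commit message might look something like the sketch below. It is not the PR's code: `EnqueuedMetricsGuard` and `countIfPresent` are hypothetical names, and the `counter` argument merely stands in for the real `countTapdataEvent` call, whose signature is not shown in this conversation.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical guard, not from the PR: run the counting step only when
// there is something to count.
final class EnqueuedMetricsGuard {
    private EnqueuedMetricsGuard() {}

    // Returns true if the counter was invoked, false if the batch was skipped.
    static <E> boolean countIfPresent(List<E> events, Consumer<List<E>> counter) {
        // A null or empty batch is skipped, mirroring the read-complete path
        // described in the commit message.
        if (events == null || events.isEmpty()) {
            return false;
        }
        counter.accept(events);
        return true;
    }
}
```

Returning a boolean keeps the skip visible to callers and tests without logging a warning for an empty batch.
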
Report the actual source heartbeat idle duration in TASK_SOURCE_NO_INCREMENTAL_EVENT notifications instead of the static 60 second threshold.

Pass the alarm evaluation time into the start-alarm payload and calculate idleSeconds from now minus the selected baseline. Restore the source idle alarm path so it still runs when node alarm rules are absent.

Refs: TAP-11387
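
As a rough illustration of reporting the actual idle duration, the calculation of `idleSeconds` from the evaluation time and a baseline could be sketched as below. The class name, the parameters, and the rule for selecting the baseline are assumptions; the commit message only says the value is "now minus the selected baseline".

```java
import java.util.concurrent.TimeUnit;

// Hypothetical calculation, not from the PR: idle seconds reported in the
// alarm payload, derived from the evaluation time and a baseline timestamp.
final class IdleDurationSketch {
    private IdleDurationSketch() {}

    static long idleSeconds(long nowMs, long monitorStartMs, Long lastCapturedMs) {
        // Assumption: prefer the last captured timestamp when it is newer
        // than the monitor start; otherwise fall back to the monitor start.
        long baseline = (lastCapturedMs != null && lastCapturedMs > monitorStartMs)
                ? lastCapturedMs
                : monitorStartMs;
        // Clamp at zero so a baseline in the future never yields a negative value.
        long idleMs = Math.max(0L, nowMs - baseline);
        return TimeUnit.MILLISECONDS.toSeconds(idleMs);
    }
}
```

Passing the evaluation time in as `nowMs`, rather than reading the clock inside the method, keeps the reported duration consistent with the moment the threshold check was made.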