feat(sight): agent activity monitor via schedmon BPF #662
Closed
jfeng18 wants to merge 10 commits into
jfeng18 added a commit to jfeng18/anolisa that referenced this pull request on Jun 1, 2026:
Distinguishes default zero-intrusion monitoring from opt-in subsystems (e.g. the idle-burst-idle scheduler enabled by --enable-scheduler). Sweeps all three product-summary surfaces under src/agentsight: README.md, README_CN.md (full-width punctuation, plain-language phrasing), and the RPM %description in agentsight.spec.in.

NOTE: depends on commit 8a7a036 in this same PR, which introduces the --enable-scheduler flag. Do not split this README change out of alibaba#662 or land it ahead of 8a7a036; --enable-scheduler does not exist on main yet.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Force-pushed from 35342ed to bed254c.
jfeng18 added a commit to jfeng18/anolisa that referenced this pull request on Jun 4, 2026:
EVENT_SOURCE_SCHED=7 (from alibaba#662 schedmon) would collide with EVENT_SOURCE_LSM=7 on merge. Renumber LSM to 8. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 3a33e35 to 907ad01.
Force-pushed from 907ad01 to 5d5be6b.
Force-pushed from 5d5be6b to 7ab3d31.
Introduce a userspace lineage tree that tracks Agent process families (Agent -> SubAgent -> Tool / Skill). Nodes carry pid/ppid, process type, AGENT_MODE flag, comm and an optional agent name, and maintain parent->child links on insert/remove.

classify() assigns a type from the process's ancestry and environment: a child of an Agent/SubAgent becomes SubAgent (if it matches an agent pattern) or Tool; a parentless process with AGENT_MODE=1 becomes an Agent root; everything else stays Unknown. subtree()/roots() expose the forest for queries.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
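The classification rule in this commit message can be sketched as below. This is a minimal illustration, not the PR's code: the `ProcType` enum, the `classify` signature and its two boolean inputs are assumptions made for the example, and the Skill type is left out because the commit message does not say how it is assigned.

```rust
/// Hypothetical process types; names are assumed for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcType {
    Agent,
    SubAgent,
    Tool,
    Unknown,
}

/// Classify a process from its parent's type (None = parent not in the tree),
/// its AGENT_MODE flag, and whether it matches an agent pattern.
pub fn classify(
    parent: Option<ProcType>,
    agent_mode: bool,
    matches_agent_pattern: bool,
) -> ProcType {
    match parent {
        // Child of an Agent/SubAgent: SubAgent if it looks like an agent, else Tool.
        Some(ProcType::Agent) | Some(ProcType::SubAgent) => {
            if matches_agent_pattern {
                ProcType::SubAgent
            } else {
                ProcType::Tool
            }
        }
        // Parentless process with AGENT_MODE=1: a new Agent root.
        None if agent_mode => ProcType::Agent,
        // Everything else stays Unknown.
        _ => ProcType::Unknown,
    }
}
```

Keeping the rule a pure function of three inputs makes it easy to unit-test without touching /proc.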
Wire the lineage tree into the event loop. proctrace exec/exit events maintain the tree (insert+classify on exec, remove on exit), inferring AGENT_MODE and agent-pattern matches from the pid->agent-name cache to avoid redundant /proc reads. Add scanner helpers read_ppid() and has_agent_mode() that read /proc/<pid>/stat and /proc/<pid>/environ, used by the procmon path to auto-detect AGENT_MODE=1 roots. ensure_lineage_node() closes a race: proctrace does not emit an exec event for an AGENT_MODE root (it was not yet in traced_processes when it execed), so the procmon detection path inserts and classifies the node directly, making detection order-independent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
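The two scanner helpers read /proc/<pid>/stat and /proc/<pid>/environ. A sketch of the parsing step, assuming the helpers work roughly like this: the function names `parse_ppid` and `environ_has_agent_mode` are hypothetical and operate on already-read file contents, so the actual read_ppid()/has_agent_mode() may differ.

```rust
/// Extract the ppid from the contents of /proc/<pid>/stat.
/// The comm field is wrapped in parentheses and may itself contain spaces
/// or ')', so parsing starts after the last ')'.
pub fn parse_ppid(stat: &str) -> Option<u32> {
    let after_comm = &stat[stat.rfind(')')? + 1..];
    // Fields after comm are: state, ppid, ...
    after_comm.split_whitespace().nth(1)?.parse().ok()
}

/// Check the NUL-separated contents of /proc/<pid>/environ for AGENT_MODE=1.
pub fn environ_has_agent_mode(environ: &[u8]) -> bool {
    environ
        .split(|&b| b == 0)
        .any(|kv| kv == &b"AGENT_MODE=1"[..])
}
```

Searching from the last `)` avoids misparsing a process whose comm contains parentheses.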
Walk a process's ancestry to the nearest Agent root (bounded to 64 hops to guard against cycles), returning None when no Agent ancestor exists. This is the prerequisite the scheduler uses to group a process into its Agent family. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
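A bounded ancestry walk of this kind might look like the following. The `Node`, `Kind` and `find_root` definitions are assumptions for illustration (the real tree stores more per node), and the sketch treats a process that is itself an Agent as its own root, which the commit message does not specify.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Agent,
    Other,
}

/// Hypothetical minimal node: parent pid plus a type tag.
pub struct Node {
    pub ppid: u32,
    pub kind: Kind,
}

/// Hop limit from the commit message; guards against ppid cycles.
const MAX_HOPS: usize = 64;

/// Return the pid of the nearest Agent ancestor, or None if there is none.
pub fn find_root(nodes: &HashMap<u32, Node>, pid: u32) -> Option<u32> {
    let mut cur = pid;
    for _ in 0..MAX_HOPS {
        // Leaving the tracked tree means no Agent ancestor exists.
        let node = nodes.get(&cur)?;
        if node.kind == Kind::Agent {
            return Some(cur);
        }
        cur = node.ppid;
    }
    None // hop budget exhausted, most likely a cycle
}
```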
Detect idle/active transitions of traced processes via the BTF-typed sched_switch / sched_wakeup tracepoints. tp_btf gives direct typed access to the task_struct, which is what makes correct detection possible:
- sched_switch reads the raw prev->__state and the `preempt` flag, so a task that is only preempted while still runnable is not misread as going to sleep (the format-struct prev_state field is the encoded TASK_REPORT value, which is essentially never 0 and would flag every context switch as a sleep).
- both tracepoints resolve tgid (to filter traced Agent families) and tid (the actual thread) from the task_struct, and emit per-tid, so a multithreaded process can be ACTIVE while any one of its threads runs.

Per-tid state-dedup (no time cooldown) avoids re-emitting the same state while still delivering every genuine transition; the LRU map self-evicts since schedmon has no thread-exit hook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
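The probe itself is BPF C, but its two decisions (sleep detection and per-tid dedup) can be modeled in userspace Rust. This is a sketch under assumptions: `went_to_sleep`, `State` and `Dedup` are hypothetical names, an unbounded HashMap stands in for the BPF LRU map, and 0 is taken as the raw TASK_RUNNING state.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Active,
    Idle,
}

/// A switched-out task is going to sleep only if its raw state is not
/// runnable (0) and the switch was not a preemption.
pub fn went_to_sleep(raw_state: u32, preempt: bool) -> bool {
    !preempt && raw_state != 0
}

/// Per-tid state dedup: report a state only when it differs from the last
/// one seen for that tid. No time-based cooldown is involved.
#[derive(Debug, Default)]
pub struct Dedup {
    last: HashMap<u32, State>,
}

impl Dedup {
    /// Returns true if this observation is a genuine transition to emit.
    pub fn observe(&mut self, tid: u32, state: State) -> bool {
        // insert() returns the previous value; emit only if it changed.
        self.last.insert(tid, state) != Some(state)
    }
}
```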
Add the SchedMon probe (reuses the shared traced_processes and ring-buffer maps, attaches the two BTF tracepoints) and the Event::Sched variant carrying (tgid, tid, event_type). The unified parser treats Sched events as a no-op — they are consumed by the scheduler, not parsed into messages. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Group an Agent family (keyed by the Agent root PID) into a cgroup v2 cpu cgroup and drive cpu.idle from the family's aggregate scheduling state: ACTIVE (cpu.idle=0, cpu.weight=active_weight) while any thread is runnable, IDLE (cpu.idle=1, SCHED_IDLE) once every thread has been blocked longer than idle_threshold_ms. A per-tid runnable set makes "any thread runnable" correct for multithreaded processes.

Details that the kernel forces:
- cpu.idle is the idle mechanism; we never write cpu.weight while idle (the kernel rejects it and ignores the value), and clear cpu.idle before restoring cpu.weight on the ACTIVE transition.
- cpu controller is enabled top-down from the v2 root to cgroup_root so the agent-* leaves actually expose cpu.idle/cpu.weight.

Robust teardown so cgroups never leak or strand processes at idle weight:
- reap_exited_families() removes a family once its cgroup.procs is empty (proctrace only emits exit for its own child_pids, so an AGENT_MODE root would otherwise never be torn down);
- remove_cgroup() evacuates any remaining (fork-without-exec) processes to the cgroup root before rmdir to avoid EBUSY leaks;
- a startup sweep clears empty agent-* dirs left by a previous SIGKILL, and Drop cleans up on graceful shutdown.

active_weight is clamped to the kernel-valid range; the idle debounce starts only when the runnable set first empties (not on every sleep edge).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
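The aggregate state machine (per-tid runnable set plus idle debounce) can be sketched without any cgroup writes. The `Family` type, its method names and the millisecond timestamps are assumptions for illustration; the sketch starts a family in the ACTIVE state, which the commit message does not specify.

```rust
use std::collections::HashSet;

/// Hypothetical per-family activity state.
#[derive(Debug)]
pub struct Family {
    runnable: HashSet<u32>,      // tids currently runnable
    empty_since_ms: Option<u64>, // when the runnable set first emptied
    idle: bool,
    idle_threshold_ms: u64,
}

impl Family {
    pub fn new(idle_threshold_ms: u64) -> Self {
        Family {
            runnable: HashSet::new(),
            empty_since_ms: None,
            idle: false,
            idle_threshold_ms,
        }
    }

    /// A thread woke up. Returns true on an IDLE -> ACTIVE transition.
    pub fn on_wake(&mut self, tid: u32) -> bool {
        self.runnable.insert(tid);
        self.empty_since_ms = None;
        std::mem::replace(&mut self.idle, false)
    }

    /// A thread blocked. The debounce timer starts only when the runnable
    /// set first becomes empty, not on every sleep edge.
    pub fn on_sleep(&mut self, tid: u32, now_ms: u64) {
        self.runnable.remove(&tid);
        if self.runnable.is_empty() && self.empty_since_ms.is_none() {
            self.empty_since_ms = Some(now_ms);
        }
    }

    /// Periodic tick. Returns true on an ACTIVE -> IDLE transition.
    pub fn on_tick(&mut self, now_ms: u64) -> bool {
        match self.empty_since_ms {
            Some(t) if !self.idle && now_ms.saturating_sub(t) >= self.idle_threshold_ms => {
                self.idle = true;
                true
            }
            _ => false,
        }
    }

    pub fn is_idle(&self) -> bool {
        self.idle
    }
}
```

Because the family goes idle only from `on_tick`, whichever loop drives the monitor has to call it periodically.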
Wire the scheduler end-to-end: dispatch the schedmon probe (enabled only when the scheduler is on), route Event::Sched to on_sched_event, register a process with its Agent family on classification (via lineage find_root) and remove it on exit, and finalize debounced idle transitions from a shared on_idle_tick() called by both the CLI run loop and the FFI driver loop (so the scheduler is not stuck never going idle in embedded mode).

Adds the --enable-scheduler CLI flag and a JSON `scheduler` config block (active_weight, idle_threshold_ms, cgroup_root); warns when the config file's enabled value overrides the CLI flag.

Verified on kernel 6.6.102 with a multithreaded CPU-bound agent: BURST -> ACTIVE within ~10ms, sustained SLEEP -> IDLE within ~150ms, clean per-cycle transitions, no cgroup leaks, no cpu.idle/weight write errors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Distinguishes default zero-intrusion monitoring from opt-in subsystems (e.g. the idle-burst-idle scheduler enabled by --enable-scheduler). Sweeps all three product-summary surfaces under src/agentsight: README.md, README_CN.md (full-width punctuation, plain-language phrasing), and the RPM %description in agentsight.spec.in. NOTE: depends on commit 8a7a036 in this same PR, which introduces the --enable-scheduler flag. Do not split this README change out of alibaba#662 or land it ahead of 8a7a036; --enable-scheduler does not exist on main yet. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cpu.idle toggle is architecturally flawed: userspace round-trip (BPF→daemon→cgroup write) is slower than the kernel's first scheduling decision on wakeup. Worse, cpu.idle=1 sets weight to WEIGHT_IDLEPRIO=3, making the waking agent unable to preempt non-idle tasks until userspace restores cpu.idle=0. Net effect: acceleration anti-pattern.

- Keep: BPF schedmon probes (sched_switch/sched_wakeup), per-tid activity tracking state machine, debounced idle/active transitions, unit tests.
- Remove: all cgroup operations (create/remove/write cpu.idle/cpu.weight/migrate pid), Drop cleanup, sweep_stale_cgroups, active_weight and cgroup_root config fields.
- Add: per-family transition metrics (idle_to_active_count, active_to_idle_count, duration tracking) for capacity planning.

Revert docs commit that scoped the "zero-intrusion" claim; it is no longer needed since the monitor is purely observational.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
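The per-family transition metrics could be accumulated along these lines. Only the two counter names come from the commit message; the `TransitionMetrics` struct, the `active_ms`/`idle_ms` duration fields and the `record` method are assumptions for this sketch, which also assumes a family starts ACTIVE at time 0.

```rust
/// Hypothetical per-family metrics for capacity planning.
#[derive(Debug, Default)]
pub struct TransitionMetrics {
    pub idle_to_active_count: u64,
    pub active_to_idle_count: u64,
    pub active_ms: u64, // total time spent ACTIVE (assumed field)
    pub idle_ms: u64,   // total time spent IDLE (assumed field)
    last_change_ms: u64,
    idle: bool,
}

impl TransitionMetrics {
    /// Record the family's state at `now_ms`; repeated states are ignored.
    pub fn record(&mut self, now_ms: u64, now_idle: bool) {
        if now_idle == self.idle {
            return;
        }
        // Credit the elapsed time to the state that is ending.
        let elapsed = now_ms.saturating_sub(self.last_change_ms);
        if self.idle {
            self.idle_ms += elapsed;
            self.idle_to_active_count += 1;
        } else {
            self.active_ms += elapsed;
            self.active_to_idle_count += 1;
        }
        self.idle = now_idle;
        self.last_change_ms = now_ms;
    }
}
```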
- B1: procmon.bpf.c used get_task_ns_pid() for event->pid but host tgid for ppid, which is inconsistent in containers. Use host tgid (from bpf_get_current_pid_tgid()) for both, matching proctrace convention.
- B2: root agent exit only triggered ProcMonEvent::Exit (not proctrace VariableEvent::Exit), so the lineage tree was never cleaned. Add lineage_tree.remove() in the ProcMonEvent::Exit handler. Also call activity_monitor.remove_process() in the same handler.
- I1: LineageTree::remove() now reparents children to the grandparent instead of orphaning them (mirrors kernel subreaper behavior).

Found via workflow kernel-code cross-reference review against cloud-kernel 6.6 branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
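The reparent-on-remove behavior from I1 can be illustrated with a small tree. This is a sketch, not LineageTree itself: the `Tree` type, its two maps and the accessor names are assumptions, and nodes carry no type or metadata here.

```rust
use std::collections::HashMap;

/// Hypothetical minimal lineage tree: parent links plus child lists.
#[derive(Debug, Default)]
pub struct Tree {
    parent: HashMap<u32, u32>,
    children: HashMap<u32, Vec<u32>>,
}

impl Tree {
    pub fn insert(&mut self, pid: u32, ppid: u32) {
        self.parent.insert(pid, ppid);
        self.children.entry(ppid).or_default().push(pid);
    }

    /// Remove `pid` and hand its children to its parent (their grandparent)
    /// instead of leaving them orphaned.
    pub fn remove(&mut self, pid: u32) {
        let grand = self.parent.remove(&pid);
        // Detach pid from its own parent's child list.
        if let Some(g) = grand {
            if let Some(siblings) = self.children.get_mut(&g) {
                siblings.retain(|&c| c != pid);
            }
        }
        for child in self.children.remove(&pid).unwrap_or_default() {
            match grand {
                Some(g) => {
                    self.parent.insert(child, g);
                    self.children.entry(g).or_default().push(child);
                }
                // No grandparent recorded: the child becomes parentless.
                None => {
                    self.parent.remove(&child);
                }
            }
        }
    }

    pub fn parent_of(&self, pid: u32) -> Option<u32> {
        self.parent.get(&pid).copied()
    }

    pub fn children_of(&self, pid: u32) -> &[u32] {
        self.children.get(&pid).map(Vec::as_slice).unwrap_or(&[])
    }
}
```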
Force-pushed from 7ab3d31 to c599945.
Contributor, Author:
Superseded by #822: rewrote from cgroup-based scheduler to observation-only activity monitor per review feedback.
Summary
Replaces the cgroup-based idle-burst-idle scheduler with a pure observability activity monitor. The BPF scheduling probes are retained; the cgroup actuation layer is removed.

Why the change
Reviewer feedback (执云/智彻) + kernel code analysis identified fundamental flaws in the cpu.idle toggle approach:
- cpu.idle=1 sets weight to WEIGHT_IDLEPRIO=3 (kernel/sched/fair.c:14141), making waking tasks unable to preempt non-idle tasks (line 9087) until userspace restores cpu.idle=0

Correct approach for CPU acceleration/high-density: static cpu.weight in container spec, or future sched_ext BPF scheduler (in-kernel, zero round-trip). ECS already has CONFIG_SCHED_CLASS_EXT=y.

What's kept
- tp_btf/sched_switch + sched_wakeup): per-tid sleep/wakeup detection

What's removed
- active_weight, cgroup_root config fields

Stacked on #661 (lineage tree). Review top commits only.

Test plan