[kernel]Rewrite rt_thread_get_usage to use incremental statistics based on sampling windows. #11256
Rbb666 wants to merge 1 commit into RT-Thread:master
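👋 Thank you for your contribution to RT-Thread! To make sure the code follows the RT-Thread coding conventions, please run the code formatting workflow in your repository (if the formatting CI run fails).
Once this is done, the commit will be updated automatically. If you have any questions, feel free to contact us. Thanks again for your contribution! 💐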
📌 Code Review Assignment
🏷️ Tag: kernel
Reviewers: @GorrayLi @ReviewSun @hamburger-os @lianux-mm @wdfk-prog @xu18838022837
📊 Current Review Status (Last Updated: 2026-03-16 13:29 CST)
@supperthomas @Guozhanxin could you two help take a look?
    #define RT_CPU_USAGE_CALC_INTERVAL_MS 200U
    #endif

    #define RT_CPU_USAGE_CALC_INTERVAL_TICK \
Pull request overview
This PR refactors rt_thread_get_usage() to report per-thread CPU usage using incremental (delta) statistics over a configurable sampling window, instead of using cumulative counters since boot. It also adds per-thread cached usage fields and a Kconfig option to tune the sampling interval.
Changes:
- Reworked rt_thread_get_usage() to update usage periodically using per-CPU counter deltas and cached per-thread results.
- Added per-thread fields to store the previous runtime snapshot and the cached CPU usage percentage.
- Introduced the RT_CPU_USAGE_CALC_INTERVAL_MS Kconfig option (default 200 ms) to control the sampling window.
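The delta-based calculation summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the struct `thread_usage` and the helpers `counter_delta()` and `usage_window_close()` are hypothetical names, and the counters are assumed to be free-running 32-bit tick counts. It shows why unsigned subtraction keeps the per-window delta correct across a single counter wraparound.

```c
#include <stdint.h>

/* Hypothetical per-thread record. The fields mirror the PR's description
 * (previous runtime snapshot plus cached percentage), but the layout and
 * names are assumptions made for this sketch. */
struct thread_usage
{
    uint32_t total_time;      /* cumulative run time, free-running counter  */
    uint32_t total_time_prev; /* snapshot taken at the last window boundary */
    uint8_t  cpu_usage;       /* cached percentage for the last window      */
};

/* Unsigned subtraction is performed modulo 2^32, so the result is still
 * the true elapsed count if the counter wrapped once since 'prev'. */
static uint32_t counter_delta(uint32_t now, uint32_t prev)
{
    return (uint32_t)(now - prev);
}

/* Close one sampling window. 'cpu_delta' is the total CPU time that
 * elapsed in the window (for example user + system + idle). */
static void usage_window_close(struct thread_usage *t, uint32_t cpu_delta)
{
    uint32_t thread_delta = counter_delta(t->total_time, t->total_time_prev);

    if (cpu_delta > 0U)
    {
        /* Widen before multiplying so thread_delta * 100 cannot overflow. */
        uint64_t usage = ((uint64_t)thread_delta * 100U) / cpu_delta;
        t->cpu_usage = (uint8_t)(usage > 100U ? 100U : usage);
    }
    else
    {
        t->cpu_usage = 0U;
    }

    /* The current value becomes the baseline of the next window. */
    t->total_time_prev = t->total_time;
}
```

With a thread that ran 50 ticks in a 200-tick window, `usage_window_close()` caches 25; the same arithmetic holds when `total_time` wraps past zero between two snapshots.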
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/thread.c | Initializes new per-thread CPU usage snapshot/cache fields. |
| src/kservice.c | Implements sampling-window based usage update logic and updates rt_thread_get_usage(). |
| src/Kconfig | Adds configurable CPU usage sampling interval option under RT_USING_CPU_USAGE_TRACER. |
| include/rtdef.h | Extends struct rt_thread with snapshot and cached usage fields. |
    if RT_USING_CPU_USAGE_TRACER
        config RT_CPU_USAGE_CALC_INTERVAL_MS
            int "CPU usage sampling interval (ms)"
            default 200
            range 50 5000
[Best practices]: PR title prefix formatting is slightly off
Project convention expects a lowercase prefix like [module][subsystem] Description (or [module] Description) with a space after the prefix. The current title is [kernel]Rewrite ...; suggested: [kernel] Rewrite rt_thread_get_usage to use incremental statistics based on sampling windows.
    static rt_tick_t _cpu_usage_sample_tick;
    static rt_bool_t _cpu_usage_inited = RT_FALSE;
    static struct rt_cpu_usage_stats _cpu_usage_prev_cpu_stat[RT_CPUS_NR];
[Concurrency]: _cpu_usage_update() updates shared static state without any synchronization
_cpu_usage_sample_tick, _cpu_usage_inited, and _cpu_usage_prev_cpu_stat[] are global statics mutated by _cpu_usage_update(), but no lock or critical section guards concurrent callers. If rt_thread_get_usage() is called from multiple threads/CPUs (e.g., shell, procfs, power code) at the same time, snapshots can be partially updated and produce incorrect usage, or the sampling window can be torn.
Consider protecting the whole update path with a dedicated spinlock (or at least a short critical section) so only one caller performs the snapshot/refresh at a time.
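One way to make sure only one caller refreshes at a time is sketched below. It is a try-lock variant of the suggestion, not the dedicated spinlock itself: a C11 `atomic_flag` stands in for a kernel lock, and callers that lose the race simply keep reading the cached values. The names `usage_lock`, `sample_tick`, `refresh_count`, and `usage_try_update()` are hypothetical and exist only for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a kernel spinlock (assumption: C11 atomics are available). */
static atomic_flag usage_lock = ATOMIC_FLAG_INIT;

static uint32_t sample_tick;   /* tick at which the current window began */
static uint32_t refresh_count; /* number of refreshes done (demo only)   */

/* Returns true if this caller performed the refresh, false if the window
 * has not elapsed yet or another caller is already refreshing. */
static bool usage_try_update(uint32_t now, uint32_t interval)
{
    bool refreshed = false;

    if (atomic_flag_test_and_set_explicit(&usage_lock, memory_order_acquire))
    {
        return false; /* someone else holds the lock: use cached values */
    }

    /* The interval check happens under the lock, so two callers cannot
     * both decide that the same window has expired. */
    if ((uint32_t)(now - sample_tick) >= interval)
    {
        /* Snapshot per-CPU counters and refresh per-thread values here. */
        sample_tick = now;
        refresh_count++;
        refreshed = true;
    }

    atomic_flag_clear_explicit(&usage_lock, memory_order_release);
    return refreshed;
}
```

The trade-off is that a losing caller may return a value that is one window old, which is usually acceptable for a statistic that is already cached between windows.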
            t->total_time_prev = 0U;
            t->cpu_usage = 0U;
        }
        rt_spin_unlock_irqrestore(&info->spinlock, level);

        for (i = 0; i < RT_CPUS_NR; i++)
        {
            _cpu_usage_prev_cpu_stat[i].user = 0U;
            _cpu_usage_prev_cpu_stat[i].system = 0U;
            _cpu_usage_prev_cpu_stat[i].idle = 0U;
        }

        _cpu_usage_sample_tick = rt_tick_get();
        _cpu_usage_inited = RT_TRUE;
    }
[Bug]: First sampling window is initialized to 0, so the first computed “window” is actually since-boot (and the doc says the initial value is 0)
_cpu_usage_snapshot_init() sets all per-CPU previous counters and each thread’s total_time_prev to 0, then _cpu_usage_update() immediately computes deltas against 0 on the first call (because bypass_interval_check is true). This makes the first returned usage reflect cumulative time since boot/thread start, not the “recent sampling window”, and it contradicts the doc note that the initial cached value is 0.
A more consistent approach is to initialize snapshots to the current counters (per-CPU and per-thread) and return without computing usage until the next interval elapses.
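The suggested behavior can be illustrated with a small single-thread model. Everything here is hypothetical (the struct `usage_sampler` and the function `usage_sample()` are not part of the PR): the first call only seeds the snapshots from the current counters and publishes 0, and a percentage is computed only once a full interval has elapsed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sampler state for one thread on one CPU. */
struct usage_sampler
{
    bool     inited;
    uint32_t window_start; /* tick at which the current window began */
    uint32_t cpu_prev;     /* total CPU time at window start         */
    uint32_t thread_prev;  /* thread run time at window start        */
    uint8_t  cached;       /* last published percentage              */
};

static uint8_t usage_sample(struct usage_sampler *s, uint32_t now,
                            uint32_t cpu_total, uint32_t thread_total,
                            uint32_t interval)
{
    if (!s->inited)
    {
        /* Seed from the CURRENT counters, not from zero, so the first
         * computed window never covers the time since boot. */
        s->window_start = now;
        s->cpu_prev     = cpu_total;
        s->thread_prev  = thread_total;
        s->cached       = 0U;
        s->inited       = true;
        return s->cached;
    }

    if ((uint32_t)(now - s->window_start) >= interval)
    {
        uint32_t cpu_delta    = cpu_total - s->cpu_prev;
        uint32_t thread_delta = thread_total - s->thread_prev;

        if (cpu_delta > 0U)
        {
            uint64_t usage = ((uint64_t)thread_delta * 100U) / cpu_delta;
            s->cached = (uint8_t)(usage > 100U ? 100U : usage);
        }
        else
        {
            s->cached = 0U;
        }

        s->window_start = now;
        s->cpu_prev     = cpu_total;
        s->thread_prev  = thread_total;
    }

    return s->cached;
}
```

For a thread that had used 90% of the CPU since boot but only 10% of the most recent 200 ticks, this model reports 0 on the first call and 10 after the first full window, rather than 90.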
    level = rt_spin_lock_irqsave(&info->spinlock);
    for (node = list->next; node != list; node = node->next)
    {
        struct rt_object *obj = rt_list_entry(node, struct rt_object, list);
        struct rt_thread *t = (struct rt_thread *)obj;
        rt_ubase_t total_now = (rt_ubase_t)(t->user_time + t->system_time);
        rt_ubase_t total_delta_now = (rt_ubase_t)(total_now - t->total_time_prev);
        rt_uint64_t thread_delta = (rt_uint64_t)total_delta_now;

        if (total_delta > 0U)
        {
            rt_uint64_t usage = (thread_delta * 100U) / total_delta;
            t->cpu_usage = (rt_uint8_t)(usage > 100U ? 100U : usage);
        }
        else
        {
            t->cpu_usage = 0U;
        }

        t->total_time_prev = total_now;
    }
    rt_spin_unlock_irqrestore(&info->spinlock, level);
[Performance]: Holding the thread object spinlock (IRQ-save) while doing per-thread 64-bit division over the whole thread list
_cpu_usage_refresh_threads() holds info->spinlock with rt_spin_lock_irqsave() while iterating every thread and performing a 64-bit division for each entry. On 32-bit MCUs this division is relatively expensive, and keeping IRQs disabled for O(nthreads) work can hurt real-time latency.
Consider shortening the critical section (e.g., collect thread pointers/snapshots in batches under the lock like list_thread() does, then do the math outside the lock and only briefly lock to write back fields).
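A rough model of the batched approach is shown below. It is only a sketch under simplifying assumptions: the thread list is a plain array, `lock_acquire()` and `lock_release()` are stand-ins for the IRQ-saving spinlock, and `USAGE_BATCH`, `struct thread_rec`, and `usage_refresh_batched()` are hypothetical names. It also ignores threads created or deleted between phases, which a real implementation would have to handle.

```c
#include <stddef.h>
#include <stdint.h>

#define USAGE_BATCH 8U /* hypothetical batch size */

struct thread_rec
{
    uint32_t total_time;      /* cumulative run time                  */
    uint32_t total_time_prev; /* snapshot from the previous window    */
    uint8_t  cpu_usage;       /* cached percentage                    */
};

/* Stand-ins for the spinlock; they only track how deeply it is held. */
static unsigned lock_depth;
static void lock_acquire(void) { lock_depth++; }
static void lock_release(void) { lock_depth--; }

static void usage_refresh_batched(struct thread_rec *threads, size_t n,
                                  uint32_t cpu_delta)
{
    uint32_t delta[USAGE_BATCH];
    uint8_t  usage[USAGE_BATCH];
    size_t   base, i, count;

    for (base = 0; base < n; base += USAGE_BATCH)
    {
        count = (n - base < USAGE_BATCH) ? (n - base) : USAGE_BATCH;

        /* Phase 1: short critical section, copy and advance counters. */
        lock_acquire();
        for (i = 0; i < count; i++)
        {
            struct thread_rec *t = &threads[base + i];
            delta[i] = t->total_time - t->total_time_prev;
            t->total_time_prev = t->total_time;
        }
        lock_release();

        /* Phase 2: the 64-bit divisions run with the lock released. */
        for (i = 0; i < count; i++)
        {
            uint64_t u = 0U;
            if (cpu_delta > 0U)
            {
                u = ((uint64_t)delta[i] * 100U) / cpu_delta;
            }
            usage[i] = (uint8_t)(u > 100U ? 100U : u);
        }

        /* Phase 3: brief critical section to publish the results. */
        lock_acquire();
        for (i = 0; i < count; i++)
        {
            threads[base + i].cpu_usage = usage[i];
        }
        lock_release();
    }
}
```

The lock is now held only for simple loads and stores over at most `USAGE_BATCH` entries at a time, so the worst-case IRQ-off time no longer grows with the total number of threads.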
    #ifdef RT_USING_CPU_USAGE_TRACER
        thread->user_time = 0;
        thread->system_time = 0;
[Maintainability]: Duplicate initialization of user_time/system_time when RT_USING_SMART is enabled
thread->user_time and thread->system_time are already initialized earlier inside the RT_USING_SMART block in this function. Initializing them again under RT_USING_CPU_USAGE_TRACER is redundant for configurations where both are enabled, and increases the chance of future divergence.
Consider consolidating these initializations so each field is set in only one place for a given build configuration.
Suggested change:
    -#ifdef RT_USING_CPU_USAGE_TRACER
    -    thread->user_time = 0;
    -    thread->system_time = 0;
    +#ifdef RT_USING_CPU_USAGE_TRACER
    +#ifndef RT_USING_SMART
    +    thread->user_time = 0;
    +    thread->system_time = 0;
    +#endif /* !RT_USING_SMART */





PR description:
[
Refactor rt_thread_get_usage into incremental statistics based on sampling windows (including 32/64-bit wraparound safety and a trimmed set of structure fields).
Tested on: qemu-a9
]
Intent for your PR
Choose one (Mandatory):
Code Quality:
As part of this pull request, I've considered the following:
All redundant code is removed and cleaned up (no #if 0 code and no commented-out code)