Feat/cost web by kooksee · Pull Request #35 · pubgo/dix

kooksee · 2026-03-24T03:26:22Z

Summary by CodeRabbit

New Features
- Added context-aware dependency injection APIs
- Introduced provider execution timeout and slow provider threshold configuration options
- Added four new HTTP diagnostic endpoints: runtime statistics, recent errors, diagnostic file queries, and trace event timeline
- Implemented error tracking with recent error history
- Added in-memory and file-based trace event logging with querying capabilities
Documentation
- Added configuration guidance for provider timeouts and slow provider warnings
- Documented trace logging and diagnostic collection via environment variables

- Introduced ProviderRuntimeStats struct to capture metrics such as call count, total duration, average duration, last duration, last error, and last run timestamp. - Enhanced GetProviderDetails to include function file and line number for better debugging. - Implemented GetProviderRuntimeStats method to return sorted runtime statistics for all providers. - Added recordProviderStat method to track provider execution metrics. - Updated the Dix struct to maintain provider statistics. - Modified error handling to optionally print stack traces based on log level. - Added tests for runtime statistics retrieval and validation of provider initialization. - Created new server runtime test file to validate HTTP endpoint for runtime stats.

… logging - Implemented a new tracing system in `dixtrace` package to manage spans and events. - Added `RunDetachedSpan` and `BeginSpan` functions for span creation and management. - Introduced `MemorySink` and `FileSink` for event storage and querying. - Created tests for tracing functionality, ensuring correct span relationships and event logging. - Updated HTTP server logs to include new trace endpoint.

coderabbitai · 2026-03-24T03:26:29Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7825b9d9-4621-491e-aff3-37e02e0bf5a6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/cost-web

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-03-24T03:27:41Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the diagnostic and observability capabilities of the dix dependency injection framework. It introduces mechanisms to monitor provider execution times, detect slow or timed-out providers, and collect detailed trace and error information. These diagnostics are accessible both through console logging, a new JSONL diagnostic file, and a comprehensive HTTP API, making it easier to troubleshoot and understand dependency resolution flows during application startup and runtime. The changes aim to provide developers with deeper insights into the DI container's behavior, improving debugging and performance analysis.

Highlights

Enhanced Diagnostic Capabilities: Introduced new diagnostic features including provider timeout and slow provider warnings, detailed DI trace logging, diagnostic file collection, and in-memory trace querying for deeper insights into dependency resolution.
New HTTP API Endpoints: The dixhttp module now exposes new HTTP API endpoints (/api/runtime-stats, /api/errors, /api/diagnostics, /api/trace) to access these diagnostic data, facilitating real-time monitoring and troubleshooting.
Context-Aware Injection: Added InjectContext and InjectTContext methods to allow dependency injection operations to propagate context.Context, enabling better integration with tracing and request-scoped concerns.
Improved Error Reporting: Enhanced internal error handling and reporting to provide more granular information, including specific error types, root causes, and actionable hints, which are also exposed via the new diagnostic APIs.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces comprehensive diagnostic and tracing capabilities to the dependency injection framework. Key changes include adding provider timeout and slow provider warning options, implementing context-aware injection methods, and introducing a new dixtrace package for in-memory event tracing. It also adds support for collecting detailed diagnostic records to a file (DIX_DIAG_FILE) and exposes new HTTP API endpoints in dixhttp for querying runtime statistics, recent errors, diagnostic files, and trace timelines. The documentation has been updated to reflect these new features, and the InjectContext function was corrected to prevent silent failures when injecting into struct values by requiring a pointer.

dix.go

kooksee · 2026-03-24T05:45:23Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive set of features for diagnostics, tracing, and observability. This includes provider timeouts, slow provider warnings, detailed error recording, and a new dixtrace package for lightweight tracing. The dixhttp web UI is enhanced with new endpoints to visualize this data. The changes are extensive and well-tested. I have a few suggestions regarding code consistency, particularly with hardcoded language strings and file handling behavior, to improve maintainability and user experience.

gemini-code-assist · 2026-03-24T05:48:48Z

dixinternal/dix.go

+func buildErrorHint(operation, stage string, timedOut bool) string {
+	if timedOut {
+		return "provider 初始化超时：可先优化初始化逻辑，或临时调大 WithProviderTimeout；排查时可用 /api/errors 与 /api/runtime-stats 联动定位"
+	}
+
+	switch operation {
+	case "provide", "try_provide":
+		return "检查 provider 签名（必须是函数且返回值数量合法）；若不希望中断启动，请优先用 TryProvide 并继续启动 Web 诊断页"
+	case "inject", "try_inject":
+		switch stage {
+		case "cycle_check":
+			return "检测到循环依赖：请拆分相互引用的组件，或改为接口/延迟注入；可用 /api/dependencies 观察依赖环"
+		default:
+			return "注入失败：先确认依赖是否已注册、类型是否一致；生产路径建议改用 TryInject 避免进程直接退出"
+		}
+	case "provider_execute":
+		switch stage {
+		case "resolve_input":
+			return "provider 输入依赖未解析：检查对应输入类型是否有 provider、命名空间(map/list)是否匹配、是否遗漏导入"
+		case "return_error":
+			return "provider 主动返回 error：优先检查外部资源可用性（DB/Redis/HTTP）与配置参数；必要时在 provider 内增加上下文日志"
+		default:
+			return "provider 执行失败：查看 provider_function、output_type 与 input_types 定位具体构造链路"
+		}
+	default:
+		return "可先查看 /api/errors 获取最近错误详情，并结合 /api/runtime-stats 定位慢/失败 provider"
+	}
+}


The error hints in buildErrorHint are hardcoded in Chinese. While helpful, this is inconsistent with the rest of the library's code and documentation, which is in English. This could pose a challenge for non-Chinese speaking users and for future maintenance. It would be better to either provide these hints in English or implement a simple internationalization (i18n) mechanism.

gemini-code-assist · 2026-03-24T05:48:48Z

example/http/main.go

+		logStartupScenarioResult(di, "provider_timeout", err, before)
+	}
+
+	// 循环依赖演示使用临时容器，避免污染主容器导致后续所有注入失败。


This comment is in Chinese, while the rest of the code and comments in this file are in English. For consistency and to make the example accessible to a wider audience, it would be better to write comments in English.

gemini-code-assist · 2026-03-24T05:48:48Z

dixtrace/trace.go

+	if dir != "." && dir != "" {
+		_ = os.MkdirAll(dir, 0o755)
+	}
+	fd, err := os.OpenFile(f.path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644)


The FileSink for DIX_TRACE_FILE truncates the file on creation (O_TRUNC), while the sink for DIX_DIAG_FILE (in dixinternal/diag_file.go) appends to its file. This inconsistency can be surprising and lead to data loss if a user expects trace files to be appended across application restarts. It would be better to use O_APPEND for consistency with DIX_DIAG_FILE, or clearly document this truncating behavior.

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

dixhttp/README.md (1)

405-414: ⚠️ Potential issue | 🟡 Minor

Markdown syntax issue: unclosed code block.

The JSON code block for /api/type/{typeName}?depth=2 (starting around line 395) is not properly closed before the next ### GET /api/group-rules section. There's also an extra closing triple-backtick at line 414 that appears orphaned.

📝 Suggested fix

   ]
 }
+```
 
 ### GET `/api/group-rules`
 Returns backend-registered group rules (used as UI defaults)
 
 ```json
 [
   {"name": "service", "prefixes": ["github.com/acme/app/service"]}
 ]

-```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @dixhttp/README.md around lines 405 - 414, The README has an unclosed JSON
code block for the GET /api/type/{typeName}?depth=2 section and an orphaned
triple-backtick before the ### GET /api/group-rules heading; close the earlier
JSON block properly (add the missing json block terminator) and remove the extra standalone that follows the ### GET /api/group-rules section so both
code blocks are correctly delimited and the headings render properly.


</details>

</blockquote></details>
<details>
<summary>dixhttp/README_zh.md (1)</summary><blockquote>

`386-395`: _⚠️ Potential issue_ | _🟡 Minor_

**Markdown syntax issue: unclosed code block (same as English README).**

The JSON code block for `/api/type/{typeName}?depth=2` is not properly closed before the `/api/group-rules` section, and there's an orphaned closing triple-backtick at line 395.


<details>
<summary>📝 Suggested fix</summary>

```diff
   ]
 }
+```
 
 ### GET `/api/group-rules`
 返回后端注册的分组清单（前端默认配置）
 
 ```json
 [
   {"name": "service", "prefixes": ["github.com/acme/app/service"]}
 ]
 ```
-```
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@dixhttp/README_zh.md` around lines 386 - 395, There is an unclosed code block
in README_zh.md: close the JSON fenced code block introduced in the
`/api/type/{typeName}?depth=2` section so it does not bleed into the subsequent
`GET /api/group-rules` section; locate the code fences around the JSON example
(the triple-backtick starting the JSON block) and add the matching closing
triple-backtick immediately after the JSON example, ensuring the `GET
/api/group-rules` heading and its JSON example are outside that block.
```

</details>

</blockquote></details>
<details>
<summary>dixinternal/provider.go (1)</summary><blockquote>

`52-70`: _⚠️ Potential issue_ | _🟡 Minor_

**Consider bilingual or English-only hint message.**

The `hint` field on line 69 is in Chinese only. For international users, this may not be helpful. Consider providing an English version or bilingual message.


<details>
<summary>📝 Suggested fix</summary>

```diff
-				"hint", "provider 内部发生 panic：建议在 provider 内捕获异常并返回 error；可临时开启 debug 日志查看堆栈")
+				"hint", "provider panicked: catch exceptions inside provider and return error; enable debug logging for stack trace")
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@dixinternal/provider.go` around lines 52 - 70, The panic handler's
logger.Error call currently uses a Chinese-only hint string; change it to a
bilingual (Chinese + English) or English-only message and wire that into the
existing logger call: update the hint value in the defer recover block (the
anonymous func that calls maybePrintStack and sets err), e.g., replace the
Chinese-only literal passed as the "hint" field with a new variable or literal
containing both languages (or English), ensuring you preserve surrounding fields
(error, error_type from buildErrorType, fn_name via GetFnName(n.fn), fn_type,
input_data, etc.) so the log entry remains structured and only the hint text is
internationalized.
```

</details>

</blockquote></details>

</blockquote></details>

🧹 Nitpick comments (4)

dixinternal/option.go (1)
34-45: Clarification: Merge vs direct option application.

The Merge method's > 0 guard is intentional - it prevents merging zero values from overwriting existing non-zero values. However, since newDix applies options directly via the Option function type (not through Merge), WithProviderTimeout(0) correctly disables timeout by directly setting opts.ProviderTimeout = 0 after defaults are applied.

This design is correct but subtle. Consider adding a comment to Merge explaining when it's used vs direct option application.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dixinternal/option.go` around lines 34 - 45, The Merge method on Options uses
>0 guards to avoid overwriting existing non-zero values, which is intentional
and different from the direct Option application used by constructors like
newDix where Option functions (e.g., WithProviderTimeout) set fields directly
(so WithProviderTimeout(0) can disable timeouts after defaults are applied); add
a concise comment above func (o Options) Merge(opt Options) explaining this
distinction: state that Merge preserves existing non-zero values and is used for
merging two Options structs, whereas Option functions passed to newDix apply
directly and may intentionally set zero values, with references to Merge,
Options, newDix, and WithProviderTimeout for clarity.
dixtrace/trace_test.go (1)
144-160: Consider verifying truncated content is valid JSONL.

The test confirms the seed content is removed but doesn't verify the written event is valid JSON. This could miss encoding issues in FileSink.
💡 Optional: Add content format verification
 	if strings.Contains(string(b), "seed\n") {
 		t.Fatalf("expected existing content to be truncated, got: %q", string(b))
 	}
+	if !strings.Contains(string(b), `"event":"span.start"`) {
+		t.Fatalf("expected valid JSON event, got: %q", string(b))
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dixtrace/trace_test.go` around lines 144 - 160, Update
TestFileSinkAlwaysTruncatesExistingFile to not only assert the seed content was
removed but also validate that the new file contains valid JSONL for the written
event: after reading the file (using the existing os.ReadFile result), split
into lines and unmarshal the first non-empty line into the Event struct (from
the diff) and assert fields like ID==1, OccurredAt==1 and Event=="span.start";
this ensures NewFileSink/FileSink output is valid JSONL and guards against
encoding issues.
dixinternal/diag_file.go (2)
29-58: Consider consolidating duplicate struct definitions.

diagFileRecord and DiagFileRecord are identical except for visibility. Consider using a single exported type or embedding to reduce duplication.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dixinternal/diag_file.go` around lines 29 - 58, The two structs
diagFileRecord and DiagFileRecord are identical; remove the duplicate by keeping
a single canonical definition (prefer keeping the exported DiagFileRecord) and
replace the other with a type alias (e.g., add "type diagFileRecord =
DiagFileRecord") so callers within the package can still use the unexported
name; update any references to the removed struct name to use the alias or the
exported type and ensure JSON tags remain unchanged.
345-425: Memory concern: Entire file loaded before filtering.

ReadDiagFileRecords reads all matching records into memory before sorting and paginating. For large diagnostic files, this could cause memory pressure. Consider streaming/early-termination optimizations for production use.
💡 Consider adding a hard cap or streaming approach

For now, you could add a pre-filter cap to limit memory usage:
 	lineNo := 0
 	all := make([]DiagFileRecord, 0, 256)
+	const maxRecordsToLoad = 50000 // Hard cap to prevent OOM
 	for scanner.Scan() {
+		if len(all) >= maxRecordsToLoad {
+			break
+		}
 		lineNo++
This is a tradeoff - it may miss older records, but prevents unbounded memory growth.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dixinternal/diag_file.go` around lines 345 - 425, The current implementation
accumulates all matching records in the slice all before sorting/paginating
which can OOM on large files; change ReadDiagFileRecords to stream and keep only
up to limit matching records using a min-heap (priority queue) keyed by
OccurredAt (and RecordID as tiebreaker) so you maintain the top-N most recent
records without storing everything. Concretely: in the scanner loop increment a
match counter for result.Total, push each matching rec into a fixed-size
min-heap (use a small struct comparator based on rec.OccurredAt then RecordID),
and if heap size > limit pop the smallest; after scanning drain heap into a
slice, sort that slice descending (same comparator used today), set
result.Records, result.Returned, and result.NextBefore from the last returned
RecordID; keep error handling and normalizeDiagRecordID/diagRecordMatchesQuery
calls as-is and use defaultDiagReadLimit(query.Limit) for heap capacity.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dixinternal/api.go`:
- Around line 255-305: GetProviderRuntimeStats (and similarly GetRecentErrors)
reads dix.providerStats and dix.recentErrors concurrently while
recordProviderStat and recordRecentErrorWithContext mutate them; snapshot these
structures under a Dix RWMutex to avoid concurrent map/slice panics and torn
snapshots: add an RWMutex field to Dix (e.g., statsMu), acquire a read lock in
GetProviderRuntimeStats, copy dix.providerStats (map) and any slices you iterate
(and recentErrors in GetRecentErrors) into local variables while holding the
lock, release the lock and then build/sort the ProviderRuntimeStats from the
local copies; ensure recordProviderStat and recordRecentErrorWithContext use the
same mutex for writes (write lock) when mutating providerStats/recentErrors.

In `@dixinternal/dix.go`:
- Around line 412-423: The provider runtime stat is using the expanded
registration type (p.output.typ) which can differ from the function's declared
return type; update the stat creation in the block that sets
dix.providerStats[p.fn] to use the function's declared return type obtained from
p.fn.Type().Out(0).String() for OutputType instead of p.output.typ.String(),
keeping other fields (FunctionName via GetFnName(p.fn) and map insertion)
unchanged so providerRuntimeStat reflects the original function signature.

In `@dixinternal/provider.go`:
- Around line 76-97: The timeout implementation in providerFn.callWithTimeout
currently leaves the goroutine running after the caller unblocks, so document
this behavior and recommend cooperative cancellation: update the
WithProviderTimeout option's documentation (and any public docs for provider
timeouts) to state that timeouts only unblock the caller and do not cancel or
stop the provider goroutine, that side effects and resource usage may continue,
and advise implementers to accept a context or implement their own
cancellation-aware logic; also add a note suggesting a future enhancement to add
a context parameter to provider functions for cooperative cancellation
(reference providerFn.callWithTimeout and the WithProviderTimeout option in your
docs).

In `@dixtrace/trace.go`:
- Around line 331-337: The directory and file creation in the trace persistence
code use world-readable perms; update the mkdir and file permissions to be
private and enforce them after creation: when creating the dir for f.path, use
0700 instead of 0755 (MkdirAll), open the file with flags as before but use 0600
instead of 0644, and immediately call Chmod(0o600) on the returned fd (or the
created path) to ensure permissions apply regardless of umask; apply the same
change pattern used around f.path/OpenFile/Chmod in dixtrace/trace.go and mirror
it in dixinternal/diag_file.go for the Event persistence logic.

---

Outside diff comments:
In `@dixhttp/README_zh.md`:
- Around line 386-395: There is an unclosed code block in README_zh.md: close
the JSON fenced code block introduced in the `/api/type/{typeName}?depth=2`
section so it does not bleed into the subsequent `GET /api/group-rules` section;
locate the code fences around the JSON example (the triple-backtick starting the
JSON block) and add the matching closing triple-backtick immediately after the
JSON example, ensuring the `GET /api/group-rules` heading and its JSON example
are outside that block.

In `@dixhttp/README.md`:
- Around line 405-414: The README has an unclosed JSON code block for the GET
`/api/type/{typeName}?depth=2` section and an orphaned triple-backtick before
the `### GET /api/group-rules` heading; close the earlier JSON block properly
(add the missing ```json block terminator) and remove the extra standalone ```
that follows the `### GET /api/group-rules` section so both code blocks are
correctly delimited and the headings render properly.

In `@dixinternal/provider.go`:
- Around line 52-70: The panic handler's logger.Error call currently uses a
Chinese-only hint string; change it to a bilingual (Chinese + English) or
English-only message and wire that into the existing logger call: update the
hint value in the defer recover block (the anonymous func that calls
maybePrintStack and sets err), e.g., replace the Chinese-only literal passed as
the "hint" field with a new variable or literal containing both languages (or
English), ensuring you preserve surrounding fields (error, error_type from
buildErrorType, fn_name via GetFnName(n.fn), fn_type, input_data, etc.) so the
log entry remains structured and only the hint text is internationalized.

---

Nitpick comments:
In `@dixinternal/diag_file.go`:
- Around line 29-58: The two structs diagFileRecord and DiagFileRecord are
identical; remove the duplicate by keeping a single canonical definition (prefer
keeping the exported DiagFileRecord) and replace the other with a type alias
(e.g., add "type diagFileRecord = DiagFileRecord") so callers within the package
can still use the unexported name; update any references to the removed struct
name to use the alias or the exported type and ensure JSON tags remain
unchanged.
- Around line 345-425: The current implementation accumulates all matching
records in the slice all before sorting/paginating which can OOM on large files;
change ReadDiagFileRecords to stream and keep only up to limit matching records
using a min-heap (priority queue) keyed by OccurredAt (and RecordID as
tiebreaker) so you maintain the top-N most recent records without storing
everything. Concretely: in the scanner loop increment a match counter for
result.Total, push each matching rec into a fixed-size min-heap (use a small
struct comparator based on rec.OccurredAt then RecordID), and if heap size >
limit pop the smallest; after scanning drain heap into a slice, sort that slice
descending (same comparator used today), set result.Records, result.Returned,
and result.NextBefore from the last returned RecordID; keep error handling and
normalizeDiagRecordID/diagRecordMatchesQuery calls as-is and use
defaultDiagReadLimit(query.Limit) for heap capacity.

In `@dixinternal/option.go`:
- Around line 34-45: The Merge method on Options uses >0 guards to avoid
overwriting existing non-zero values, which is intentional and different from
the direct Option application used by constructors like newDix where Option
functions (e.g., WithProviderTimeout) set fields directly (so
WithProviderTimeout(0) can disable timeouts after defaults are applied); add a
concise comment above func (o Options) Merge(opt Options) explaining this
distinction: state that Merge preserves existing non-zero values and is used for
merging two Options structs, whereas Option functions passed to newDix apply
directly and may intentionally set zero values, with references to Merge,
Options, newDix, and WithProviderTimeout for clarity.

In `@dixtrace/trace_test.go`:
- Around line 144-160: Update TestFileSinkAlwaysTruncatesExistingFile to not
only assert the seed content was removed but also validate that the new file
contains valid JSONL for the written event: after reading the file (using the
existing os.ReadFile result), split into lines and unmarshal the first non-empty
line into the Event struct (from the diff) and assert fields like ID==1,
OccurredAt==1 and Event=="span.start"; this ensures NewFileSink/FileSink output
is valid JSONL and guards against encoding issues.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 091174f1-f87a-40fb-bc02-08a187a18249

📥 Commits

Reviewing files that changed from the base of the PR and between cbd40cc and a425689.

📒 Files selected for processing (23)

.gitignore
README.md
README_zh.md
dix.go
dixhttp/README.md
dixhttp/README_zh.md
dixhttp/server.go
dixhttp/server_runtime_test.go
dixhttp/template.html
dixinternal/api.go
dixinternal/diag_file.go
dixinternal/diag_file_query_test.go
dixinternal/dix.go
dixinternal/dix_test.go
dixinternal/logger.go
dixinternal/logger_test.go
dixinternal/option.go
dixinternal/provider.go
dixinternal/trace_chain_test.go
dixinternal/util.go
dixtrace/trace.go
dixtrace/trace_test.go
example/http/main.go

coderabbitai · 2026-03-24T06:16:55Z

dixinternal/api.go

+func (dix *Dix) GetProviderRuntimeStats() []ProviderRuntimeStats {
+	stats := make([]ProviderRuntimeStats, 0, len(dix.providers))
+	seen := make(map[reflect.Value]bool)
+
+	for _, providerList := range dix.providers {
+		for _, p := range providerList {
+			if p == nil || seen[p.fn] {
+				continue
+			}
+			seen[p.fn] = true
+
+			outputType := ""
+			if p.output != nil && p.output.typ != nil {
+				outputType = p.output.typ.String()
+			}
+
+			item := ProviderRuntimeStats{
+				FunctionName: GetFnName(p.fn),
+				OutputType:   outputType,
+			}
+
+			if s, ok := dix.providerStats[p.fn]; ok && s != nil {
+				avg := time.Duration(0)
+				if s.CallCount > 0 {
+					avg = s.TotalDuration / time.Duration(s.CallCount)
+				}
+				item.FunctionName = s.FunctionName
+				if s.OutputType != "" {
+					item.OutputType = s.OutputType
+				}
+				item.CallCount = s.CallCount
+				item.TotalDuration = s.TotalDuration
+				item.AverageDuration = avg
+				item.LastDuration = s.LastDuration
+				item.LastError = s.LastError
+				item.LastRunAtUnixNano = s.LastRunAt.UnixNano()
+			}
+
+			stats = append(stats, item)
+		}
+	}
+
+	sort.Slice(stats, func(i, j int) bool {
+		if stats[i].TotalDuration == stats[j].TotalDuration {
+			return stats[i].FunctionName < stats[j].FunctionName
+		}
+		return stats[i].TotalDuration > stats[j].TotalDuration
+	})
+
+	return stats
+}


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the Dix struct definition and understand its current fields fd "dix.go" dixinternal/ --exec head -100 {}

Repository: pubgo/dix

Length of output: 2288

🏁 Script executed:

# Search for the Dix struct definition to see if there's already a mutex rg "type Dix struct" -A 30 dixinternal/

Repository: pubgo/dix

Length of output: 1403

🏁 Script executed:

# Look at the second getter around lines 309-342 sed -n '309,342p' dixinternal/api.go

Repository: pubgo/dix

Length of output: 1001

🏁 Script executed:

# Search for recordProviderStat and recordRecentErrorWithContext implementations rg "func.*recordProviderStat|func.*recordRecentErrorWithContext" -A 10 dixinternal/

Repository: pubgo/dix

Length of output: 995

🏁 Script executed:

# Search for HTTP endpoint handlers for /api/runtime-stats and /api/errors rg "runtime-stats|/api/errors" -r dixinternal/

Repository: pubgo/dix

Length of output: 1418

🏁 Script executed:

# Get the full implementation of recordProviderStat rg "func $dix \*Dix$ recordProviderStat" -A 30 dixinternal/

Repository: pubgo/dix

Length of output: 1368

🏁 Script executed:

# Get the full implementation of recordRecentErrorWithContext rg "func $dix \*Dix$ recordRecentErrorWithContext" -A 40 dixinternal/

Repository: pubgo/dix

Length of output: 1981

🏁 Script executed:

# Check the HTTP handlers rg "HandleRuntimeStats|HandleErrors" -A 15 dixhttp/

Repository: pubgo/dix

Length of output: 5412

🏁 Script executed:

# Search for where recordProviderStat is called to see if there's any locking context rg "recordProviderStat" dixinternal/ -B 3 -A 1

Repository: pubgo/dix

Length of output: 1398

Snapshot diagnostics under a lock before serving them.

GetProviderRuntimeStats() and GetRecentErrors() read from dix.providerStats and dix.recentErrors while recordProviderStat() and recordRecentErrorWithContext() mutate them on every provider execution and injection failure. With HTTP handlers calling these getters concurrently, concurrent map reads and slice appends can trigger Go's concurrent map/slice panic or return torn snapshots. Add an RWMutex on Dix and copy both structures under the read lock before building the response.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dixinternal/api.go` around lines 255 - 305, GetProviderRuntimeStats (and similarly GetRecentErrors) reads dix.providerStats and dix.recentErrors concurrently while recordProviderStat and recordRecentErrorWithContext mutate them; snapshot these structures under a Dix RWMutex to avoid concurrent map/slice panics and torn snapshots: add an RWMutex field to Dix (e.g., statsMu), acquire a read lock in GetProviderRuntimeStats, copy dix.providerStats (map) and any slices you iterate (and recentErrors in GetRecentErrors) into local variables while holding the lock, release the lock and then build/sort the ProviderRuntimeStats from the local copies; ensure recordProviderStat and recordRecentErrorWithContext use the same mutex for writes (write lock) when mutating providerStats/recentErrors.

coderabbitai · 2026-03-24T06:16:55Z

dixinternal/dix.go

+	stat, ok := dix.providerStats[p.fn]
+	if !ok {
+		outputType := ""
+		if p.output != nil && p.output.typ != nil {
+			outputType = p.output.typ.String()
+		}
+		stat = &providerRuntimeStat{
+			FunctionName: GetFnName(p.fn),
+			OutputType:   outputType,
+		}
+		dix.providerStats[p.fn] = stat
+	}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the relevant files find . -type f -name "*.go" | grep -E "(dix|provider)" | head -20

Repository: pubgo/dix

Length of output: 543

🏁 Script executed:

# Look at the structure of the dix.go file around the lines mentioned wc -l dixinternal/dix.go

Repository: pubgo/dix

Length of output: 76

🏁 Script executed:

# Examine the provider-related code and registration logic rg "type.*Provider" dixinternal/ -A 5 | head -100

Repository: pubgo/dix

Length of output: 1744

🏁 Script executed:

# Look for how p.output.typ is set and what it represents rg "output\.typ" dixinternal/ -B 3 -A 3

Repository: pubgo/dix

Length of output: 3120

🏁 Script executed:

# Check the GetProviderRuntimeStats function mentioned in the comment rg "GetProviderRuntimeStats" dixinternal/ -A 20

Repository: pubgo/dix

Length of output: 2884

🏁 Script executed:

# Let's examine the provider structure and understand what p.fn and p.output are sed -n '1,150p' dixinternal/provider.go | head -100

Repository: pubgo/dix

Length of output: 2475

🏁 Script executed:

# Look at the full provider struct definition rg "type.*provider.*struct" dixinternal/ -A 20

Repository: pubgo/dix

Length of output: 3663

🏁 Script executed:

# Examine how providers with slice/map returns are registered sed -n '700,900p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 7270

🏁 Script executed:

# Check what GetFnName and related functions do rg "func GetFnName" dixinternal/ -A 5

Repository: pubgo/dix

Length of output: 297

🏁 Script executed:

# Look for any struct-return provider expansion logic rg "struct" dixinternal/dix.go | head -20

Repository: pubgo/dix

Length of output: 1233

🏁 Script executed:

# Examine the complete registration logic for different output types sed -n '600,750p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 4038

🏁 Script executed:

# Let's look at the complete registration logic to see how providers are registered sed -n '500,650p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 4293

🏁 Script executed:

# Check if there are tests for slice/map/struct provider registration rg "func.*\[\].*{" dixinternal/dix_test.go -B 2 -A 10 | head -50

Repository: pubgo/dix

Length of output: 142

🏁 Script executed:

# Let's look at the struct-return provider handling sed -n '650,800p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 4777

🏁 Script executed:

# Verify what p.fn.Type().Out(0) would return vs p.output.typ rg -A 30 "case reflect.Slice:" dixinternal/dix.go | head -50

Repository: pubgo/dix

Length of output: 2266

🏁 Script executed:

# Look at the Provide function to understand provider registration completely sed -n '250,450p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 6473

🏁 Script executed:

# Check the struct-return provider registration rg "reflect.Struct" dixinternal/dix.go -B 5 -A 15

Repository: pubgo/dix

Length of output: 4606

🏁 Script executed:

# Look at the API code to see if it has similar issues sed -n '1,100p' dixinternal/api.go

Repository: pubgo/dix

Length of output: 4233

🏁 Script executed:

# Check the complete GetProviderRuntimeStats function rg "func.*GetProviderRuntimeStats" dixinternal/api.go -A 50

Repository: pubgo/dix

Length of output: 1378

🏁 Script executed:

# Look for tests related to runtime stats for aggregate providers rg "GetProviderRuntimeStats\|outputType\|OutputType" dixinternal/dix_test.go -B 3 -A 3

Repository: pubgo/dix

Length of output: 35

🏁 Script executed:

# Verify that p.fn always has at least one output (to validate the safety check necessity) rg "Provide.*func" dixinternal/dix.go -B 5 -A 15 | head -80

Repository: pubgo/dix

Length of output: 2626

🏁 Script executed:

# Check if there's any validation in handleProvide about output count sed -n '300,350p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 2137

🏁 Script executed:

# Check if providers can have zero outputs (validate necessity of NumOut() check) rg "NumOut.*0\|NumOut.*<" dixinternal/dix.go

Repository: pubgo/dix

Length of output: 35

🏁 Script executed:

# Verify that all registered providers have at least one output sed -n '165,200p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 1239

🏁 Script executed:

# Let's check the complete Provide validation to understand if providers can have zero outputs rg "func.*Provide\(" dixinternal/dix.go -A 30 | head -80

Repository: pubgo/dix

Length of output: 1452

🏁 Script executed:

# Check what the actual validation is for providers sed -n '900,1000p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 4287

🏁 Script executed:

# Verify that the provide function validates output count rg "provide.*func" dixinternal/dix.go -B 5 -A 30 | grep -A 30 "func.*provide" | head -50

Repository: pubgo/dix

Length of output: 1724

🏁 Script executed:

# Check the provide method to see if it validates NumOut sed -n '1200,1300p' dixinternal/dix.go

Repository: pubgo/dix

Length of output: 4075

Use the provider's declared return type, not the expanded lookup type.

p.output.typ is the expanded type after registration (element type for slices/maps, field type for structs), not the declared return type. This causes runtime stats to report misleading output types. Use p.fn.Type().Out(0).String() to capture the actual function signature.

Minimal fix for the recorded stat

if !ok { outputType := "" - if p.output != nil && p.output.typ != nil { - outputType = p.output.typ.String() + if p.fn.IsValid() && p.fn.Type().NumOut() > 0 { + outputType = p.fn.Type().Out(0).String() } stat = &providerRuntimeStat{ FunctionName: GetFnName(p.fn), OutputType: outputType, }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dixinternal/dix.go` around lines 412 - 423, The provider runtime stat is using the expanded registration type (p.output.typ) which can differ from the function's declared return type; update the stat creation in the block that sets dix.providerStats[p.fn] to use the function's declared return type obtained from p.fn.Type().Out(0).String() for OutputType instead of p.output.typ.String(), keeping other fields (FunctionName via GetFnName(p.fn) and map insertion) unchanged so providerRuntimeStat reflects the original function signature.

coderabbitai · 2026-03-24T06:16:55Z

dixinternal/provider.go

+func (n providerFn) callWithTimeout(in []reflect.Value, timeout time.Duration) (outputs []reflect.Value, err error, timedOut bool) {
+	if timeout <= 0 {
+		outputs, err = n.call(in)
+		return outputs, err, false
+	}
+
+	resultCh := make(chan providerCallResult, 1)
+	go func() {
+		outputs, callErr := n.call(in)
+		resultCh <- providerCallResult{outputs: outputs, err: callErr}
+	}()
+
+	timer := time.NewTimer(timeout)
+	defer timer.Stop()
+
+	select {
+	case res := <-resultCh:
+		return res.outputs, res.err, false
+	case <-timer.C:
+		return nil, fmt.Errorf("provider execution timeout after %s", timeout), true
+	}
+}


⚠️ Potential issue | 🟠 Major

Provider goroutine continues after timeout - document this behavior.

When a timeout occurs, the goroutine running n.call(in) continues executing in the background. The result is discarded via the buffered channel, but:

The provider may still complete side effects (DB writes, API calls, etc.)

Resources held by the provider won't be released until it completes

Multiple timeouts can accumulate background goroutines

This is a fundamental Go limitation (no goroutine preemption), but users should be aware that timeout doesn't cancel the provider - it only unblocks the caller.

Consider documenting this behavior in the WithProviderTimeout option or adding a context parameter to providers for cooperative cancellation in the future.

Would you like me to draft documentation text explaining this timeout behavior, or open an issue to track adding context-based cancellation support?

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dixinternal/provider.go` around lines 76 - 97, The timeout implementation in providerFn.callWithTimeout currently leaves the goroutine running after the caller unblocks, so document this behavior and recommend cooperative cancellation: update the WithProviderTimeout option's documentation (and any public docs for provider timeouts) to state that timeouts only unblock the caller and do not cancel or stop the provider goroutine, that side effects and resource usage may continue, and advise implementers to accept a context or implement their own cancellation-aware logic; also add a note suggesting a future enhancement to add a context parameter to provider functions for cooperative cancellation (reference providerFn.callWithTimeout and the WithProviderTimeout option in your docs).

coderabbitai · 2026-03-24T06:16:55Z

dixtrace/trace.go

+	dir := filepath.Dir(f.path)
+	if dir != "." && dir != "" {
+		_ = os.MkdirAll(dir, 0o755)
+	}
+	fd, err := os.OpenFile(f.path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644)
+	if err != nil {
+		return nil


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's explore the structure and find the trace.go file find . -name "trace.go" -type f | head -20

Repository: pubgo/dix

Length of output: 72

🏁 Script executed:

# Get the file structure to understand the codebase git ls-files | grep -E "(trace|dixtrace)" | head -20

Repository: pubgo/dix

Length of output: 125

🏁 Script executed:

# Get file size and read the relevant section wc -l ./dixtrace/trace.go

Repository: pubgo/dix

Length of output: 76

🏁 Script executed:

# Read the section around lines 331-337 with context sed -n '320,345p' ./dixtrace/trace.go

Repository: pubgo/dix

Length of output: 496

🏁 Script executed:

# Search for Event type definition to understand what data is traced rg "type Event struct" ./dixtrace/trace.go -A 10

Repository: pubgo/dix

Length of output: 674

🏁 Script executed:

# Look at the Write method to see what's written sed -n '347,370p' ./dixtrace/trace.go

Repository: pubgo/dix

Length of output: 385

🏁 Script executed:

# Search for what data is logged/written rg "Write|Sink" ./dixtrace/trace.go -B 2 -A 5 | head -60

Repository: pubgo/dix

Length of output: 939

🏁 Script executed:

# Get complete Event struct definition sed -n '40,100p' ./dixtrace/trace.go

Repository: pubgo/dix

Length of output: 1260

🏁 Script executed:

# Search for Event struct with more context ast-grep --pattern 'type Event struct { $$$ }'

Repository: pubgo/dix

Length of output: 1674

🏁 Script executed:

# Check for other file creation patterns in the codebase rg "OpenFile|MkdirAll" . --type go -B 2 -A 2

Repository: pubgo/dix

Length of output: 997

🏁 Script executed:

# Check if there are tests that might show what data is logged head -50 ./dixtrace/trace_test.go

Repository: pubgo/dix

Length of output: 1503

🏁 Script executed:

# Search for examples or documentation about the trace data rg "FileSink|MemorySink" . --type go --type md | head -30

Repository: pubgo/dix

Length of output: 1617

Use private file permissions for persisted trace data.

The Event struct includes Message, Error, and arbitrary Attrs, making these files sensitive. Creating directories as 0755 and files as 0644 allows other local users to read diagnostics on shared systems. Change to 0700 for directories and 0600 for files. Call Chmod(0o600) after OpenFile to ensure permissions apply regardless of umask.

Suggested fix

dir := filepath.Dir(f.path) if dir != "." && dir != "" { - _ = os.MkdirAll(dir, 0o755) + _ = os.MkdirAll(dir, 0o700) } - fd, err := os.OpenFile(f.path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644) + fd, err := os.OpenFile(f.path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o600) if err != nil { return nil } + _ = fd.Chmod(0o600) f.f = fd return f.f

Note: Similar permission issue exists in dixinternal/diag_file.go.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dixtrace/trace.go` around lines 331 - 337, The directory and file creation in the trace persistence code use world-readable perms; update the mkdir and file permissions to be private and enforce them after creation: when creating the dir for f.path, use 0700 instead of 0755 (MkdirAll), open the file with flags as before but use 0600 instead of 0644, and immediately call Chmod(0o600) on the returned fd (or the created path) to ensure permissions apply regardless of umask; apply the same change pattern used around f.path/OpenFile/Chmod in dixtrace/trace.go and mirror it in dixinternal/diag_file.go for the Event persistence logic.

kooksee and others added 22 commits March 19, 2026 15:13

feat: 添加 Provider 超时和慢调用警告功能，更新文档和示例

6b23196

feat: 添加默认 provider 超时和慢调用阈值，更新相关文档和测试

532f846

feat: 添加最近注入错误的 API 和前端展示功能，更新相关文档和测试

813c725

feat: 优化模板和文档中的代码格式，提升可读性

99b4e12

feat: 增强错误记录功能，添加提供者执行阶段和根因信息

6e474b1

feat: 添加错误提示信息和增强错误记录功能，优化错误处理逻辑

d635a4b

feat: 增强错误处理，添加错误类型和提示信息，优化错误记录功能

532fea7

feat: 添加机器诊断模式支持，优化日志输出和错误记录功能

473fdcb

feat: 添加依赖追踪日志功能，优化依赖解析和注入过程中的日志记录

fb8d00a

feat: 更新事件速查表格格式，优化可读性

f1aa75e

feat: 增强依赖提供功能，添加详细的追踪日志记录

744c713

feat: 添加诊断文件记录功能，优化错误和追踪事件的日志记录

7154f08

qian'du

1eb96ed

feat: 添加 Provider 详细信息弹窗，优化依赖节点信息展示

c3bca64

feat(trace): 更新追踪功能，优化注入和解析过程中的日志记录

144fbaa

feat(trace): 增强追踪功能，添加缩进模式选择和提供者信息展示

4703203

feat(depth): 优化深度输入功能，支持自定义输入和增减按钮

f504e2c

feat(diagnostic): 移除诊断文件记录功能，优化调用链 Trace 模态框

f602608

feat(trace): 添加深度输入功能，优化调用链 Trace 诊断入口

f626430

feat(trace): 优化调用链 Trace 视图，简化深度输入和增强错误信息展示

5d3fcb5

gemini-code-assist bot reviewed Mar 24, 2026

View reviewed changes

dix.go Show resolved Hide resolved

kooksee added 2 commits March 24, 2026 13:09

feat(api): 优化包和类型详情处理，增强错误处理和分页功能

3cafc6f

feat(trace): 优化链路数据处理，支持树形结构排序和过滤功能

a425689

gemini-code-assist bot reviewed Mar 24, 2026

View reviewed changes

coderabbitai bot reviewed Mar 24, 2026

View reviewed changes

kooksee added 2 commits March 24, 2026 14:19

feat(diag): 移除不必要的诊断记录转换函数，简化代码结构

12a5e3c

feat(trace): 增强文件落盘机制，支持从环境变量配置回退到诊断文件

38accf2

kooksee merged commit 637d99a into v2 Mar 24, 2026
2 checks passed

Conversation

kooksee commented Mar 24, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

gemini-code-assist bot commented Mar 24, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

kooksee commented Mar 24, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

kooksee commented Mar 24, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 24, 2026 •

edited

Loading