Skip to content

[refactor] Semantic Function Clustering Analysis: Two Persistent Outliers + One New Near-Duplicate Pair #3835

@github-actions

Description

@github-actions

Analysis of repository: github/gh-aw-mcpg
Workflow run: §24450105920

Overview

Analysis of 104 non-test Go source files across 24 packages, cataloging 729 functions. The codebase remains well-structured. This run surfaces the same two persistent outliers flagged in #3741 (still unresolved), plus a newly detected near-duplicate pair: parseRateLimitResetHeader / parseRateLimitReset across the server and proxy packages.

Full Report

Function Inventory

By Package

Package Files ~Functions Primary Purpose
internal/auth 2 7 Authentication header parsing, API key generation
internal/cmd 9 22 CLI commands (Cobra), flag registration
internal/config 12 47 Configuration parsing (TOML/JSON), validation
internal/difc 8 40 Decentralized Information Flow Control
internal/envutil 3 5 Environment variable utilities
internal/guard 6 36 Security guards (WASM, NoopGuard, WriteSink, Registry)
internal/httputil 1 1 Shared HTTP helpers
internal/launcher 4 20 Backend process/connection management
internal/logger 15 80 Multi-format logging framework
internal/mcp 6 55 MCP protocol types and connections
internal/middleware 1 8 jq schema processing middleware
internal/oidc 2 6 OIDC token provider
internal/proxy 6 42 GitHub API filtering proxy
internal/server 15 83 HTTP server (routed/unified modes)
internal/strutil 3 4 String utilities
internal/syncutil 1 3 Concurrency utilities
internal/sys 1 4 Container detection
internal/testutil/mcptest 4 ~20 Test utilities and helpers
internal/timeutil 1 1 Time formatting
internal/tracing 2 13 OpenTelemetry tracing
internal/tty 1 2 Terminal detection
internal/version 1 2 Version management

Total: 104 non-test Go files, 729 functions cataloged


Identified Issues

1. Outlier Function: validateGuardPolicies in Wrong File (Persistent — 4th analysis)

Issue: validateGuardPolicies is a config-level validation function sitting in guard_policy.go but belongs in validation.go alongside all other top-level config validators.

  • File: internal/config/guard_policy.go:780

  • Function: func validateGuardPolicies(cfg *Config) error

  • Body:

    func validateGuardPolicies(cfg *Config) error {
        logGuardPolicy.Printf("Validating guard policies: count=%d", len(cfg.Guards))
        for name, guardCfg := range cfg.Guards {
            if guardCfg != nil && guardCfg.Policy != nil {
                if err := ValidateGuardPolicy(guardCfg.Policy); err != nil {
                    return fmt.Errorf("invalid policy for guard '%s': %w", name, err)
                }
            }
        }
        return nil
    }
  • Issue: Takes a *Config argument and validates guard policies across the entire config — the same contract as all other top-level validators in validation.go:

    • validateGatewayConfig (line 455)
    • validateTrustedBots (line 524)
    • validateTOMLStdioContainerization (line 541)
    • validateOpenTelemetryConfig (line 569)
    • validateAuthConfig (line 264)

    Both config_core.go:424 and config_stdin.go:363 call validateGuardPolicies in the same validation cascade as the validation.go functions, yet it lives in a different file.

  • Recommendation: Move validateGuardPolicies to internal/config/validation.go. Replace the logGuardPolicy reference with logValidation (already present in validation.go).

  • Estimated effort: ~30 minutes (move function body; no interface changes; all callers are in the same package)

  • Estimated impact: All config-level validators in one canonical file; improved contributor discoverability.


2. Thin Private Wrapper: truncateForLog in proxy/graphql.go (Persistent)

Issue: truncateForLog is a one-line private wrapper around strutil.TruncateRunes. It adds a semantic label but no logic, and is only called twice in the same file.

  • File: internal/proxy/graphql.go:194
  • Function:
    // truncateForLog truncates s to at most maxRunes runes, for safe debug logging.
    func truncateForLog(s string, maxRunes int) string {
        return strutil.TruncateRunes(s, maxRunes)
    }
  • Call sites (both in graphql.go):
    logGraphQL.Printf("extractSearchQuery: found in variables: %q", truncateForLog(v, 80))
    logGraphQL.Printf("extractSearchQuery: found inline: %q", truncateForLog(m[1], 80))
  • Issue: The wrapper provides no logic beyond strutil.TruncateRunes. The strutil.TruncateRunes function itself is already descriptive. Similar logging elsewhere calls strutil.Truncate / strutil.TruncateRunes directly without a local alias.
  • Recommendation: Remove truncateForLog and call strutil.TruncateRunes directly at the two call sites.
  • Estimated effort: ~5 minutes
  • Estimated impact: Removes a trivial indirection; consistent with the rest of the codebase.

3. Near-Duplicate Rate-Limit-Reset Header Parsers (New)

Issue: Two functions in different packages perform almost identical operations — parsing the X-RateLimit-Reset Unix-timestamp header value into a time.Time.

server package proxy package
File internal/server/circuit_breaker.go:322 internal/proxy/handler.go:452
Name parseRateLimitResetHeader parseRateLimitReset

Implementations:

// server/circuit_breaker.go
func parseRateLimitResetHeader(value string) time.Time {
    if value == "" {
        return time.Time{}
    }
    unix, err := strconv.ParseInt(strings.TrimSpace(value), 10, 64)
    if err != nil {
        return time.Time{}
    }
    return time.Unix(unix, 0)
}

// proxy/handler.go
func parseRateLimitReset(value string) time.Time {
    if value == "" {
        return time.Time{}
    }
    unix, err := strconv.ParseInt(value, 10, 64)
    if err != nil {
        return time.Time{}
    }
    return time.Unix(unix, 0)
}

The only difference is that the server version calls strings.TrimSpace(). Both are unexported and each used only within their respective package.

  • Recommendation: Extract a shared ParseRateLimitResetHeader(value string) time.Time function to internal/httputil/httputil.go (the existing shared-HTTP-helpers package). Both packages already import from internal/*, so no circular dependency would be introduced.
  • Estimated effort: ~30 minutes
  • Estimated impact: Single source of truth for header parsing; server version's TrimSpace behavior would be consistently applied everywhere.

4. withLock Methods — Logger Types (Informational, No Action Needed)

Four logger types each define an identical 3-line withLock method. This remains an accepted Go idiom given the language's lack of method inheritance. No change needed.


Refactoring Recommendations

Priority 1: High Impact, Low Effort

Move validateGuardPolicies to validation.go

  • From: internal/config/guard_policy.go:780
  • To: internal/config/validation.go (alongside other validate* functions)
  • Replace logGuardPolicy with logValidation
  • Effort: ~30 minutes

Extract shared ParseRateLimitResetHeader to internal/httputil

  • Create ParseRateLimitResetHeader(value string) time.Time in internal/httputil/httputil.go
  • Update internal/server/circuit_breaker.go and internal/proxy/handler.go to call the shared function
  • Apply the strings.TrimSpace from the server version (defensive improvement)
  • Effort: ~30 minutes

Priority 2: Low Impact, Trivial Effort

Remove truncateForLog wrapper in proxy/graphql.go

  • Replace 2 call sites with strutil.TruncateRunes(x, 80) directly
  • Delete the wrapper function (3 lines)
  • Effort: ~5 minutes

Implementation Checklist

  • Move validateGuardPolicies from guard_policy.go to validation.go (use logValidation)
  • Extract ParseRateLimitResetHeader to internal/httputil/httputil.go; update both call sites
  • Remove truncateForLog wrapper; call strutil.TruncateRunes directly at 2 sites in graphql.go
  • Run make agent-finished after changes to verify build and tests pass

Analysis Metadata

  • Total Go Files Analyzed: 104 (excluding test files)
  • Total Functions Cataloged: 729
  • Packages Examined: 24
  • Outliers Found: 1 (validateGuardPolicies in wrong file — persistent from prior runs)
  • Thin Wrappers Found: 1 (truncateForLog — persistent from prior run)
  • Near-Duplicate Pairs Found: 1 (parseRateLimitResetHeader / parseRateLimitReset — new)
  • Exact Duplicate Implementations: 0
  • Detection Method: Naming pattern analysis, cross-package grep, function signature review
  • Analysis Date: 2026-04-15

References:

Generated by Semantic Function Refactoring · ● 685.3K ·

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions