docs(typescript): add MCP security documentation and examples#825
docs(typescript): add MCP security documentation and examples#825jackbatzner wants to merge 4 commits intomicrosoft:mainfrom
Conversation
|
Welcome to the Agent Governance Toolkit! Thanks for your first pull request. |
🤖 AI Agent: security-scanner — Security Review of Pull RequestSecurity Review of Pull RequestThis pull request primarily introduces documentation and examples for the TypeScript MCP (Managed Control Plane) security primitives in the Findings1. Prompt Injection Defense Bypass
2. Policy Engine Circumvention
3. Credential Exposure
4. Sandbox Escape
5. Deserialization Attacks
6. Supply Chain Risks
Summary of Findings
Overall AssessmentWhile this PR is primarily focused on documentation and examples, it introduces several critical and high-severity risks due to insecure example code and insufficient safeguards in the governance pipeline. These issues could lead to prompt injection attacks, policy circumvention, credential exposure, and sandbox escapes if the examples are used as-is. Recommended Actions
This PR should not be merged until the critical and high-severity issues are addressed. |
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces comprehensive documentation and examples for the TypeScript MCP (Managed Control Plane) security primitives, along with the addition of a new standalone package @microsoft/agentmesh-mcp-governance. The package includes several security-focused utilities such as credential redaction, rate limiting, HMAC signing, and policy enforcement.
The changes are well-structured and adhere to the project's style guidelines. However, there are some critical security concerns, potential breaking changes, and areas for improvement that need to be addressed.
🔴 CRITICAL: Security Issues
-
Regex Denial-of-Service (ReDoS) Risk in Credential Redactor
- The
CredentialRedactoruses several regex patterns (e.g.,SENSITIVE_KEY_PATTERN,BUILTIN_PATTERNS) to identify sensitive information. Some of these patterns (e.g.,SENSITIVE_KEY_PATTERN) are vulnerable to catastrophic backtracking, which could lead to a denial-of-service (DoS) attack if an attacker crafts malicious input. - Action: Use regex libraries like
safe-regexto validate patterns or refactor the regex to avoid catastrophic backtracking.
- The
-
Fail-Closed Behavior in Gateway
- While the
McpGatewayattempts to fail-closed on errors (e.g.,getSafeErrorMessage), there are scenarios where exceptions infinalizeorsafeRedactParamscould result in incomplete auditing or incorrect decisions. - Action: Add more robust error handling and logging to ensure that all failures are properly captured and do not affect the security posture.
- While the
-
Credential Redaction in Nested Objects
- The
CredentialRedactordoes not handle deeply nested objects efficiently, which could lead to partial redaction or missed sensitive data. - Action: Add a depth limit or recursion guard to prevent excessive processing and ensure complete redaction.
- The
-
Blocked Patterns in Gateway
- The
checkSanitizationmethod inMcpGatewayuses both string matching and regex patterns to identify blocked patterns. However, the use ofincludesfor string matching might miss edge cases where the blocked pattern is part of a larger string. - Action: Normalize and tokenize input data before applying string or regex matching to ensure comprehensive sanitization.
- The
🟡 WARNING: Potential Breaking Changes
-
Public API Instability
- The
@microsoft/agentmesh-mcp-governancepackage is marked as a "Public Preview," and the documentation explicitly states that APIs may change before GA. This could lead to breaking changes for early adopters. - Action: Clearly communicate API stability expectations and provide migration guides for future changes.
- The
-
Default Configuration Assumptions
- The
McpGatewayandCredentialRedactorrely on default configurations (e.g.,BUILTIN_PATTERNS,DEFAULT_REPLACEMENT). Changes to these defaults in future releases could break existing integrations. - Action: Document default configurations explicitly and consider versioning them to avoid breaking changes.
- The
💡 Suggestions for Improvement
-
Test Coverage
- While the
jest.config.jsfile indicates that tests are included, the diff does not show any test files. Ensure that all critical components, especiallyCredentialRedactorandMcpGateway, have comprehensive unit tests. - Action: Add tests for edge cases, such as deeply nested objects, malformed inputs, and timeouts in regex scanning.
- While the
-
Performance Optimization
- The
CredentialRedactorandMcpGatewayrely heavily on regex operations, which can be computationally expensive. For high-throughput scenarios, this could become a bottleneck. - Action: Benchmark the performance of regex operations and consider optimizing or replacing them with more efficient algorithms if necessary.
- The
-
Documentation Enhancements
- The documentation is comprehensive but could benefit from additional details on:
- How to implement durable storage for production deployments.
- Examples of custom
policyEvaluatorandapprovalHandlerimplementations. - Security considerations for each utility (e.g., potential risks of misconfiguration).
- Action: Expand the documentation to include these details.
- The documentation is comprehensive but could benefit from additional details on:
-
TypeScript Types
- The
CredentialRedactorandMcpGatewayuse several custom types (e.g.,CredentialRedaction,McpGatewayDecision). Ensure that these types are well-documented and cover all edge cases. - Action: Add JSDoc comments to all custom types and interfaces.
- The
-
Rate Limiter Configuration
- The
McpSlidingRateLimiteruses a sliding window algorithm, but the configuration options (e.g.,maxRequests,windowMs) are not validated. - Action: Add validation for rate limiter configurations to prevent misconfigurations.
- The
Final Assessment
- Security: 🔴 Address critical issues like ReDoS vulnerabilities and fail-closed behavior.
- Backward Compatibility: 🟡 Ensure clear communication about API stability and provide migration guides.
- Documentation: 💡 Expand documentation to cover advanced use cases and security considerations.
- Testing: 💡 Add comprehensive unit tests for all new functionality.
- Performance: 💡 Benchmark and optimize regex operations if necessary.
Once the critical issues are resolved and the suggested improvements are implemented, this PR will be ready for merging.
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Pull Request Review: docs(typescript): add MCP security documentation and examples
This pull request introduces comprehensive documentation for the TypeScript MCP (Managed Control Plane) security primitives, including API references, OWASP MCP Cheat Sheet compliance mapping, quickstart guides, and usage examples. It also includes the implementation of the standalone @microsoft/agentmesh-mcp-governance package.
Key Areas of Review
🔴 CRITICAL: Security Issues
-
Credential Redaction Logic
- The
CredentialRedactorclass uses regex patterns to identify and redact sensitive information. While the provided patterns are comprehensive, there is a risk of false negatives (missed sensitive data) due to incomplete or overly specific regex patterns.- Actionable Recommendation: Add a mechanism to log unredacted sensitive data for debugging purposes in a secure environment. This will help identify gaps in the regex patterns.
- Actionable Recommendation: Regularly review and update the
BUILTIN_PATTERNSto include new credential formats as they emerge.
- The
-
Regex Denial-of-Service (ReDoS)
- The use of unbounded regex patterns (e.g.,
BUILTIN_PATTERNSandSENSITIVE_KEY_PATTERN) poses a risk of ReDoS attacks if an attacker provides maliciously crafted input.- Actionable Recommendation: Use regex libraries that support timeout or backtracking limits (e.g.,
re2in JavaScript/TypeScript) to mitigate this risk.
- Actionable Recommendation: Use regex libraries that support timeout or backtracking limits (e.g.,
- The use of unbounded regex patterns (e.g.,
-
Fail-Closed Behavior
- The
McpGatewayclass implements fail-closed behavior in case of errors, which is a good security practice. However, the error messages returned to the user (e.g., "Internal error - access denied (fail-closed)") could potentially leak information about the system's internal state.- Actionable Recommendation: Ensure that error messages do not expose sensitive implementation details. Use generic error messages for external users and log detailed errors internally.
- The
-
Approval Handler
- The
approvalHandlerfunction in theMcpGatewayclass is responsible for determining whether a tool call is approved. However, the default behavior is to returnApprovalStatus.Pendingif no handler is provided.- Actionable Recommendation: Explicitly document that the absence of an
approvalHandlerwill result in all sensitive tools requiring manual approval. Consider adding a configuration option to enforce stricter defaults.
- Actionable Recommendation: Explicitly document that the absence of an
- The
-
Blocked Patterns
- The
checkSanitizationmethod inMcpGatewayuses both user-defined and built-in patterns to block dangerous input. However, the built-in patterns may not cover all potential attack vectors.- Actionable Recommendation: Regularly review and expand the
BUILTIN_DANGEROUS_PATTERNSto include new attack patterns. Consider allowing users to provide additional patterns via configuration.
- Actionable Recommendation: Regularly review and expand the
- The
🟡 WARNING: Potential Breaking Changes
-
Public API Instability
- The
@microsoft/agentmesh-mcp-governancepackage is marked as a "Public Preview" with a disclaimer that APIs may change before GA. This could lead to breaking changes for early adopters.- Actionable Recommendation: Clearly communicate the expected timeline for API stabilization and provide migration guides for any breaking changes.
- The
-
Default Behavior Changes
- The default behavior of the
McpGatewayclass (e.g., requiring manual approval for sensitive tools) may not align with all use cases.- Actionable Recommendation: Provide clear documentation and examples to help users configure the gateway according to their specific needs.
- The default behavior of the
💡 Suggestions for Improvement
-
Documentation Enhancements
- The documentation is comprehensive but could benefit from additional examples demonstrating:
- How to implement a custom
approvalHandler. - How to extend the
BUILTIN_PATTERNSwith custom patterns. - Best practices for configuring the
McpGatewayin production environments.
- How to implement a custom
- The documentation is comprehensive but could benefit from additional examples demonstrating:
-
Test Coverage
- While the PR includes a Jest configuration file, there is no evidence of test cases for the newly introduced classes (
CredentialRedactor,McpGateway, etc.).- Actionable Recommendation: Add comprehensive test cases to validate the behavior of the
CredentialRedactorandMcpGatewayclasses, especially edge cases like:- Inputs with no sensitive data.
- Inputs with overlapping sensitive patterns.
- Malicious inputs designed to exploit ReDoS vulnerabilities.
- Actionable Recommendation: Add comprehensive test cases to validate the behavior of the
- While the PR includes a Jest configuration file, there is no evidence of test cases for the newly introduced classes (
-
Concurrency and Thread Safety
- The
McpGatewayclass uses an in-memory rate limiter (McpSlidingRateLimiter) by default. This may not be thread-safe in multi-threaded or distributed environments.- Actionable Recommendation: Clearly document the limitations of the in-memory rate limiter and recommend using a distributed rate-limiting solution (e.g., Redis) for production deployments.
- The
-
Type Safety
- The use of
unknownin theredactNodemethod ofCredentialRedactorcould lead to runtime errors if the input does not conform to the expected structure.- Actionable Recommendation: Use stricter TypeScript types and runtime validation (e.g.,
zodorio-ts) to ensure type safety.
- Actionable Recommendation: Use stricter TypeScript types and runtime validation (e.g.,
- The use of
-
Performance Considerations
- The
createRegexScanBudgetfunction is used to prevent excessive regex execution time. However, the defaultscanTimeoutMsis set to 100ms, which may be too low for large inputs.- Actionable Recommendation: Provide guidance on how to tune
scanTimeoutMsbased on the expected input size and system performance.
- Actionable Recommendation: Provide guidance on how to tune
- The
-
Internationalization (i18n)
- While the PR mentions i18n translations, there is no evidence of actual i18n support in the provided code or documentation.
- Actionable Recommendation: Clarify whether i18n support is planned for a future release or if it is already implemented. If the latter, include examples in the documentation.
- While the PR mentions i18n translations, there is no evidence of actual i18n support in the provided code or documentation.
Summary of Feedback
- 🔴 CRITICAL: Address potential security issues, including ReDoS vulnerabilities and gaps in credential redaction patterns.
- 🟡 WARNING: Highlight potential breaking changes due to API instability and default behavior.
- 💡 SUGGESTION: Improve documentation, add test coverage, ensure thread safety, enhance type safety, and clarify i18n support.
By addressing these issues and suggestions, the @microsoft/agentmesh-mcp-governance package can be made more robust, secure, and user-friendly. Let me know if you need further clarification or assistance!
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Code Review Feedback for Pull Request
General Observations
This pull request adds significant documentation and examples for the TypeScript MCP security primitives, which is a valuable addition to the project. However, there are several areas that require attention, particularly concerning security, correctness, and potential breaking changes.
🔴 CRITICAL Issues
-
Credential Handling in
CredentialRedactor:- The
CredentialRedactorclass is responsible for redacting sensitive information. Ensure that the regex patterns used for redaction are comprehensive and do not allow for bypassing. For example, patterns like/(password|passwd|pwd|secret|token|api[_-]?key|connection.?string|accountkey|sharedaccesssignature|sas)/ishould be reviewed to ensure they cover all potential sensitive keys. Consider adding unit tests that cover edge cases to validate the effectiveness of these patterns.
- The
-
Error Handling in
McpGateway:- The
evaluateToolCallmethod catches errors but returns a generic message. This could potentially leak sensitive information if the error message is not sanitized properly. Ensure that all error messages returned to the user do not expose internal state or sensitive data.
- The
-
Replay Protection in
MCPMessageSigner:- The
MCPMessageSignerclass should implement robust replay protection mechanisms. Ensure that timestamps and nonces are validated correctly to prevent replay attacks. Review the implementation to confirm that it adheres to best practices for cryptographic operations.
- The
🟡 WARNING Issues
-
Public Preview Notice:
- The documentation states that the MCP governance package is in public preview and that APIs may change before GA. This could lead to breaking changes for users who adopt this package early. Ensure that this is communicated clearly in the documentation and consider versioning strategies to manage breaking changes effectively.
-
Potential API Changes:
- The addition of new classes and methods (e.g.,
MCPGateway,CredentialRedactor) may introduce breaking changes if existing consumers rely on the previous structure. Ensure that backward compatibility is maintained or document any changes clearly in the release notes.
- The addition of new classes and methods (e.g.,
💡 SUGGESTION Improvements
-
Documentation Enhancements:
- While the documentation is comprehensive, consider adding more examples that cover common use cases and edge cases. This will help users understand how to implement the security primitives effectively.
-
Unit Tests:
- Ensure that there are adequate unit tests for all new classes and methods introduced in this PR. Focus on testing the security aspects, such as the effectiveness of the
CredentialRedactorand the decision-making process inMcpGateway.
- Ensure that there are adequate unit tests for all new classes and methods introduced in this PR. Focus on testing the security aspects, such as the effectiveness of the
-
Type Safety:
- Review the use of TypeScript types throughout the new code. Ensure that all public APIs are well-typed and that any potential type mismatches are handled gracefully. This will improve the overall robustness of the library.
-
Performance Considerations:
- The regex patterns used for redaction and sanitization could impact performance, especially with large inputs. Consider benchmarking these operations and optimizing where necessary.
-
Concurrency Handling:
- If the
McpGatewayor any other classes are expected to be used in a concurrent environment, ensure that they are thread-safe. Review shared state and mutable data structures to prevent race conditions.
- If the
Conclusion
This pull request is a significant step forward in enhancing the documentation and usability of the MCP security primitives. However, addressing the critical security issues and potential breaking changes is paramount before merging. Implementing the suggested improvements will further strengthen the library's reliability and user experience.
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces documentation and examples for the TypeScript MCP (Multi-Channel Policy) security primitives, alongside the implementation of the @microsoft/agentmesh-mcp-governance package. The package provides standalone MCP governance primitives, including credential redaction, rate limiting, message signing, and gateway enforcement. The PR also includes configuration files for testing, linting, and building the package.
The implementation appears well-structured and adheres to modern TypeScript practices. However, there are some areas of concern, particularly around security, type safety, and potential backward compatibility issues.
🔴 CRITICAL: Security Issues
-
Regex Denial of Service (ReDoS) Risk in Credential Redactor
- The
CredentialRedactorclass uses multiple regex patterns, some of which are complex (e.g.,pem_blockandconnection_string). These patterns could potentially lead to catastrophic backtracking, especially when processing large inputs. - Actionable Recommendation: Use a library like
safe-regexto validate the safety of these regex patterns. Alternatively, consider optimizing the patterns or using a streaming approach to process large inputs incrementally.
- The
-
Fail-Closed Behavior in Gateway
- The
evaluateToolCallmethod inMcpGatewayattempts to fail closed on errors, but the implementation offinalizecould still allow partial failures (e.g., audit sink or metrics recording errors) to affect the decision-making process. - Actionable Recommendation: Ensure that all failure scenarios in
finalizeexplicitly return a fail-closed decision, even if audit or metrics recording fails.
- The
-
Insufficient Logging for Security Events
- The
McpGatewayclass lacks detailed logging for critical security events, such as when a tool call is denied due to policy violations or rate limiting. - Actionable Recommendation: Add structured logging for all critical security decisions, including the reason for denial and the agent/tool involved. Ensure logs are sanitized to avoid leaking sensitive information.
- The
-
Insecure Default Configurations
- The
McpGatewayclass uses aNoopMcpMetricsinstance as the default for metrics. This could lead to a lack of visibility into security events if the user does not explicitly configure a metrics implementation. - Actionable Recommendation: Either require a metrics implementation to be provided or log a warning when the default
NoopMcpMetricsis used.
- The
-
Hardcoded Sensitive Key Patterns
- The
CredentialRedactorclass uses a hardcoded regex (SENSITIVE_KEY_PATTERN) to identify sensitive keys. This approach may miss custom or less common key patterns. - Actionable Recommendation: Allow users to configure additional sensitive key patterns via the
CredentialRedactorConfig.
- The
🟡 WARNING: Potential Breaking Changes
-
API Stability
- The package is marked as "Public Preview," and the documentation explicitly states that APIs may change before GA. This could lead to breaking changes for early adopters.
- Actionable Recommendation: Clearly document any planned changes to the API and provide migration guides for users.
-
Default Behavior Changes
- The
McpGatewayclass has several default behaviors (e.g.,NoopMcpMetrics, emptyallowedTools, etc.) that may change in the future. This could lead to unexpected behavior for users relying on the current defaults. - Actionable Recommendation: Document these defaults clearly and consider adding deprecation warnings if they are likely to change.
- The
💡 Suggestions for Improvement
-
Type Safety
- The
safeRedactParamsmethod inMcpGatewayuses atry-catchblock to handle errors during redaction, but it returns an empty object in case of failure. This could lead to silent failures. - Suggestion: Log the error and provide a more descriptive fallback value or rethrow the error with additional context.
- The
-
Test Coverage
- While the PR includes a Jest configuration, there is no evidence of test cases for critical components like
CredentialRedactorandMcpGateway. - Suggestion: Add comprehensive test cases to cover edge cases, especially for security-critical features like regex patterns, rate limiting, and policy enforcement.
- While the PR includes a Jest configuration, there is no evidence of test cases for critical components like
-
Documentation
- The documentation is thorough but could benefit from additional examples demonstrating how to implement durable storage for session, nonce, rate-limit, and audit data.
- Suggestion: Include a section in the README with links to example implementations or third-party libraries that can be used for durable storage.
-
Rate Limiter Configuration
- The
McpGatewayclass allows for a custom rate limiter, but there is no validation to ensure that the provided rate limiter adheres to the expected interface. - Suggestion: Use TypeScript's type system to enforce the expected interface for custom rate limiters.
- The
-
Regex Timeout Handling
- The
createRegexScanBudgetutility is used to enforce timeouts for regex scans, but the timeout behavior is not configurable at the individual pattern level. - Suggestion: Allow users to specify custom timeouts for individual regex patterns, especially for complex patterns.
- The
Final Notes
The PR introduces a valuable addition to the agent-mesh ecosystem, focusing on security and governance. However, the identified security issues and areas for improvement should be addressed before merging to ensure robustness and reliability.
- Priority Fixes: Address the critical security issues (ReDoS risk, fail-closed behavior, and logging).
- Next Steps: Enhance test coverage and documentation, and consider the suggestions for improving type safety and configurability.
Let me know if you need further clarification or assistance!
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces documentation and examples for the TypeScript MCP security primitives, as well as the implementation of the @microsoft/agentmesh-mcp-governance package. The documentation is clear and comprehensive, and the implementation appears to be well-structured. However, there are several areas that require attention to ensure security, correctness, and maintainability.
🔴 CRITICAL: Security Issues
-
Regex Denial of Service (ReDoS) Vulnerability
-
The
BUILTIN_PATTERNSarray incredential-redactor.tscontains regular expressions that are not optimized for performance. For example:{ name: 'pem_block', pattern: /-----BEGIN [A-Z0-9 ]+-----[\s\S]*?-----END [A-Z0-9 ]+-----/g }
The use of
[\s\S]*?is prone to catastrophic backtracking, which can lead to ReDoS attacks. This needs to be optimized to avoid performance issues.Recommendation: Use more specific patterns or limit the quantifiers to avoid excessive backtracking. For example, use
.{0,1000}instead of[\s\S]*?if you know the maximum expected length.
-
-
Insufficient Logging for Security Events
-
The
debugSecurityFailurefunction is used to log security failures, but it is unclear if this function logs sufficient details for auditing purposes. For example, it is not clear if the function logs the full stack trace or other contextual information. -
Additionally, the
finalizemethod logs decisions but does not include the full context of the decision-making process, such as the exact parameters or patterns that triggered a denial.Recommendation: Ensure that all security-related events are logged with sufficient detail to facilitate auditing and debugging. Include stack traces, agent IDs, tool names, and matched patterns where applicable.
-
-
Fail-Closed Behavior
-
While the
evaluateToolCallmethod attempts to fail closed in case of errors, thefinalizemethod also has a fallback to fail closed. However, the fallback logic infinalizedoes not include all findings or contextual information.Recommendation: Ensure that the fallback logic in
finalizeincludes all available information, such as findings and the original parameters, to avoid losing critical context during error handling.
-
-
Potential Credential Exposure
-
The
safeRedactParamsmethod is used to redact sensitive information from parameters, but it is not clear if it is applied consistently across all logging and auditing operations.Recommendation: Conduct a thorough review to ensure that
safeRedactParamsis applied to all logging and auditing operations. Consider adding automated tests to verify this behavior.
-
🟡 WARNING: Potential Breaking Changes
- Public Preview APIs
-
The
@microsoft/agentmesh-mcp-governancepackage is marked as a "Public Preview" with the note that APIs may change before GA. This could lead to breaking changes for users who adopt the package early.Recommendation: Clearly document the potential for breaking changes in the README and consider providing a migration guide for future updates.
-
💡 Suggestions for Improvement
-
TypeScript Type Safety
-
The
evaluateToolCallmethod usesRecord<string, unknown>forparams. While this is flexible, it sacrifices type safety.Recommendation: Use a generic type parameter for
paramsto allow users to specify the expected structure of the parameters. For example:async evaluateToolCall<T extends Record<string, unknown>>( agentId: string, toolName: string, params: T = {} as T, ): Promise<McpGatewayDecision> {
-
-
Regex Timeout Configuration
-
The
scanTimeoutMsconfiguration is used to create a regex scan budget, but it is not clear how this value is determined or if it is sufficient for all use cases.Recommendation: Provide guidance in the documentation on how to configure
scanTimeoutMsbased on the expected size and complexity of the input.
-
-
Unit Test Coverage
-
While the PR includes a Jest configuration, there is no evidence of unit tests for the newly introduced functionality.
Recommendation: Add unit tests for all critical components, especially
CredentialRedactorandMcpGateway. Focus on edge cases, such as large inputs, invalid patterns, and error scenarios.
-
-
Documentation Enhancements
- The documentation is comprehensive but could benefit from additional details on the following:
- Examples of custom implementations for
auditSinkandrateLimiter. - A detailed explanation of the
policyEvaluatorandapprovalHandlerinterfaces, including example implementations. - A section on best practices for securing MCP deployments.
- Examples of custom implementations for
- The documentation is comprehensive but could benefit from additional details on the following:
-
Concurrency and Thread Safety
-
The
McpSlidingRateLimiterandCredentialRedactorclasses use in-memory storage, which may not be thread-safe in multi-threaded environments.Recommendation: Document the thread-safety limitations of in-memory implementations and provide guidance on using durable, thread-safe alternatives in production.
-
-
Backward Compatibility
-
The
@microsoft/agentmesh-mcp-governancepackage introduces new APIs, but it is not clear if these APIs are consistent with existing MCP security primitives.Recommendation: Ensure that the new APIs are consistent with existing MCP security primitives to minimize confusion for developers.
-
Conclusion
This PR introduces valuable documentation and functionality for MCP security primitives, but there are critical security issues that need to be addressed before merging. Additionally, there are opportunities to improve type safety, documentation, and test coverage. Please address the critical issues and consider the suggestions to enhance the quality and security of the implementation.
packages/agent-mesh/packages/mcp-governance/tests/credential-redactor.test.ts
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Pull Request Review: docs(typescript): add MCP security documentation and examples
🔴 CRITICAL: Security Issues
-
Regex Denial of Service (ReDoS) Vulnerability
- The
CredentialRedactorandMcpGatewayclasses use user-defined and built-in regular expressions for scanning and redacting sensitive data. While thevalidateRegexfunction is invoked, it is unclear if it adequately protects against catastrophic backtracking in regex patterns. - Actionable Fix: Enhance
validateRegexto detect and reject potentially catastrophic regex patterns. Consider using libraries likesafe-regexorre2for safer regex evaluation.
- The
-
Fail-Closed Behavior in
McpGateway- The
evaluateToolCallmethod has a fail-closed mechanism for internal errors, which is good. However, thefinalizemethod also contains error handling that could potentially overwrite the fail-closed behavior if it fails. - Actionable Fix: Ensure that any failure in
finalizedoes not compromise the fail-closed behavior ofevaluateToolCall. Add explicit tests for these edge cases.
- The
-
Sensitive Key Redaction
- The
CredentialRedactorclass uses a regex (SENSITIVE_KEY_PATTERN) to identify sensitive keys. However, this regex may not cover all possible variations of sensitive key names. - Actionable Fix: Expand the
SENSITIVE_KEY_PATTERNto include more variations of sensitive key names, or allow users to provide additional patterns via configuration.
- The
-
Blocked Patterns in
McpGateway- The
checkSanitizationmethod inMcpGatewayuses both user-defined and built-in patterns to block dangerous input. However, the built-in patterns (BUILTIN_DANGEROUS_PATTERNS) may not cover all potential attack vectors. - Actionable Fix: Regularly update
BUILTIN_DANGEROUS_PATTERNSto include new attack patterns. Consider integrating with a threat intelligence feed for dynamic updates.
- The
🟡 WARNING: Potential Breaking Changes
-
Public Preview API
- The
@microsoft/agentmesh-mcp-governancepackage is marked as a "Public Preview" release, with a warning that APIs may change before GA. - Actionable Fix: Clearly document any planned breaking changes in the README and provide migration guides when transitioning to GA.
- The
-
Rate Limiter Behavior
- The
McpSlidingRateLimiterenforces rate limits but does not provide a mechanism for customizing the error response when limits are exceeded. - Actionable Fix: Allow users to configure custom error messages or actions when rate limits are hit.
- The
💡 Suggestions for Improvement
-
TypeScript Type Safety
- The
McpGatewayandCredentialRedactorclasses rely onRecord<string, unknown>for parameters. This weakens type safety. - Suggestion: Use stricter TypeScript types or generics to enforce type safety for parameters.
- The
-
Test Coverage
- While Jest is configured, there is no mention of test coverage for critical components like
CredentialRedactorandMcpGateway. - Suggestion: Add unit tests to cover edge cases, such as:
- Regex timeouts in
CredentialRedactor. - Fail-closed behavior in
McpGateway. - Handling of malformed or malicious input.
- Regex timeouts in
- While Jest is configured, there is no mention of test coverage for critical components like
-
Documentation
- The README provides a good overview but lacks detailed examples for advanced use cases, such as integrating with custom audit sinks or rate limiters.
- Suggestion: Add more advanced usage examples and a section on common pitfalls.
-
Performance Optimization
- The
redactNodemethod inCredentialRedactorrecursively processes objects and arrays, which could lead to performance issues with deeply nested structures. - Suggestion: Add a depth limit to prevent excessive recursion and document this behavior.
- The
-
Internationalization (i18n)
- The PR mentions i18n translations but does not include any i18n-specific files or configurations.
- Suggestion: Clarify the i18n strategy and include examples of localized error messages or documentation.
Summary of Recommendations
- 🔴 Address potential ReDoS vulnerabilities in regex patterns.
- 🔴 Ensure robust fail-closed behavior in
McpGateway. - 🔴 Expand sensitive key and blocked pattern detection.
- 🟡 Clearly document potential breaking changes for the public preview API.
- 💡 Improve type safety, test coverage, and documentation.
- 💡 Optimize recursive methods for performance and add depth limits.
- 💡 Clarify and implement i18n strategy.
This PR introduces important documentation and examples for the MCP security primitives, but the identified issues should be addressed to ensure the robustness and security of the implementation.
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Pull Request Review: docs(typescript): add MCP security documentation and examples
🔴 CRITICAL Issues
-
Credential Redaction: Insufficient Handling of Sensitive Keys
- The
CredentialRedactorclass uses a regular expression (SENSITIVE_KEY_PATTERN) to identify sensitive keys. However, this approach is prone to false negatives and may miss keys with unconventional naming conventions. - Recommendation: Consider implementing a more robust mechanism for identifying sensitive keys, such as a configurable allowlist/denylist or leveraging machine learning models trained on common patterns of sensitive data.
- The
-
Regex DoS Vulnerability
- The
CredentialRedactorandMcpGatewayclasses use regular expressions extensively, including user-configurable patterns. While there is avalidateRegexfunction, the implementation of this function is not included in the diff, making it unclear if it adequately protects against ReDoS (Regular Expression Denial of Service) attacks. - Recommendation: Ensure that
validateRegexincludes safeguards against catastrophic backtracking. Additionally, consider implementing a timeout mechanism for regex execution to prevent potential DoS attacks.
- The
-
Fail-Closed Behavior in Gateway
- The
evaluateToolCallmethod inMcpGatewayfails closed when an error occurs, which is a good security practice. However, the error message returned to the user (Internal error - access denied (fail-closed)) may inadvertently leak implementation details if thegetSafeErrorMessagefunction is not properly sanitizing error messages. - Recommendation: Review the implementation of
getSafeErrorMessageto ensure it does not expose sensitive internal details.
- The
-
Replay Protection in
MCPMessageSigner- The
MCPMessageSigneris mentioned as providing HMAC-SHA256 payload signing with timestamp and nonce replay protection. However, there is no evidence in the diff that nonce storage or validation is implemented. - Recommendation: Ensure that nonce replay protection is implemented with a durable storage mechanism to prevent replay attacks.
- The
🟡 WARNING: Potential Breaking Changes
-
Public API Stability
- The
@microsoft/agentmesh-mcp-governancepackage is marked as a "Public Preview" with a warning that APIs may change before GA. This could lead to breaking changes for users who adopt the package early. - Recommendation: Clearly document the expected timeline for API stabilization and provide migration guides for any breaking changes.
- The
-
Default Behavior of
CredentialRedactor- The
CredentialRedactorclass defaults to redacting sensitive keys (redactSensitiveKeys: true). This could lead to unexpected behavior for users who are unaware of this default setting. - Recommendation: Explicitly document this behavior in the
README.mdand consider requiring users to explicitly enable this feature to avoid surprises.
- The
💡 Suggestions for Improvement
-
Documentation Enhancements
- The
README.mdprovides a good overview of the package, but it could benefit from additional details:- Add a section on how to configure and use custom patterns in the
CredentialRedactor. - Provide examples of durable storage implementations for session, nonce, rate-limit, and audit storage.
- Include a security considerations section to highlight best practices and potential pitfalls.
- Add a section on how to configure and use custom patterns in the
- The
-
Testing Coverage
- The
jest.config.jsfile specifies test coverage collection, but there is no evidence of tests for critical components likeCredentialRedactorandMcpGateway. - Recommendation: Add comprehensive unit and integration tests for these components, focusing on edge cases and security scenarios (e.g., ReDoS, replay attacks, and fail-closed behavior).
- The
-
Thread Safety
- The
McpGatewayclass uses shared resources likerateLimiterandauditSink. There is no indication of thread safety mechanisms (e.g., locks or atomic operations) to prevent race conditions in concurrent environments. - Recommendation: Review the implementation of shared resources to ensure thread safety, especially in high-concurrency scenarios.
- The
-
Type Safety
- The use of TypeScript provides a strong foundation for type safety. However, the
evaluateToolCallmethod inMcpGatewayacceptsparamsasRecord<string, unknown>, which could lead to runtime errors if the structure ofparamsis not validated. - Recommendation: Use a schema validation library (e.g.,
zodorajv) to validate the structure ofparamsat runtime.
- The use of TypeScript provides a strong foundation for type safety. However, the
-
OWASP MCP Cheat Sheet Compliance
- The
README.mdmentions OWASP MCP Cheat Sheet compliance but does not provide a detailed mapping of how the package addresses specific recommendations. - Recommendation: Add a dedicated section in the documentation that maps package features to OWASP MCP Cheat Sheet recommendations.
- The
Summary
This PR introduces a new TypeScript package for MCP governance primitives, along with comprehensive documentation. While the implementation demonstrates strong security principles (e.g., fail-closed behavior), there are critical issues related to regex safety, sensitive data handling, and replay protection that need to be addressed. Additionally, the documentation and testing coverage could be improved to enhance usability and reliability.
Action Items:
- Address the critical issues related to regex safety, sensitive data handling, and replay protection.
- Improve documentation with additional examples, security considerations, and OWASP compliance mapping.
- Add comprehensive tests for critical components.
- Review and ensure thread safety for shared resources.
- Validate input parameters using a schema validation library.
Once these issues are resolved, the package will be better positioned for a stable release.
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Pull Request Review for microsoft/agent-governance-toolkit
Focus Areas
- Policy Engine Correctness
- Trust/Identity: Cryptographic Operations, Credential Handling
- Sandbox Escape Vectors
- Thread Safety in Concurrent Agent Execution
- OWASP Agentic Top 10 Compliance
- Type Safety and Pydantic Model Validation
- Backward Compatibility
Feedback
🔴 CRITICAL: Security Issues
-
Credential Redaction Timeout Handling
- The
CredentialRedactorclass uses ascanTimeoutMsparameter to limit regex scanning time. However, if the timeout is exceeded, thecreateRegexScanBudgetfunction appears to block content entirely (Regex scan exceeded time budget - content blocked). This fail-closed behavior is good, but:- Issue: The error message does not propagate to the caller, making debugging difficult.
- Impact: This could lead to silent failures in production where redaction fails without clear visibility.
- Action: Ensure that timeout-related errors are logged or included in the
CredentialRedactionResultfor better observability.
- The
-
Regex Denial-of-Service (ReDoS)
- The
CredentialRedactorandMcpGatewayclasses rely heavily on regex patterns for scanning sensitive data and blocked patterns. WhilevalidateRegexis used to validate patterns, there is no explicit check for catastrophic backtracking vulnerabilities in user-supplied regexes.- Impact: A maliciously crafted regex could cause a denial-of-service (ReDoS) attack.
- Action: Use a library like
safe-regexto validate regex patterns for ReDoS vulnerabilities before execution.
- The
-
HMAC-SHA256 Secret Management
- The
MCPMessageSignerusesprocess.env.MCP_SIGNING_SECRETfor HMAC-SHA256 signing. However, there is no validation to ensure the secret meets minimum entropy requirements.- Impact: Weak secrets could compromise the integrity of signed messages.
- Action: Add a validation step to ensure the secret is sufficiently long (e.g., at least 32 characters) and random.
- The
-
Approval Handler Default Behavior
- In the
McpGatewayclass, if noapprovalHandleris provided, the default behavior is to set the approval status toPending. This could lead to unintended denial of service if the approval process is not properly configured.- Impact: Agents may be blocked unnecessarily due to misconfiguration.
- Action: Log a warning if
approvalHandleris not configured, and provide a clear error message in the decision response.
- In the
🟡 WARNING: Potential Breaking Changes
-
Public API Stability
- The
@microsoft/agentmesh-mcp-governancepackage is marked as a "Public Preview" release, with a disclaimer that APIs may change before GA. However, no versioning strategy for breaking changes is outlined.- Impact: Consumers of the package may face unexpected breaking changes.
- Action: Clearly document a versioning strategy (e.g., semantic versioning) and provide deprecation warnings for any planned breaking changes.
- The
-
Default Configuration Behavior
- The
McpGatewayclass has several default configurations (e.g.,allowedTools,deniedTools,rateLimiter). If these defaults are changed in future releases, it could break existing integrations.- Impact: Consumers relying on default behavior may experience unexpected failures.
- Action: Document all default configurations explicitly in the README and provide migration guides for future changes.
- The
💡 Suggestions for Improvement
-
Enhanced Logging
- The
debugSecurityFailurefunction is used for logging errors in theMcpGatewayclass. However, it is unclear if this function integrates with a centralized logging system.- Suggestion: Ensure that all security-related logs are structured and compatible with common logging frameworks (e.g., Azure Monitor, ELK stack).
- The
-
Rate Limiter Customization
- The
McpSlidingRateLimiterclass enforces a fixed sliding window rate limit. However, there is no support for dynamic rate limits based on agent behavior or priority.- Suggestion: Consider adding support for dynamic rate limits, e.g., based on agent roles or historical usage patterns.
- The
-
Credential Redaction Preview
- The
truncatePreviewfunction is used to generate previews of redacted content. However, the truncation length is not configurable.- Suggestion: Allow the truncation length to be configurable via the
CredentialRedactorConfig.
- Suggestion: Allow the truncation length to be configurable via the
- The
-
Documentation Enhancements
- The README for the
@microsoft/agentmesh-mcp-governancepackage provides a good overview but lacks detailed examples for advanced use cases (e.g., custompolicyEvaluatororapprovalHandlerimplementations).- Suggestion: Add more detailed examples for advanced configurations and real-world scenarios.
- The README for the
-
TypeScript Type Safety
- The TypeScript types in the
McpGatewayandCredentialRedactorclasses are well-defined. However, some types (e.g.,McpGatewayDecision) use optional properties (approvalStatus,rateLimit) that could lead to runtime errors if not handled properly.- Suggestion: Use stricter type definitions or provide utility functions to safely access optional properties.
- The TypeScript types in the
-
Internationalization (i18n)
- The documentation mentions i18n translations but does not provide any examples or details.
- Suggestion: Include examples of how to use i18n features, especially for error messages and logs.
- The documentation mentions i18n translations but does not provide any examples or details.
-
Test Coverage
- The
jest.config.jsfile specifiescollectCoverageFrom: ['src/**/*.ts'], but there is no mention of the current test coverage percentage.- Suggestion: Add a CI step to enforce a minimum test coverage threshold (e.g., 80%) and fail the build if coverage drops below this threshold.
- The
Summary
-
Strengths:
- The
McpGatewayandCredentialRedactorclasses demonstrate a strong focus on security and fail-closed behavior. - The use of TypeScript ensures type safety and reduces the likelihood of runtime errors.
- The documentation is comprehensive and provides a good starting point for users.
- The
-
Areas for Improvement:
- Address potential security vulnerabilities (e.g., ReDoS, weak secrets).
- Improve logging and observability for debugging and monitoring.
- Provide more detailed examples and documentation for advanced use cases.
-
Action Items:
- 🔴 Address critical security issues (e.g., ReDoS, secret validation).
- 🟡 Document a versioning strategy and default configuration behavior to avoid breaking changes.
- 💡 Enhance logging, rate limiter customization, and test coverage enforcement.
Let me know if you need further clarification or assistance!
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Pull Request Review for microsoft/agent-governance-toolkit
Focus Areas Review:
Policy Engine Correctness
-
🔴 CRITICAL: The
McpGatewayclass usesArray.prototype.includesfor checkingdeniedToolsandallowedTools. This approach is case-sensitive. If tool names are provided in mixed cases, this could lead to false negatives, allowing unauthorized tools to bypass the policy. Consider normalizing tool names (e.g., converting to lowercase) before performing checks. -
💡 SUGGESTION: The
evaluateToolCallmethod does not explicitly handle cases whereallowedToolsanddeniedToolsoverlap. While the current implementation prioritizesdeniedTools, it would be clearer to explicitly document or enforce this behavior. -
💡 SUGGESTION: The
blockedPatternsconfiguration inMcpGatewaydoes not combine user-defined patterns with theBUILTIN_DANGEROUS_PATTERNS. Consider merging these lists to ensure both built-in and custom patterns are checked.
Trust/Identity: Cryptographic Operations, Credential Handling, SPIFFE/SVID
-
🔴 CRITICAL: The
MCPMessageSignerclass uses HMAC-SHA256 for signing payloads. However, thesecretused for signing is fetched fromprocess.env.MCP_SIGNING_SECRETwithout validation. Ensure the secret meets minimum entropy requirements (e.g., 256 bits) and is validated during initialization. -
💡 SUGGESTION: The
CredentialRedactorclass uses a hardcoded list of patterns for sensitive data detection. While this is a good starting point, it might miss other sensitive patterns. Consider allowing users to dynamically update or extend theBUILTIN_PATTERNSat runtime.
Sandbox Escape Vectors
-
🔴 CRITICAL: The
McpGatewayclass does not sanitize or validate theparamsobject before processing. If this object is user-controlled, it could potentially contain malicious payloads. Ensure deep validation and sanitization ofparamsbefore use. -
💡 SUGGESTION: The
blockedPatternsfeature inMcpGatewayis a good safeguard, but it relies on regular expressions. Ensure that these patterns are thoroughly tested to avoid bypasses or performance issues (e.g., ReDoS attacks).
Thread Safety in Concurrent Agent Execution
-
💡 SUGGESTION: The
CredentialRedactorclass uses aWeakMapto track seen objects during recursive redaction. While this prevents infinite loops, it may not be thread-safe in multi-threaded environments. Consider documenting this limitation or implementing thread-safe alternatives. -
💡 SUGGESTION: The
McpSlidingRateLimiterclass appears to handle rate-limiting in memory. For distributed systems, ensure that rate-limiting is implemented using a shared, thread-safe store (e.g., Redis) to avoid inconsistencies.
OWASP Agentic Top 10 Compliance
-
🔴 CRITICAL: The
McpGatewayclass does not appear to enforce strong input validation forparams. This could lead to injection attacks (e.g., command injection, SQL injection). Implement robust input validation mechanisms for all user-controlled inputs. -
🔴 CRITICAL: The
McpGatewayclass does not enforce a maximum size for theparamsobject. This could lead to denial-of-service (DoS) attacks if excessively large payloads are submitted. Add a size limit forparams. -
💡 SUGGESTION: The
CredentialRedactorclass uses regular expressions for sensitive data detection. Ensure these patterns are tested against adversarial inputs to prevent bypasses or performance degradation (e.g., ReDoS).
Type Safety and Pydantic Model Validation
- 💡 SUGGESTION: While this PR is focused on TypeScript, consider adding runtime validation for critical inputs (e.g.,
paramsinMcpGateway) using libraries likezodorio-ts. This would complement TypeScript's compile-time checks.
Backward Compatibility (Public API Changes)
- 🟡 WARNING: The
@microsoft/agentmesh-mcp-governancepackage is marked as a "Public Preview" release. Ensure that any breaking changes are clearly documented and communicated to users before the General Availability (GA) release.
Additional Observations
-
💡 SUGGESTION: The
CredentialRedactorclass uses a default replacement string of[REDACTED]. Consider allowing users to configure this string globally for consistency across the application. -
💡 SUGGESTION: The
McpGatewayclass uses a fail-closed approach for errors, which is a good practice. However, consider adding more granular error logging to help developers debug issues during policy evaluation. -
💡 SUGGESTION: The
jest.config.jsfile specifies acoverageDirectorybut does not enforce a minimum coverage threshold. Consider adding acoverageThresholdconfiguration to ensure critical paths are adequately tested.
Summary of Recommendations
- 🔴 Address critical issues related to policy engine correctness, cryptographic operations, and sandbox escape vectors.
- 🟡 Document and communicate potential breaking changes in the public API.
- 💡 Implement suggested improvements for input validation, thread safety, and test coverage.
Let me know if you need further clarification or assistance!
| app.post('/call-tool', async (request, response) => { | ||
| const agentId = readString(request.body, 'agentId') ?? 'demo-agent'; | ||
| const toolName = readString(request.body, 'toolName') ?? ''; | ||
| const args = asRecord(request.body?.args); | ||
| const sessionToken = request.header('x-session-token'); | ||
| const handler = toolHandlers[toolName]; | ||
|
|
||
| if (!sessionToken) { | ||
| response.status(401).json({ error: 'Missing x-session-token. Call GET /health for a demo token.' }); | ||
| return; | ||
| } | ||
| if (!handler) { | ||
| response.status(404).json({ error: `Unknown tool '${toolName}'` }); | ||
| return; | ||
| } | ||
|
|
||
| const session = await sessionAuthenticator.verifyToken(sessionToken, agentId); | ||
| if (!session.valid) { | ||
| response.status(401).json({ error: session.reason }); | ||
| return; | ||
| } | ||
|
|
||
| const signedCall = messageSigner.sign({ agentId, toolName, args }); | ||
| const signature = await messageSigner.verify(signedCall); | ||
| if (!signature.valid) { | ||
| response.status(401).json({ error: signature.reason }); | ||
| return; | ||
| } | ||
|
|
||
| const requestThreats = securityScanner.scanTool( | ||
| toolName, | ||
| `${handler.description}\nRequest payload: ${JSON.stringify(args)}`, | ||
| handler.inputSchema, | ||
| 'mcp-express-server', | ||
| ); | ||
| if (requestThreats.some((threat) => threat.severity === 'critical')) { | ||
| response.status(400).json({ error: 'Security scanner rejected the request', threats: requestThreats }); | ||
| return; | ||
| } | ||
|
|
||
| const decision = await gateway.evaluateToolCall(agentId, toolName, args); | ||
| if (!decision.allowed) { | ||
| response.status(403).json({ | ||
| allowed: false, | ||
| reason: decision.reason, | ||
| findings: decision.findings, | ||
| auditParams: decision.auditParams, | ||
| }); | ||
| return; | ||
| } | ||
|
|
||
| const rawResult = await handler.run(args); | ||
| const scannedResult = responseScanner.scan(rawResult); | ||
| const safeResult = { | ||
| safe: scannedResult.safe, | ||
| blocked: scannedResult.blocked, | ||
| findings: scannedResult.findings, | ||
| sanitized: scannedResult.sanitized, | ||
| }; | ||
| const logEntry = redactor.redact({ agentId, toolName, args, result: safeResult.sanitized }).redacted; | ||
|
|
||
| if (process.env.NODE_ENV !== 'test') { | ||
| console.info('[mcp-express-server]', JSON.stringify(logEntry)); | ||
| } | ||
|
|
||
| response.status(scannedResult.blocked ? 422 : 200).json({ | ||
| allowed: true, | ||
| reason: decision.reason, | ||
| messageVerification: signature, | ||
| response: safeResult, | ||
| auditEntries: auditSink.getEntries().length, | ||
| }); | ||
| }); |
21db11d to
4f4dfb9
Compare
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary for Pull Request: Add MCP Security Documentation and Examples
This pull request adds extensive documentation and examples for the TypeScript MCP security primitives in the AgentMesh SDK. The changes are primarily focused on enhancing the documentation, providing usage examples, and ensuring compliance with OWASP standards.
Actionable Feedback
🔴 CRITICAL Issues
-
Secret Handling: The usage of
process.env.MCP_SESSION_SECRETandprocess.env.MCP_SIGNING_SECRETwithout proper validation or fallback mechanisms can lead to security vulnerabilities. Ensure that these secrets are securely managed and validated before use. Consider implementing a more robust secret management strategy, such as using a dedicated secrets management service. -
In-Memory Stores: The note regarding the in-memory nonce and session stores being intended for single-process development or tests is concerning. In production environments, this can lead to session fixation attacks or replay attacks if not handled properly. Ensure that durable shared storage is implemented for nonce and session management in multi-replica deployments.
🟡 WARNING Issues
-
Breaking Changes in API: The introduction of new classes and methods in the MCP governance package may lead to breaking changes if existing consumers of the SDK are not updated accordingly. Ensure that any breaking changes are clearly documented and consider versioning strategies to maintain backward compatibility.
-
Dependency Changes: The change from
@agentmesh/sdkto@microsoft/agentmesh-sdkin the examples may impact existing users who have not updated their imports. Clearly communicate these changes in the documentation and consider providing migration guides.
💡 SUGGESTION Improvements
-
Documentation Clarity: While the documentation is comprehensive, consider adding more context around the security implications of each MCP primitive. For example, explain how
MCPMessageSignerprotects against replay attacks and whyMCPSecurityScanneris essential for tool call safety. -
Example Coverage: The examples provided are useful, but consider adding more diverse scenarios that demonstrate edge cases or potential misuse of the primitives. This will help users understand the full capabilities and limitations of the MCP security features.
-
Testing Coverage: Ensure that the tests cover a wide range of scenarios, including failure cases and edge cases. Consider adding tests for invalid inputs and security boundary tests to ensure robustness.
-
Type Safety and Pydantic Validation: If not already implemented, ensure that all input schemas are validated using Pydantic or similar libraries to enforce type safety and prevent invalid data from being processed.
-
Rate Limiting Configuration: The rate limiting configuration in
MCPSlidingRateLimiteris hardcoded. Consider allowing users to configure these parameters through environment variables or configuration files for better flexibility.
Conclusion
The documentation and examples added in this pull request significantly enhance the usability of the MCP security primitives. However, attention must be paid to critical security issues and potential breaking changes to ensure a smooth transition for existing users. Addressing the suggestions will further improve the robustness and clarity of the documentation.
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request adds extensive documentation and examples for the TypeScript MCP security primitives in the agent-mesh package. The changes include API reference documentation, OWASP compliance mapping, quickstart guides, and a new example Express.js server demonstrating the use of these primitives.
Actionable Feedback
🔴 CRITICAL Issues
- Secret Management: The
loadSecretfunction generates a random secret if the environment variable is not set or is less than 32 bytes. This could lead to security vulnerabilities if the application runs without a proper secret. Ensure that the application fails to start if the required secrets are not provided, rather than generating insecure defaults.
🟡 WARNING Issues
- Public API Changes: The introduction of new classes and methods in the MCP governance package may affect existing consumers of the API. Ensure that the changes are backward compatible or clearly document any breaking changes in the release notes.
💡 SUGGESTION Improvements
-
Documentation Clarity: While the documentation is comprehensive, consider adding more examples that illustrate common pitfalls or security considerations when using the MCP primitives. This could help users avoid misconfigurations that could lead to security vulnerabilities.
-
Error Handling: In the Express server example, consider implementing more robust error handling. For instance, if a tool call fails due to a security scan, it would be beneficial to provide more detailed feedback to the user about what went wrong.
-
Rate Limiting Configuration: The rate limiting configuration is hardcoded in the example server. Consider allowing these configurations to be passed as environment variables or configuration options to make the server more flexible and easier to deploy in different environments.
-
Testing Coverage: Ensure that the tests cover edge cases, especially around security features like session management, signing, and scanning. This will help ensure that the implementation is robust against potential attacks.
-
Concurrency Considerations: If the MCP primitives are expected to be used in a multi-threaded or multi-process environment, ensure that they are thread-safe. Document any concurrency considerations in the usage examples.
-
Performance Considerations: The security scanning and rate limiting features may introduce latency. Consider documenting the expected performance impact and any tuning parameters that can be adjusted based on deployment needs.
Conclusion
The PR is a significant contribution to the documentation and usability of the MCP security primitives. Addressing the critical issues and considering the suggestions will enhance the security and robustness of the library.
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request adds extensive documentation and examples for the TypeScript MCP security primitives in the agent-mesh package. The changes include API references, compliance mappings, quickstart guides, and an example Express server demonstrating the use of these primitives. Overall, the documentation appears thorough and well-structured, but there are some areas for improvement and potential concerns.
Actionable Feedback
🔴 CRITICAL Issues
- Secret Handling: The
loadSecretfunction generates a random secret if the environment variable is not set or is less than 32 bytes. This could lead to security issues if the application runs without a proper secret. Ensure that the application is always provided with a secure secret in production environments. Consider implementing a more robust secret management solution.
🟡 WARNING Issues
- Breaking Changes in API: The introduction of new classes and methods in the MCP security primitives could potentially break existing implementations if they rely on previous versions of the SDK. Ensure that the versioning follows semantic versioning principles and clearly document any breaking changes in the release notes.
💡 SUGGESTIONS
-
Documentation Clarity: While the documentation is comprehensive, consider adding more context around the security implications of each MCP primitive. For example, explain how the
MCPGatewaycan be configured for different security policies and what the implications are for tool calls. -
Example Coverage: The example Express server is a great addition, but consider adding more varied examples that cover edge cases, such as handling invalid input, unauthorized access attempts, and rate-limiting scenarios. This will help users understand how to handle real-world situations.
-
Testing Enhancements: The tests provided are a good start, but consider adding more unit tests that cover different scenarios for each tool handler. This will ensure that the implementation is robust and can handle various inputs and edge cases.
-
Rate Limiting Configuration: The rate limiting is currently hardcoded in the example server. Consider allowing users to configure these limits through environment variables or configuration files to make it easier to adapt to different deployment scenarios.
-
Concurrency Considerations: Ensure that the implementation of the MCP primitives is thread-safe, especially since they may be used in concurrent environments. Document any concurrency guarantees or limitations in the documentation.
-
OWASP Compliance Mapping: The OWASP MCP mapping section is a valuable addition. Consider expanding this section to include specific examples of how each primitive addresses common OWASP threats, which would enhance the educational value of the documentation.
Conclusion
The PR adds significant value to the agent-mesh package by enhancing its documentation and providing practical examples. Addressing the critical issue regarding secret management and considering the suggestions for improvement will further strengthen the security and usability of the library.
|
CI Status Summary ✅ All CI checks pass except:
TypeScript MCP docs + Express integration example. Slimmed to 8 files. |
|
Reviewed docs - content looks good. This PR is blocked on its dependency feature PR being merged first. Will merge this once the base PR lands. |
3cd553f to
6de0ee3
Compare
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Pull Request Review for docs(typescript): add MCP security documentation and examples
🔴 CRITICAL
-
Hardcoded Secrets in Example Code:
- The
loadSecretfunction inserver.tsgenerates a random secret if the environment variable is missing. While this is acceptable for development, it can lead to insecure practices in production if developers forget to set the environment variables. - Recommendation: Add a warning or error when the secret is missing in production mode (
NODE_ENV=production). For example:if (!secret && process.env.NODE_ENV === 'production') { throw new Error(`Missing required environment variable: ${envName}`); }
- The
-
Insecure Default Behavior:
- The
MCPGatewayallows theread_filetool to execute if theapprovedparameter is set totruein the request body. This could be exploited by an attacker who knows the API and can craft a request withapproved: true. - Recommendation: Avoid relying on user-provided input for sensitive operations. Instead, implement a more robust approval mechanism that cannot be bypassed by modifying the request payload.
- The
-
Potential Path Traversal Vulnerability:
- The
read_filetool does not sanitize or validate thepathparameter beyond checking for specific patterns like../or<system>. This could allow attackers to bypass the check using encoded or obfuscated paths. - Recommendation: Use a library like
pathin Node.js to resolve and validate paths. For example:const resolvedPath = path.resolve('/workspace', path); if (!resolvedPath.startsWith('/workspace')) { throw new Error('Path traversal detected'); }
- The
-
Insecure Default Rate Limiting:
- The
MCPSlidingRateLimiteris configured with a default of 5 requests per minute in the example. This is too permissive for a security-sensitive API. - Recommendation: Use stricter default rate limits in the example, such as 1 request per second (60 requests per minute). Also, document how to configure rate limits for production use.
- The
🟡 WARNING
-
Breaking Changes in Package Name:
- The package name has been changed from
@agentmesh/sdkto@microsoft/agentmesh-sdk. This is a breaking change for existing users. - Recommendation: Clearly document this change in the release notes and provide a migration guide for users to update their imports.
- The package name has been changed from
-
Potential Backward Compatibility Issues:
- The introduction of a new standalone package (
@microsoft/agentmesh-mcp-governance) may cause confusion for existing users of the full SDK. - Recommendation: Clearly document the use cases for the standalone package versus the full SDK to guide users in selecting the appropriate package.
- The introduction of a new standalone package (
💡 SUGGESTIONS
-
Improved Documentation:
- The documentation is comprehensive, but it could benefit from a dedicated section on "Best Practices for Secure Deployment" to guide users on:
- Setting strong secrets in production.
- Configuring rate limits based on expected traffic.
- Implementing durable storage for nonce and session tokens in multi-replica deployments.
- The documentation is comprehensive, but it could benefit from a dedicated section on "Best Practices for Secure Deployment" to guide users on:
-
Enhanced OWASP Mapping:
- The OWASP MCP mapping table is a great addition. Consider adding specific examples or links to the relevant sections in the documentation for each mapping to make it more actionable.
-
Test Coverage:
- The test coverage for the
mcp-express-serverexample is minimal. While it includes basic tests for the/healthand/call-toolendpoints, it does not cover edge cases like:- Missing or invalid session tokens.
- Exceeding rate limits.
- Requests with critical security threats.
- Recommendation: Add tests for these scenarios to ensure the governance pipeline behaves as expected under all conditions.
- The test coverage for the
-
TypeScript Type Safety:
- The
argsparameter in therunmethod ofToolHandleris typed asRecord<string, unknown>. This could lead to runtime errors if the expected structure ofargsis not validated. - Recommendation: Use TypeScript's generics or a library like
zodto enforce stricter type validation forargs.
- The
-
Error Handling:
- The
call-toolendpoint could benefit from more granular error handling. For example, if thesessionAuthenticator.verifyTokenormessageSigner.verifymethods throw an error, the server will crash. - Recommendation: Wrap these calls in
try-catchblocks and return appropriate error responses.
- The
-
Logging:
- The example logs sensitive information (e.g.,
logEntry) to the console. While this is acceptable for development, it should be avoided in production. - Recommendation: Add a note in the documentation to disable logging or use a secure logging mechanism in production.
- The example logs sensitive information (e.g.,
Summary
This PR introduces valuable documentation and examples for MCP security primitives, but there are critical security issues and potential breaking changes that need to be addressed before merging. Once the identified issues are resolved, this PR will significantly enhance the usability and security of the MCP governance features.
Action Items
- Address the critical security issues (secrets, path traversal, approval bypass).
- Document the breaking changes and provide a migration guide.
- Add more comprehensive tests for the example server.
- Improve type safety and error handling in the example code.
- Enhance the documentation with best practices for secure deployment.
Let me know if you need further clarification or assistance!
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces comprehensive documentation for the TypeScript MCP (Managed Control Plane) security primitives, including API references, OWASP MCP compliance mapping, quickstart guides, and usage examples. It also includes a new example implementation of an Express.js server that demonstrates the usage of the MCP governance pipeline.
The PR is well-structured and provides clear examples of how to use the MCP security primitives. However, there are a few areas that require attention, especially regarding security, type safety, and potential backward compatibility issues.
🔴 CRITICAL
-
Insecure Default for Secrets in
loadSecretFunction- The
loadSecretfunction generates a random secret if the environment variable is not set or is less than 32 bytes. This is insecure for production environments, as it may lead to inconsistent behavior across multiple instances of the application. - Recommendation: Fail fast if the required environment variable is not set or is invalid. Do not generate a random secret as a fallback for production use. For example:
function loadSecret(envName: string): string { const secret = process.env[envName]; if (!secret || Buffer.byteLength(secret, 'utf-8') < 32) { throw new Error(`Environment variable ${envName} is missing or invalid.`); } return secret; }
- The
-
Potential Path Traversal Vulnerability in
read_fileTool- The
read_filetool does not sanitize or validate thepathparameter beyond checking for specific substrings like'../'. This approach is insufficient to prevent path traversal attacks. - Recommendation: Use a library like
pathin Node.js to resolve and validate paths. For example:const resolvedPath = path.resolve('/workspace', path); if (!resolvedPath.startsWith('/workspace')) { throw new Error('Path traversal detected'); }
- The
-
Insecure Handling of Sensitive Data in Logs
- The
logEntryobject is logged to the console, which may include sensitive data such as redacted secrets. While theCredentialRedactoris used, it is critical to ensure that no sensitive data is inadvertently logged. - Recommendation: Add a configuration option to disable logging of sensitive data in production environments. Additionally, ensure that the
CredentialRedactoris thoroughly tested to handle all edge cases.
- The
-
Replay Protection for Session Tokens
- The
MCPSessionAuthenticatordoes not appear to implement replay protection for session tokens. This could allow an attacker to reuse a stolen token within its TTL. - Recommendation: Implement a mechanism to track and invalidate session tokens after use, such as a token blacklist or a one-time-use token mechanism.
- The
🟡 WARNING
-
Potential Breaking Changes in Package Names
- The package name has been updated from
@agentmesh/sdkto@microsoft/agentmesh-sdk. This is a breaking change for existing users of the library. - Recommendation: Clearly document this breaking change in the release notes and provide migration instructions for existing users.
- The package name has been updated from
-
Backward Compatibility of MCP Governance API
- The introduction of the standalone
@microsoft/agentmesh-mcp-governancepackage may lead to confusion for existing users of the full SDK. It is unclear if the APIs in the standalone package are fully backward-compatible with the full SDK. - Recommendation: Add explicit tests to verify that the MCP governance APIs behave identically in both the full SDK and the standalone package.
- The introduction of the standalone
💡 SUGGESTIONS
-
Improved Documentation for Security Primitives
- The documentation for MCP security primitives is comprehensive but could benefit from additional details about the underlying cryptographic algorithms and their security guarantees.
- Recommendation: Include a section in the documentation that explains the cryptographic algorithms used (e.g., HMAC-SHA256) and their security properties.
-
Thread Safety of In-Memory Stores
- The in-memory nonce and session stores mentioned in the documentation may not be thread-safe in multi-threaded environments.
- Recommendation: Document the thread-safety limitations of the in-memory stores and provide examples of how to implement durable, thread-safe stores for production use.
-
OWASP MCP Compliance Mapping
- The OWASP MCP compliance mapping table in the documentation is helpful but could be expanded to include specific examples of how each primitive addresses the corresponding OWASP MCP category.
- Recommendation: Add a column to the table with links to specific code examples or documentation sections that demonstrate compliance.
-
Type Safety in Tool Handlers
- The
toolHandlersobject usesRecord<string, unknown>for input schemas and arguments, which weakens type safety. - Recommendation: Define TypeScript interfaces for each tool's input schema and arguments to improve type safety and developer experience.
- The
-
Test Coverage for Edge Cases
- The provided tests cover basic functionality but do not test edge cases such as invalid session tokens, malformed tool calls, or high request rates.
- Recommendation: Add tests for edge cases to ensure robust behavior under adverse conditions.
Conclusion
This PR introduces valuable documentation and examples for the MCP security primitives, but there are critical security issues and potential breaking changes that need to be addressed before merging. Additionally, there are opportunities to improve type safety, documentation, and test coverage.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ple server The route already uses MCPSlidingRateLimiter.consume() for per-agent per-tool rate limiting. CodeQL doesn't recognize custom rate limiters. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d932fa1 to
4097f89
Compare
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces comprehensive documentation for TypeScript MCP (Managed Control Plane) security primitives, including API references, usage examples, and an Express.js server example. The documentation is well-structured and provides clear guidance on the usage of the MCP governance features. The PR also includes a new example project (mcp-express-server) that demonstrates the integration of MCP governance primitives in a real-world scenario.
Below is the detailed review of the pull request:
🔴 CRITICAL
-
Session Token Secret Handling
- File:
server.ts - Issue: The
loadSecretfunction generates a random secret if theMCP_SESSION_SECRETenvironment variable is not set. This could lead to inconsistent behavior in multi-instance deployments or after a server restart, as the secret would change and invalidate all existing session tokens. - Recommendation: Ensure that the
MCP_SESSION_SECRETis always explicitly set in the environment. If it's not set, the server should fail to start with a clear error message.
- File:
-
Approval Handler Logic
- File:
server.ts - Issue: The
approvalHandlerin theMCPGatewayallows theread_filetool if theapprovedparameter is set totrue. This could be exploited by an attacker to bypass the approval process by including"approved": truein the request payload. - Recommendation: Do not rely on user-provided input (
params.approved) for critical security decisions. Instead, implement a robust approval mechanism that is independent of user input.
- File:
-
Path Traversal Protection
- File:
server.ts - Issue: The
blockedPatternsin theMCPGatewayconfiguration include patterns like'../'and'..\\'to prevent path traversal. However, these patterns may not cover all possible encodings or bypass techniques. - Recommendation: Use a robust library or framework for path sanitization instead of relying on pattern matching. For example, resolve the path to an absolute path and ensure it is within an allowed directory.
- File:
🟡 WARNING
-
Breaking Changes in Package Name
- File:
README.md - Issue: The package name has been changed from
@agentmesh/sdkto@microsoft/agentmesh-sdk. This is a breaking change for existing users of the package. - Recommendation: Clearly communicate this breaking change in the release notes and provide migration instructions for users to update their imports.
- File:
-
Backward Compatibility for MCP Governance
- File:
README.md - Issue: The introduction of a standalone
@microsoft/agentmesh-mcp-governancepackage may cause confusion for users already using the full SDK. It is unclear if the standalone package is fully backward-compatible with the MCP features in the full SDK. - Recommendation: Clearly document the differences (if any) between the standalone package and the full SDK. Ensure that the standalone package is fully compatible with existing MCP governance features.
- File:
💡 SUGGESTIONS
-
Improved Documentation for Security Features
- File:
README.md - Suggestion: Add a section explaining the security implications of each MCP governance primitive (e.g., how
MCPSessionAuthenticatorprevents session hijacking, howMCPMessageSignerensures message integrity, etc.). This will help users understand the importance of these features.
- File:
-
Enhanced Test Coverage
- File:
server.test.ts - Suggestion: Add more test cases to cover edge scenarios, such as:
- Invalid or expired session tokens.
- Requests that trigger the
blockedPatternsinMCPGateway. - Requests with invalid or malicious input schemas.
- Requests that exceed the rate limit.
- File:
-
Environment Variable Validation
- File:
server.ts - Suggestion: Add validation for critical environment variables (
MCP_SESSION_SECRET,MCP_SIGNING_SECRET) during server startup. If these variables are missing or invalid, the server should fail to start with a clear error message.
- File:
-
Nonce Replay Protection
- File:
server.ts - Suggestion: The
MCPMessageSignerincludes nonce replay protection, but the documentation does not explain how to configure or use a shared nonce store in multi-instance deployments. Add guidance on implementing a shared nonce store for production environments.
- File:
-
Rate Limiting Transparency
- File:
server.ts - Suggestion: Include rate-limiting headers (e.g.,
X-RateLimit-Limit,X-RateLimit-Remaining,X-RateLimit-Reset) in the response to help clients understand their rate limit status.
- File:
-
Logging Improvements
- File:
server.ts - Suggestion: Use a structured logging library (e.g.,
pinoorwinston) instead ofconsole.infofor better log management and integration with log aggregation systems.
- File:
Summary of Actions Required
- Address the critical issues related to session token secret handling, approval handler logic, and path traversal protection.
- Clearly communicate the breaking changes in the package name and ensure backward compatibility for MCP governance features.
- Enhance documentation, test coverage, and logging to improve usability and maintainability.
- Add validation for critical environment variables and provide guidance on configuring shared stores for production deployments.
Let me know if you have any questions or need further clarification!
|
Closing — this PR depends on #791 and #827 (both closed) as prerequisites. With the MCP architecture consolidation, TypeScript documentation will follow the hybrid approach from PR #910. Thank you for the docs work, @jackbatzner. |
Description
Adds comprehensive documentation for TypeScript MCP security primitives, including API reference docs, OWASP MCP Cheat Sheet compliance mapping, quickstart guides, i18n translations, and usage examples.
Type of Change
Package(s) Affected
Checklist
Related Issues
Supersedes #791 (split for easier review). Merge after #827.