Add Tier-2 policy-driven DetectionEngine with multi-turn, agent-chain, response scanning and feedback endpoint #65

Open

S3DFX-CYBER wants to merge 5 commits into main from codex/upgrade-ml-model-to-fine-tuned-bert

Conversation

@S3DFX-CYBER (Collaborator)

Motivation

  • Provide a Tier-2 detection engine to support more robust adversarial prompt detection across turns, agent hops, and outputs.
  • Enable configurable, per-organization policy rules and thresholds via YAML to allow custom rules and risk tuning.
  • Add analyst feedback capture to close the FP/FN retraining loop and persist corrective labels for model improvements.
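A per-organization policy file of the kind described above might look like the following sketch. The field names and weights are illustrative assumptions, not the actual schema shipped in services/analyzer/policy/default_policy.yaml:

```yaml
# Illustrative policy shape -- field names are assumptions, not the real schema.
thresholds:
  suspicious: 0.5
  malicious: 0.8
rules:
  prompt_patterns:
    - pattern: "ignore (all )?previous instructions"
      weight: 0.9
  agent_chain_patterns:
    - pattern: "tool_call:.*exec"
      weight: 0.7
  response_leak_patterns:
    - pattern: "-----BEGIN (RSA )?PRIVATE KEY-----"
      weight: 1.0
organizations:
  acme-corp:            # hypothetical org override
    thresholds:
      suspicious: 0.4
```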

Description

  • Introduced services/analyzer/detection_engine.py, implementing a DetectionEngine that supports policy-driven pattern scoring, session-level risk accumulation, agentic chain/tool-call monitoring, response/data-leak scanning, optional transformer-backend scoring (BERT/DeBERTa via TRANSFORMER_MODEL_ID), and an sklearn fallback interface.
  • Added a default policy at services/analyzer/policy/default_policy.yaml with thresholds, prompt rules, agentic-chain patterns, response leak patterns, and org-specific overrides.
  • Integrated the engine into the analyzer service by wiring DetectionEngine into services/analyzer/app.py, extending the /v1/analyze request schema with session_id, organization_id, response_text, and agent_trace, and returning normalized AnalysisResponse.
  • Added /v1/feedback endpoint and DetectionEngine.append_feedback which persists analyst FP/FN corrections to a JSONL feedback sink for retraining.
  • Added unit tests in tests/unit/test_detection_engine.py covering response scanning, multi-turn session tracking, and feedback persistence.
  • Updated runtime dependencies in requirements.txt to include transformers and PyYAML for transformer backends and YAML policy parsing.
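The JSONL feedback sink described above can be sketched as a simple append-only writer. The class, method, and field names here are assumptions based on the PR description, not the actual implementation of DetectionEngine.append_feedback:

```python
import json
import time
from pathlib import Path
from typing import Optional


class FeedbackSink:
    """Persists analyst FP/FN corrections as one JSON object per line (JSONL)."""

    def __init__(self, path: str = "feedback.jsonl"):
        self.path = Path(path)

    def append_feedback(self, verdict: str, corrected_label: str,
                        prompt: str, organization_id: Optional[str] = None) -> None:
        record = {
            "ts": time.time(),
            "original_verdict": verdict,
            "corrected_label": corrected_label,
            "prompt": prompt,
            "organization_id": organization_id or "default",
        }
        # JSONL: append one line per record so a retraining job can stream the file.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
```

Append-only JSONL keeps each correction self-describing and lets the retraining pipeline consume records incrementally without parsing the whole file.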

Testing

  • Ran pytest tests/unit/test_analyzer.py tests/unit/test_detection_engine.py -q: all 31 tests passed, with 4 warnings.
  • New unit tests tests/unit/test_detection_engine.py exercised response scanning, session escalation, and feedback write behavior and succeeded.

Codex Task

@vercel

vercel Bot commented Apr 9, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: tenet-ai.org | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Apr 19, 2026 10:31am


@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: da8fdfd12e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

        return self._score_patterns(response_text, patterns)

    def _session_key(self, organization_id: Optional[str], session_id: Optional[str]) -> str:
        return f"{organization_id or 'default'}:{session_id or 'anon'}"

P1: Separate anonymous requests from shared session accumulator

session_id is optional for analysis requests, but _session_key maps every missing session_id to default:anon, so unrelated traffic shares one risk history. After a handful of high-risk prompts, benign prompts from other users can be marked suspicious/malicious purely because session risk is reused, which creates cross-user contamination whenever clients omit session_id.
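One way to avoid the shared accumulator, sketched below under the assumption that the key format from the excerpt is kept, is to give every request that omits session_id a unique key so anonymous traffic never shares risk history. This is an illustrative fix, not necessarily the one the PR should adopt:

```python
import uuid
from typing import Optional


def session_key(organization_id: Optional[str], session_id: Optional[str]) -> str:
    """Build a session-cache key, isolating anonymous requests instead of
    pooling them all under a single shared 'anon' bucket."""
    org = organization_id or "default"
    if session_id is None:
        # A fresh UUID per request means no cross-user risk accumulation.
        return f"{org}:anon-{uuid.uuid4().hex}"
    return f"{org}:{session_id}"
```

The trade-off is that truly anonymous multi-turn sessions lose accumulation entirely; an alternative is to key on a client-derived identifier when one is available.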


            logger.info("Using sklearn backend")
            return

        model_id = os.getenv("TRANSFORMER_MODEL_ID", "distilbert-base-uncased-finetuned-sst-2-english")

P1: Stop defaulting transformer scoring to sentiment model

The default TRANSFORMER_MODEL_ID is distilbert-base-uncased-finetuned-sst-2-english (sentiment), while the engine uses this backend by default; combined with the maliciousness mapping in _model_score, benign but negatively worded prompts can be scored as attacks. Because final risk_score takes the max signal, this default can systematically skew verdicts toward false positives unless every deployment overrides the model ID.
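A minimal sketch of one possible remedy: only enable the transformer backend when the operator has explicitly set TRANSFORMER_MODEL_ID, and refuse obvious sentiment checkpoints rather than silently misusing them. The function name and return values are illustrative, not the PR's API:

```python
import os
from typing import Mapping


def select_backend(env: Mapping[str, str] = os.environ) -> str:
    """Pick the scoring backend; never silently default to a sentiment model.

    Returns "transformer" only when the operator explicitly chose a model ID,
    otherwise the sklearn fallback.
    """
    model_id = env.get("TRANSFORMER_MODEL_ID", "").strip()
    if not model_id:
        # No explicit choice: fall back to sklearn instead of a sentiment model.
        return "sklearn"
    if "sst-2" in model_id or "sentiment" in model_id:
        # Sentiment classifiers score negativity, not maliciousness; refuse them.
        raise ValueError(f"{model_id} looks like a sentiment model, not an attack classifier")
    return "transformer"
```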


@github-actions

🤖 TENET Agent Review

📋 Summary

This pull request introduces a significant upgrade to the TENET AI project by implementing a Tier-2 DetectionEngine. This engine provides policy-driven, multi-turn, agent-chain, and response-scanning capabilities for more robust adversarial prompt detection. It also adds a feedback loop for model retraining, integrates Prometheus/Grafana for enhanced monitoring, and includes Helm charts for air-gapped deployments, substantially improving the system's security posture and operational reliability.

🔐 Security Findings

  • [MEDIUM] helm/tenet/values.yaml - The default apiKey: tenet-dev-key-change-in-production in the Helm chart's values.yaml provides a known default secret. While the name indicates it should be changed, this can lead to insecure deployments if not explicitly overridden by users. It's recommended to either leave the API key empty, requiring explicit user input, or generate a random one during deployment.
  • [LOW] services/analyzer/siem_connectors.py - The SIEMDispatcher sends potentially sensitive event data (including prompt, response_text, agent_trace) to external SIEM systems. Users should be explicitly aware of the data being shared and ensure the security of their configured SIEM endpoints and the environment variables (SPLUNK_HEC_URL, SPLUNK_HEC_TOKEN, etc.) that control these connections to prevent SSRF or data exfiltration.
  • [LOW] services/analyzer/detection_engine.py - The _load_policy, _load_collective_threat_intel, and _model_score functions rely on environment variables (POLICY_PATH, THREAT_INTEL_PATH, TRANSFORMER_MODEL_ID) to determine file paths and model IDs. If these environment variables are not securely controlled, an attacker could potentially inject malicious policy/threat intel files or load arbitrary, potentially malicious, transformer models.

🧹 Code Quality

  • services/analyzer/detection_engine.py - The _session_cache is an in-memory dictionary. This means that session state for multi-turn risk accumulation is not shared across multiple instances of the analyzer service, limiting horizontal scalability for stateful detection. Consider using Redis for session state persistence if horizontal scaling is a requirement.
  • services/analyzer/detection_engine.py - The _merged_policy function performs a shallow merge for organization-specific policy overrides. If nested dictionary keys (e.g., rules.prompt_patterns) are intended to be merged rather than fully overwritten by organization-specific policies, the current logic might not behave as expected.
  • services/analyzer/detection_engine.py - In _owasp_layer, the coverage_score calculation relies on len(set(mapping.values())) for total_categories. This implicitly assumes all relevant OWASP categories are represented in the policy's pattern-to-category mapping. If some categories are missing from the mapping, the coverage_score might be inaccurately high.
  • services/analyzer/app.py - The engine, siem_dispatcher, and circuit_breaker are initialized as global variables in startup(). While functional, using FastAPI's dependency injection system could improve testability and modularity.
  • helm/tenet/values.yaml - The use of :local image tags (e.g., tenet-ingest:local, tenet-analyzer:local) in the Helm chart's values.yaml can lead to ambiguity and versioning issues. Using specific, immutable image tags (e.g., v0.1.0) is generally recommended for production deployments, even for air-gapped scenarios.
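The shallow-merge concern above can be addressed with a recursive merge. This is a sketch of what a deep _merged_policy could do, not the PR's actual implementation:

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into base, returning a new dict.

    Nested dicts (e.g. rules.prompt_patterns under an org override) are merged
    key by key; any other value in override replaces the base value outright.
    """
    merged: Dict[str, Any] = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

With this, an org override that only tunes thresholds.suspicious keeps the default thresholds.malicious instead of dropping it.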

✅ What's Done Well

  • Comprehensive Detection Capabilities: The DetectionEngine significantly enhances TENET's ability to detect complex attacks through policy-driven rules, multi-turn session tracking, agent chain monitoring, and response scanning.
  • Robust Observability and Reliability: The integration of Prometheus/Grafana, OpenTelemetry, and a circuit breaker demonstrates a strong commitment to operational stability, monitoring, and graceful degradation, which are crucial for a defensive security system.
  • Extensibility and Configurability: The policy-driven approach via YAML, organization-specific overrides, and the feedback loop for retraining provide excellent extensibility and allow for fine-tuning the detection logic to specific organizational needs and evolving threats.

📝 Overall Verdict

[REQUEST CHANGES] - Address the identified security and code quality concerns, particularly regarding default API keys, SIEM data handling, and session state management.

@github-actions

🤖 TENET Agent Review

📋 Summary

This pull request introduces a significant upgrade to the TENET AI analyzer service by implementing a Tier-2 DetectionEngine. This engine provides policy-driven, multi-turn, agent-chain, and response scanning capabilities, enhancing adversarial prompt detection. It also adds a feedback endpoint for retraining, integrates Prometheus metrics and OpenTelemetry tracing for observability, and includes Helm charts for air-gapped deployments. The approach is sound, significantly improving the system's defensive posture and operational robustness.

🔐 Security Findings

  • [LOW] services/analyzer/app.py - The organization_id field in AnalysisRequest is user-controlled and used to select organization-specific policy overrides. Although security.require_auth is called, it is not clear that the requested organization_id is validated against the authenticated user's allowed organizations; a user supplying another organization's ID could bypass or weaken policy unless the SecurityManager enforces this.
  • [LOW] helm/tenet/values.yaml - The default apiKey: tenet-dev-key-change-in-production in values.yaml is a development placeholder. While the comment explicitly warns about changing it, it's a common oversight in production deployments. Consider adding a pre-deployment check or a more robust secret management integration for production environments.
  • [LOW] docker-compose.yml - Grafana's default GF_SECURITY_ADMIN_USER=admin and GF_SECURITY_ADMIN_PASSWORD=admin are suitable for local development but pose a security risk if deployed to production without modification. This should be explicitly highlighted for production deployments.
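The organization_id finding above suggests resolving the effective organization from the authenticated principal's memberships rather than trusting the request. A minimal sketch, with names assumed since the PR's SecurityManager API is not shown here:

```python
from typing import FrozenSet, Optional


def resolve_organization(requested: Optional[str],
                         allowed_orgs: FrozenSet[str]) -> str:
    """Honor a requested organization_id only when the authenticated principal
    is actually a member of it; otherwise fall back to the default policy
    instead of applying another org's overrides."""
    if requested and requested in allowed_orgs:
        return requested
    return "default"
```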

🧹 Code Quality

  • services/analyzer/detection_engine.py - The _session_cache is an in-memory dictionary. For a production-grade, horizontally scalable service, this session state should ideally be persisted in a shared, external store like Redis to ensure consistency across multiple analyzer instances.
  • services/analyzer/app.py - The _null_context class is a simple no-op context manager. While functional, a more idiomatic approach for optional tracing might be to use contextlib.nullcontext from the standard library, which serves the same purpose.
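The contextlib.nullcontext suggestion above could look like the following sketch; the helper name is illustrative, and the tracer API is assumed to be OpenTelemetry-like:

```python
import contextlib


def maybe_trace(tracer, span_name: str):
    """Return a real tracing span when a tracer is configured, or the stdlib
    no-op context manager when tracing is disabled -- replacing a hand-rolled
    _null_context class."""
    if tracer is not None:
        return tracer.start_as_current_span(span_name)
    return contextlib.nullcontext()
```

With tracing disabled, `with maybe_trace(None, "analyze"):` enters a no-op context whose bound value is None.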

✅ What's Done Well

  • Comprehensive Detection Logic: The DetectionEngine is well-designed, integrating multiple detection layers (pattern matching, collective threat intel, agent chain monitoring, response scanning, ML models, and session-level risk accumulation) into a cohesive policy-driven system.
  • Robust Observability: The inclusion of Prometheus metrics, Grafana integration, and OpenTelemetry tracing significantly enhances the system's monitoring and debugging capabilities, which is crucial for a security middleware.
  • Reliability and Graceful Degradation: The implementation of a CircuitBreaker ensures that the analyzer service can gracefully degrade and fall back to a "benign" verdict if the detection engine encounters failures, preventing service outages.
  • Extensive Testing: The addition of new unit tests for the DetectionEngine and CircuitBreaker, along with an integration test for latency SLA, demonstrates a strong commitment to code quality and reliability.

📝 Overall Verdict

[REQUEST CHANGES] - Address the organization_id validation and consider externalizing session state for scalability.
