Email rackctl@gmail.com with subject [security][competitive-intelligence]. Do not open public issues for security reports.
Acknowledgement target: within 72 hours. Triage target: within 5 business days.
competitive-intelligence crawls external competitor websites and feeds the content through an LLM, so its defining controls cover two things: what the crawler is allowed to reach, and keeping credentials out of the app.
- Source URLs are operator-supplied (`sources.json`), so every outbound crawl URL passes through `guardUrl` (`src/crawler/url-guard.ts`) before the fetch. It rejects non-`http(s)` schemes and any host that resolves to loopback, RFC 1918 (`10/8`, `172.16/12`, `192.168/16`), link-local (`169.254/16`, including the `169.254.169.254` cloud-metadata address), IPv6 loopback/link-local/unique-local, or the unspecified address.
- The hostname is resolved once and the guard gates on the resolved address. DNS rebinding (the fetch re-resolving to a different address) is a known residual risk: acceptable for crawling external marketing pages, with defense in depth provided by the NetworkPolicy below.
- At the network layer, the chart's default-deny `NetworkPolicy` egress allow-list blocks `169.254.0.0/16` (IMDS) outright, so even a guard bypass cannot reach instance metadata.
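The guard rules above can be sketched as a pure check over a URL and the address it resolved to. This is a minimal illustration, not the contents of `src/crawler/url-guard.ts`: the names `checkTarget` and `isBlockedAddress` are hypothetical, the caller is assumed to have resolved the hostname once already, and the IPv6 handling is deliberately simplified (it does not cover IPv4-mapped addresses such as `::ffff:127.0.0.1`).

```typescript
// Hypothetical sketch of the guard's address rules; the real guardUrl may differ.
type Cidr = { base: number; bits: number };

// Parse a dotted-quad IPv4 address into an unsigned integer, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

function cidr(text: string): Cidr {
  const [ip, bits] = text.split("/");
  const base = ipv4ToInt(ip);
  if (base === null) throw new Error(`bad CIDR: ${text}`);
  return { base, bits: Number(bits) };
}

// Two addresses share a prefix when they fall in the same block of 2^(32 - bits).
function inCidr(addr: number, range: Cidr): boolean {
  const size = 2 ** (32 - range.bits);
  return Math.floor(addr / size) === Math.floor(range.base / size);
}

const BLOCKED_V4: Cidr[] = [
  "0.0.0.0/32",     // unspecified
  "127.0.0.0/8",    // loopback
  "10.0.0.0/8",     // RFC 1918
  "172.16.0.0/12",  // RFC 1918
  "192.168.0.0/16", // RFC 1918
  "169.254.0.0/16", // link-local, includes 169.254.169.254
].map(cidr);

function isBlockedV6(ip: string): boolean {
  const lower = ip.toLowerCase();
  if (lower === "::1" || lower === "::") return true; // loopback, unspecified
  const first = parseInt(lower.split(":")[0] || "0", 16);
  const linkLocal = first >= 0xfe80 && first <= 0xfebf;   // fe80::/10
  const uniqueLocal = first >= 0xfc00 && first <= 0xfdff; // fc00::/7
  return linkLocal || uniqueLocal;
}

function isBlockedAddress(ip: string): boolean {
  if (ip.includes(":")) return isBlockedV6(ip);
  const addr = ipv4ToInt(ip);
  // Fail closed: an address that does not parse is treated as blocked.
  return addr === null || BLOCKED_V4.some((range) => inCidr(addr, range));
}

// Returns true only for http(s) URLs whose resolved address is not blocked.
function checkTarget(rawUrl: string, resolvedIp: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  return !isBlockedAddress(resolvedIp);
}
```

Passing the resolved address in as an argument keeps the check pure and easy to test. It also makes the rebinding caveat visible: the check only vouches for the address it was given, not for whatever a later fetch resolves.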
- No long-lived credentials in the app. Pods get AWS access via IRSA (IAM Roles for Service Accounts, the EKS form of workload identity): the Bedrock LLM and Titan embeddings run on the AWS credential chain, which resolves to the landing-zone `competitive-intelligence-platform` role. There are no static keys anywhere in the repo or image; `config.ts` reads no `AWS_ACCESS_KEY_ID`.
- App-level secrets are projected at deploy time by External Secrets Operator from AWS Secrets Manager (`competitive-intelligence/<env>/*`) into a Kubernetes Secret, consumed via `envFrom` and never committed. Slack tokens and optional Anthropic/OpenAI keys come from `competitive-intelligence/<env>/app-secrets`; Postgres credentials come from the Aurora-managed `competitive-intelligence/<env>/db-credentials`.
- The optional Anthropic/OpenAI direct-API providers are off by default. Bedrock (on-account, IRSA, no key) is the default for both LLM and embeddings.
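The credential rules above suggest a config loader that defaults to Bedrock, asks for an API key only when a direct provider is opted into, and never looks at AWS key variables. The sketch below is illustrative only and is not the project's `config.ts`: the variable names `LLM_PROVIDER`, `SLACK_BOT_TOKEN`, `ANTHROPIC_API_KEY`, and `OPENAI_API_KEY` are assumptions, as are the `loadConfig` function and the `AppConfig` shape.

```typescript
// Hypothetical sketch; variable names are assumptions, not the app's real config.
type Provider = "bedrock" | "anthropic" | "openai";
type Env = Record<string, string | undefined>;

interface AppConfig {
  provider: Provider;
  apiKey?: string; // present only for opt-in direct providers
  slackToken: string;
}

// Assumed env var holding the key for each direct-API provider.
const KEY_VAR: Record<Exclude<Provider, "bedrock">, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
};

function isProvider(value: string): value is Provider {
  return value === "bedrock" || value === "anthropic" || value === "openai";
}

function loadConfig(env: Env): AppConfig {
  const provider = env.LLM_PROVIDER ?? "bedrock"; // Bedrock unless opted out
  if (!isProvider(provider)) {
    throw new Error(`unknown LLM_PROVIDER: ${provider}`);
  }
  const slackToken = env.SLACK_BOT_TOKEN;
  if (!slackToken) throw new Error("SLACK_BOT_TOKEN is required");

  if (provider === "bedrock") {
    // No key is read here: the AWS SDK credential chain supplies the IRSA role.
    return { provider, slackToken };
  }
  const keyVar = KEY_VAR[provider];
  const apiKey = env[keyVar];
  if (!apiKey) {
    throw new Error(`${keyVar} is required when LLM_PROVIDER=${provider}`);
  }
  return { provider, apiKey, slackToken };
}
```

Taking the environment as a parameter, rather than reading a global, lets the same function run against the `envFrom`-projected variables in a pod and against a plain object in tests.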
- Inference runs on-account via Amazon Bedrock — crawled competitor content and intel queries are not sent to a third party. (The direct Anthropic/OpenAI providers are the exception and are opt-in.)
- Embeddings and chunk content are stored at rest in Aurora PostgreSQL (pgvector), encrypted at rest by the engine. The crawl only collects publicly served competitor web pages; it stores no end-user PII.
- Default-deny `NetworkPolicy` with an explicit egress allow-list: DNS (kube-dns :53), HTTPS :443 to `0.0.0.0/0` except `169.254.0.0/16` (crawl targets + Bedrock + Slack, IMDS blocked), and Postgres :5432 to the cluster VPC CIDR.
- No public ingress. There is no inbound HTTP product surface: the Slack integration runs in Socket Mode (an outbound WebSocket), and the only inbound traffic allowed is same-namespace health probes against `/health` and `/readyz`.
- DNS rebinding. The URL guard resolves and gates once; a host that re-resolves after the check could in theory be re-pointed. The IMDS-blocking NetworkPolicy egress rule is the backstop for the highest-value target.
- Source trust. `sources.json` is operator-controlled and Zod-validated, but the guard's protection is only as good as keeping that file under review: a malicious source entry is mitigated by the network egress rules, not eliminated.
- Single-writer assumption. The crawl mutex is in-process; correctness depends on `replicaCount: 1`. Running multiple writers without leader election is a correctness risk (double-crawl / racing diff-replace), not just a performance one.
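To see why the single-writer assumption matters, consider a minimal promise-chain mutex. This is a sketch of the general technique, not the project's actual mutex: `CrawlMutex`, `runExclusive`, and `demo` are hypothetical names, and a second mutex instance stands in for a second replica.

```typescript
// Hypothetical sketch of an in-process mutex; the real crawl mutex may differ.
class CrawlMutex {
  private tail: Promise<void> = Promise.resolve();

  // Run fn only after every previously queued job has settled.
  runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // Keep the chain usable even if fn rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// Runs two jobs, either through one shared mutex or through two separate ones.
async function demo(shared: boolean): Promise<string[]> {
  const log: string[] = [];
  const a = new CrawlMutex();
  const b = shared ? a : new CrawlMutex(); // separate instance ~ second replica
  const job = (name: string) => async (): Promise<void> => {
    log.push(`${name}:start`);
    await Promise.resolve(); // yield, standing in for network or database I/O
    log.push(`${name}:end`);
  };
  await Promise.all([a.runExclusive(job("A")), b.runExclusive(job("B"))]);
  return log;
}
```

With one shared mutex the jobs run back to back; with two instances the second job starts before the first finishes. A lock held in process memory cannot see work running in another pod, so serializing multiple replicas would need coordination outside the process, such as the leader election mentioned above.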
competitive-intelligence exposes the controls needed for SOC 2 Type II — IRSA-only access with no static credentials, secrets sourced from Secrets Manager via External Secrets (never committed), encrypted-at-rest data in Aurora, a default-deny network posture with IMDS blocked, and on-account inference with no third-party data egress on the default path. Substrate-level controls (CIS EKS baseline, Pod Security Standards, image signing) are enforced upstream by landing-zone and eks-gitops.