Khalifa

Agentless ingestion of AWS Org resources and Security Hub findings into a Neptune-backed security graph, with a risk and attack-path engine to identify security issues.

Docs · Operations

Architecture

See ARCHITECTURE.md for the full system diagram and design.

1. What is Khalifa?

Khalifa is an agentless ingestion pipeline that sits between your AWS Organization and a Neptune-backed security graph. Every account, resource, and finding gets collected on a schedule, normalized into a graph model you own, and evaluated locally against risk rules, attack-path traversals, and CIEM effective-permission logic.

Why it was built: Cloud estates grow faster than any team can review them. A misconfigured S3 bucket, an over-privileged IAM role, or a publicly exposed RDS instance turns into a real finding before anyone notices. Khalifa gives those resources a graph: collected, scored, joined to attack paths, and rendered as issues you can act on.

How it works: Collectors run on an EventBridge schedule (or as Kubernetes CronJobs) and assume into every account in the AWS Organization via a cross-account role. They inventory 30+ AWS services, decompose IAM into policy statements + effective permissions, pull Security Hub and GuardDuty findings, and stream everything into Neptune. The Risk Engine then runs Gremlin traversals against the live graph to produce prioritized issues, attack paths, and compliance evaluations against CIS, SOC 2, and ISO 27001 — without ever moving data out of your AWS account.

2. Run your security pipeline with Khalifa

Install

Prerequisites: Node.js 20+, AWS CDK CLI, an AWS Organization with a delegated admin account, and a Neptune cluster reachable from your compute.

git clone https://github.com/therandomsecurityguy/khalifa
cd khalifa
npm ci --workspaces

Deploy the cross-account collector role into every member account from the templates/SecurityGraphCollectorRole.yaml template, then bootstrap the ingestion stack.

Quickstart

There are two ways to run Khalifa. Both produce the same result: a populated security graph with risk findings. Option A is faster to try; Option B is the production setup.

Option A: Lambda + EventBridge (development / small scale)

The Lambda stack uses EventBridge schedules, Step Functions for parallel account fan-out, and a separate daily CloudTrail analyzer. It scales to roughly 20 accounts without tuning.

cd cdk
npm install
npm run build

cdk deploy KhalifaStack \
  --neptune-endpoint neptune-cluster.us-east-1.amazonaws.com \
  --issues-table-name SecurityIssues \
  --access-analyzer-table AccessAnalyzerCache \
  --athena-database khalifa_cloudtrail_db \
  --cloudtrail-s3-location s3://cloudtrail-logs/AWSLogs/

Every two hours the collector ingests all member accounts. CloudTrail analysis runs daily at 02:00 UTC. The policy evaluator runs every six hours and after each collector pass. The risk engine runs hourly.
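The cadence above could be written as EventBridge schedule expressions roughly as follows. This is only a sketch: the `schedules` constant and its key names are hypothetical and not taken from the repository. Only the `rate(...)` and `cron(...)` syntax is EventBridge's own.

```typescript
// Hypothetical schedule map; the key names are illustrative, not from the repo.
type ScheduleExpression = `rate(${string})` | `cron(${string})`;

const schedules: Record<string, ScheduleExpression> = {
  // Full collector pass over all member accounts.
  collector: "rate(2 hours)",
  // EventBridge cron fields: minute hour day-of-month month day-of-week year (UTC).
  cloudTrailAnalyzer: "cron(0 2 * * ? *)",
  // The evaluator is also triggered after each collector pass, which a
  // schedule alone cannot express.
  policyEvaluator: "rate(6 hours)",
  riskEngine: "rate(1 hour)",
};

for (const [name, expression] of Object.entries(schedules)) {
  console.log(`${name}: ${expression}`);
}
```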

Option B: EKS + CronJob (production / >20 accounts)

The EKS stack runs the API service, rule runner, and UI as Kubernetes workloads. It is built for sustained load across hundreds of accounts and gives you a UI plus a REST API.

# 1. Build and push images (tag with the registry path so the push finds the image)
cd api-service && docker build -t <ecr>/security-graph-api:v1.0.0 . && docker push <ecr>/security-graph-api:v1.0.0
cd ../packages/risk-engine && docker build -t <ecr>/security-graph-rule-runner:v1.0.0 . && docker push <ecr>/security-graph-rule-runner:v1.0.0

# 2. Deploy CDK
cd ../../cdk
cdk deploy SecurityGraphEksStack \
  --vpc-id vpc-12345678 \
  --neptune-endpoint neptune-cluster.us-east-1.amazonaws.com \
  --issues-table-name SecurityIssues \
  --certificate-arn arn:aws:acm:us-east-1:... \
  --cognito-user-pool-id us-east-1_xxxxx \
  --cognito-client-id xxxxx

# 3. Apply manifests
kubectl apply -f eks-manifests/

Use this when you want to serve a UI to analysts, expose a stable REST API, or run the rule runner on a schedule that survives control-plane hiccups.

Two deployment options

Approach	Use Case	Complexity
Lambda + EventBridge	Development / small scale (<20 accounts)	Lower
EKS + CronJob	Production (>20 accounts, multi-tenant API)	Higher

Configuration

Environment variables are shared across both stacks. The CDK stack wires sane defaults; override at deploy time or via the Kubernetes ConfigMap (eks-manifests/01-configmap.yaml).

Variable	Description	Default
`NEPTUNE_ENDPOINT`	Neptune cluster endpoint	—
`NEPTUNE_AUTH_SECRET_ARN`	Secrets Manager ARN for Neptune auth	—
`ISSUES_TABLE`	DynamoDB table for issues	`SecurityIssues`
`ACCESS_ANALYZER_TABLE`	DynamoDB table for CloudTrail usage cache	`AccessAnalyzerCache`
`ATHENA_DATABASE`	Glue database for CloudTrail logs	`khalifa_cloudtrail_db`
`ATHENA_WORKGROUP`	Athena workgroup	`khalifa-cloudtrail-analysis`
`CLOUDTRAIL_S3_LOCATION`	S3 prefix for CloudTrail logs	`s3://cloudtrail-logs/AWSLogs/`
`ANALYSIS_DAYS`	CloudTrail lookback window	`90`
`AWS_REGION`	AWS region	`us-east-1`
`LOG_LEVEL`	Logging level	`info`
`RULE_RUNNER_SCHEDULE`	Cron schedule	`0 */6 * * *` (every 6h)
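A minimal sketch of how a service might read these variables and apply the defaults from the table. The `KhalifaConfig` interface and the `loadConfig` function are hypothetical names for illustration, not the repository's actual loader. The two variables without a default are treated as required here, which is an assumption.

```typescript
// Hypothetical typed view of the environment variables in the table above.
interface KhalifaConfig {
  neptuneEndpoint: string;
  neptuneAuthSecretArn: string;
  issuesTable: string;
  accessAnalyzerTable: string;
  athenaDatabase: string;
  athenaWorkgroup: string;
  cloudTrailS3Location: string;
  analysisDays: number;
  awsRegion: string;
  logLevel: string;
  ruleRunnerSchedule: string;
}

type Env = Record<string, string | undefined>;

// Assumption: variables with no documented default must be set explicitly.
function required(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`Missing required environment variable ${name}`);
  return value;
}

function loadConfig(env: Env): KhalifaConfig {
  const analysisDays = Number(env["ANALYSIS_DAYS"] ?? "90");
  if (!Number.isInteger(analysisDays) || analysisDays <= 0) {
    throw new Error("ANALYSIS_DAYS must be a positive integer");
  }
  return {
    neptuneEndpoint: required(env, "NEPTUNE_ENDPOINT"),
    neptuneAuthSecretArn: required(env, "NEPTUNE_AUTH_SECRET_ARN"),
    issuesTable: env["ISSUES_TABLE"] ?? "SecurityIssues",
    accessAnalyzerTable: env["ACCESS_ANALYZER_TABLE"] ?? "AccessAnalyzerCache",
    athenaDatabase: env["ATHENA_DATABASE"] ?? "khalifa_cloudtrail_db",
    athenaWorkgroup: env["ATHENA_WORKGROUP"] ?? "khalifa-cloudtrail-analysis",
    cloudTrailS3Location: env["CLOUDTRAIL_S3_LOCATION"] ?? "s3://cloudtrail-logs/AWSLogs/",
    analysisDays,
    awsRegion: env["AWS_REGION"] ?? "us-east-1",
    logLevel: env["LOG_LEVEL"] ?? "info",
    ruleRunnerSchedule: env["RULE_RUNNER_SCHEDULE"] ?? "0 */6 * * *", // every 6 hours
  };
}

// Example with a literal environment; a real service would pass process.env.
const exampleConfig = loadConfig({
  NEPTUNE_ENDPOINT: "neptune-cluster.us-east-1.amazonaws.com",
  NEPTUNE_AUTH_SECRET_ARN: "arn:aws:secretsmanager:us-east-1:123456789012:secret:example",
  ANALYSIS_DAYS: "30",
});
console.log(exampleConfig.issuesTable, exampleConfig.analysisDays);
```

Failing fast on a missing endpoint or a malformed lookback window keeps a misconfigured pod or Lambda from running a partial pass.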

Cross-account access is granted via the IAM role defined in templates/SecurityGraphCollectorRole.yaml. Deploy it once per member account with a unique external ID per deployment.
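The per-account assume step can be sketched as a function that builds the input for an STS AssumeRole call. The field names (`RoleArn`, `RoleSessionName`, `ExternalId`, `DurationSeconds`) are the STS API's own. The role name `SecurityGraphCollectorRole` is an assumption inferred from the template file name, and `buildAssumeRoleParams`, the session-name format, and the one-hour duration are hypothetical.

```typescript
// Shape of the STS AssumeRole input fields used here.
interface AssumeRoleParams {
  RoleArn: string;
  RoleSessionName: string;
  ExternalId: string;
  DurationSeconds: number;
}

// Assumption: the deployed role is named after the template file.
const ASSUMED_ROLE_NAME = "SecurityGraphCollectorRole";

function buildAssumeRoleParams(
  accountId: string,
  externalId: string,
  roleName: string = ASSUMED_ROLE_NAME,
): AssumeRoleParams {
  // AWS account IDs are always 12 digits.
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error(`Invalid AWS account ID: ${accountId}`);
  }
  if (externalId.length < 2) {
    throw new Error("External ID must be at least 2 characters");
  }
  return {
    RoleArn: `arn:aws:iam::${accountId}:role/${roleName}`,
    RoleSessionName: `khalifa-collector-${accountId}`, // hypothetical naming scheme
    ExternalId: externalId,
    DurationSeconds: 3600,
  };
}

const exampleParams = buildAssumeRoleParams("123456789012", "example-external-id");
console.log(exampleParams.RoleArn);
```

With the AWS SDK for JavaScript v3, an object of this shape could be passed to `AssumeRoleCommand` from `@aws-sdk/client-sts`. The external ID guards against the confused-deputy problem, which is why each deployment gets its own value.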

API reference

All routes except /health require a valid Cognito bearer JWT. RBAC roles are mapped from Cognito groups: khalifa-admin → Admin, khalifa-analyst → Analyst, khalifa-viewer → Viewer.
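The group-to-role mapping could be sketched as below. The group names come from the paragraph above and `cognito:groups` is the standard claim Cognito places in its tokens. The ordering Viewer < Analyst < Admin is an assumption suggested by the "Viewer+" notation in the endpoint tables, and `highestRole` and `isAllowed` are hypothetical helpers. Token signature verification is out of scope here.

```typescript
type Role = "Viewer" | "Analyst" | "Admin";

const GROUP_TO_ROLE: Record<string, Role> = {
  "khalifa-admin": "Admin",
  "khalifa-analyst": "Analyst",
  "khalifa-viewer": "Viewer",
};

// Assumption: roles are hierarchical, so "Viewer+" means Viewer or higher.
const RANK: Record<Role, number> = { Viewer: 1, Analyst: 2, Admin: 3 };

// Pick the most privileged role among the caller's Cognito groups.
function highestRole(groups: string[]): Role | undefined {
  let best: Role | undefined;
  for (const group of groups) {
    const role = GROUP_TO_ROLE[group];
    if (role && (best === undefined || RANK[role] > RANK[best])) {
      best = role;
    }
  }
  return best;
}

// Claims are the already-verified JWT payload.
function isAllowed(claims: { "cognito:groups"?: string[] }, minimum: Role): boolean {
  const role = highestRole(claims["cognito:groups"] ?? []);
  return role !== undefined && RANK[role] >= RANK[minimum];
}

console.log(isAllowed({ "cognito:groups": ["khalifa-analyst"] }, "Viewer"));
```

A caller with no recognized group is denied, so an unmapped Cognito group never grants access by accident.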

Issues & risk

Endpoint	Description
`GET /health`	Health check (unauthenticated)
`GET /issues`	List issues with filters (Viewer+)
`GET /issues/:id`	Get issue details with attack path (Viewer+)
`GET /issues/counts`	Get issue counts by severity (Viewer+)
`GET /issues/stats`	Get detailed statistics (Viewer+)
`GET /attack-paths?fromSelector=X&toSelector=Y`	Find attack paths (Viewer+)
`GET /resources/:arn`	Get resource with neighbors and issues (Viewer+)
`GET /resources/search?label=EC2Instance`	Search resources (Viewer+)

Compliance

Endpoint	Description
`GET /compliance/frameworks`	List available compliance frameworks (Viewer+)
`GET /compliance/frameworks/:framework`	Get framework overview with control summaries (Viewer+)
`GET /compliance/frameworks/:framework/controls`	List all controls for a framework (Viewer+)
`GET /compliance/frameworks/:framework/controls/:controlId`	Get control details with evidence (Viewer+)
`GET /compliance/frameworks/:framework/report`	Generate compliance report (Viewer+)
`GET /compliance/frameworks/:framework/drift`	Detect configuration drift since last evaluation (Viewer+)

CIEM / Identity

Endpoint	Description
`GET /identity/effective-permissions/:principal`	Get computed effective permissions for a principal
`GET /identity/escalation-paths`	List detected escalation paths with filters
`GET /identity/unused-permissions?principal=X&days=90`	Find unused permissions by comparing effective permissions against CloudTrail usage
`GET /identity/rightsizing/:principal?safetyMarginDays=7`	Generate least-privilege policy recommendation
`GET /identity/trust-graph?account=X`	Retrieve cross-account trust relationships as a graph
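The idea behind the unused-permissions endpoint can be illustrated as a set difference between granted actions and actions seen in CloudTrail within the lookback window. This is a sketch under assumptions: `UsageRecord` and `unusedPermissions` are hypothetical names, the effective permissions are assumed to be already expanded to concrete actions (no wildcards), and the real service may compute this differently.

```typescript
// One observed action and when CloudTrail last recorded it (hypothetical shape).
interface UsageRecord {
  action: string;
  lastUsed: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns granted actions with no recorded use inside the lookback window.
function unusedPermissions(
  effective: string[],
  usage: UsageRecord[],
  days: number,
  now: Date,
): string[] {
  const cutoff = now.getTime() - days * MS_PER_DAY;
  // IAM action names are case-insensitive, so compare in lower case.
  const recentlyUsed = new Set(
    usage
      .filter((record) => record.lastUsed.getTime() >= cutoff)
      .map((record) => record.action.toLowerCase()),
  );
  return effective
    .filter((action) => !recentlyUsed.has(action.toLowerCase()))
    .sort();
}

const now = new Date("2024-06-01T00:00:00Z");
const unused = unusedPermissions(
  ["s3:GetObject", "s3:PutObject", "iam:PassRole"],
  [
    { action: "s3:getobject", lastUsed: new Date("2024-05-22T00:00:00Z") },
    { action: "s3:PutObject", lastUsed: new Date("2023-11-01T00:00:00Z") },
  ],
  90,
  now,
);
console.log(unused); // [ 'iam:PassRole', 's3:PutObject' ]
```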

Query parameters for /issues

Parameter	Type	Description
`severity`	string[]	Filter by severity (critical, high, medium, low)
`team`	string[]	Filter by owning team
`status`	string[]	Filter by status (open, resolved, suppressed)
`ruleId`	string	Filter by rule ID
`limit`	number	Max results (default: 50, max: 1000)
`nextToken`	string	Pagination token
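A client might assemble the `/issues` query string from these parameters as sketched below. How the API expects array parameters to be encoded is not stated above; this sketch assumes repeated keys (`severity=critical&severity=high`). `IssueQuery` and `buildIssuesQuery` are hypothetical names.

```typescript
interface IssueQuery {
  severity?: Array<"critical" | "high" | "medium" | "low">;
  team?: string[];
  status?: Array<"open" | "resolved" | "suppressed">;
  ruleId?: string;
  limit?: number;
  nextToken?: string;
}

function buildIssuesQuery(query: IssueQuery): string {
  const parts: string[] = [];
  const add = (key: string, value: string): void => {
    parts.push(`${encodeURIComponent(key)}=${encodeURIComponent(value)}`);
  };

  // Assumption: array filters are sent as repeated keys.
  for (const value of query.severity ?? []) add("severity", value);
  for (const value of query.team ?? []) add("team", value);
  for (const value of query.status ?? []) add("status", value);
  if (query.ruleId) add("ruleId", query.ruleId);

  // Clamp to the documented range: default 50, maximum 1000.
  const limit = Math.min(Math.max(Math.trunc(query.limit ?? 50), 1), 1000);
  add("limit", String(limit));

  if (query.nextToken) add("nextToken", query.nextToken);
  return `/issues?${parts.join("&")}`;
}

console.log(buildIssuesQuery({ severity: ["critical", "high"], status: ["open"], limit: 5000 }));
// /issues?severity=critical&severity=high&status=open&limit=1000
```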

Examples

# Get critical issues
curl "https://api.example.com/issues?severity=critical&status=open&limit=100" \
  -H "Authorization: Bearer $TOKEN"

# Find attack paths
curl "https://api.example.com/attack-paths?fromSelector=Internet&toSelector=S3Bucket&maxPathLength=4" \
  -H "Authorization: Bearer $TOKEN"

# Get CIS compliance report
curl "https://api.example.com/compliance/frameworks/CIS_AWS_FOUNDATIONS/report" \
  -H "Authorization: Bearer $TOKEN"

# Get effective permissions for a role
curl "https://api.example.com/identity/effective-permissions/arn:aws:iam::123456789012:role/AdminRole" \
  -H "Authorization: Bearer $TOKEN"

# Get rightsizing recommendation
curl "https://api.example.com/identity/rightsizing/arn:aws:iam::123456789012:role/DataRole?safetyMarginDays=7&includeReadonlySafe=true" \
  -H "Authorization: Bearer $TOKEN"
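For result sets larger than one page, a client would follow `nextToken` until it is absent. The sketch below assumes the response body has the shape `{ items, nextToken }`, which is not confirmed by this README. `IssuePage`, `FetchPage`, and `collectAllIssues` are hypothetical names, and the page fetcher is injected so the loop can be exercised without a live API.

```typescript
// Assumed response shape; the real field names may differ.
interface IssuePage<T> {
  items: T[];
  nextToken?: string;
}

type FetchPage<T> = (nextToken?: string) => Promise<IssuePage<T>>;

// Follows nextToken until the API stops returning one, with a page cap
// so a misbehaving token cannot loop forever.
async function collectAllIssues<T>(fetchPage: FetchPage<T>, maxPages = 100): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined;
  for (let page = 0; page < maxPages; page++) {
    const result = await fetchPage(token);
    all.push(...result.items);
    token = result.nextToken;
    if (!token) return all;
  }
  throw new Error(`Stopped after ${maxPages} pages; narrow the filters`);
}

// Demo with an in-memory fake instead of a real HTTP call.
const fakeFetch: FetchPage<string> = async (token) =>
  token === undefined
    ? { items: ["issue-1", "issue-2"], nextToken: "p2" }
    : { items: ["issue-3"] };

collectAllIssues(fakeFetch).then((issues) => console.log(issues.length)); // logs 3
```

Against the real API, `fetchPage` would issue the `GET /issues` request with the `Authorization: Bearer` header shown in the curl examples and pass the token through as the `nextToken` query parameter.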

Different operating models

1. Single-account dev (like Quickstart Option A)

The Lambda stack fans out from a single delegated admin account, assumes into each member account via the collector role, and writes directly to a Neptune cluster in the same VPC. Step Functions parallelize the per-account work.

cdk deploy KhalifaStack

2. Multi-account org with EKS backend (like Quickstart Option B)

The EKS stack adds an API service, a UI, and a Kubernetes CronJob for the rule runner. The API is fronted by an ALB with Cognito OIDC, and the rule runner executes Gremlin traversals on the same Neptune cluster.

cdk deploy SecurityGraphEksStack
kubectl apply -f eks-manifests/

3. Multi-account org with read replicas

Run collectors in each region and replicate into a single Neptune cluster via Neptune Streams. Use this when accounts are concentrated in specific regions or you need to keep data residency boundaries.

Full deployment reference: ARCHITECTURE.md · OPERATIONAL.md

Compliance frameworks

Khalifa includes automated compliance evaluation against three industry-standard frameworks with 124 controls and 40+ automated evaluators that run Gremlin graph queries against your security data.

CIS AWS Foundations Benchmark v3.0 (78 controls)

Covers the foundational security configurations for AWS accounts:

Section	Controls	Focus
1. IAM	18	Root account, MFA, password policy, access keys
2. Logging	10	CloudTrail, Config, S3 logging
3. Monitoring	28	CloudWatch alarms, GuardDuty, Config rules
4. Networking	12	VPC flow logs, security groups, NACLs
5. Data Protection	10	Encryption, KMS rotation, backup

SOC 2 Type II (22 controls)

Maps to Trust Services Criteria:

Section	Controls	Focus
CC6	8	Logical access, authentication, authorization
CC7	6	Monitoring, incident response, change management
CC8	4	Risk mitigation, system boundaries
CC9	4	Additional criteria

ISO 27001:2022 (24 controls)

Based on Annex A controls:

Section	Controls	Focus
A.5	4	Organizational controls, policies
A.6	4	People controls, onboarding, training
A.8	6	Technological controls, encryption, logging
A.9	4	Physical and environmental security
A.12	3	Operations security, vulnerability management
A.13	3	Communications security, network controls

The compliance engine runs Gremlin evaluators that query the live graph, produce per-control evidence (pass/fail/manual), and write results to DynamoDB for the UI to render.
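Per-control results of this kind could be rolled up into a framework summary as sketched here. The `pass`/`fail`/`manual` statuses come from the sentence above. `ControlResult`, `FrameworkSummary`, and `summarize` are hypothetical names, the control IDs are placeholders, and excluding manual controls from the pass rate is an assumption about how a score might be reported.

```typescript
type ControlStatus = "pass" | "fail" | "manual";

interface ControlResult {
  controlId: string;
  status: ControlStatus;
}

interface FrameworkSummary {
  total: number;
  pass: number;
  fail: number;
  manual: number;
  // Share of automated controls that passed; null when none were automated.
  automatedPassRate: number | null;
}

function summarize(results: ControlResult[]): FrameworkSummary {
  const counts: Record<ControlStatus, number> = { pass: 0, fail: 0, manual: 0 };
  for (const result of results) {
    counts[result.status] += 1;
  }
  // Assumption: manual controls need human review, so they do not affect the rate.
  const automated = counts.pass + counts.fail;
  return {
    total: results.length,
    pass: counts.pass,
    fail: counts.fail,
    manual: counts.manual,
    automatedPassRate: automated === 0 ? null : counts.pass / automated,
  };
}

const summary = summarize([
  { controlId: "1.1", status: "pass" },
  { controlId: "1.2", status: "pass" },
  { controlId: "1.3", status: "fail" },
  { controlId: "2.1", status: "pass" },
  { controlId: "2.2", status: "manual" },
]);
console.log(summary.automatedPassRate); // 0.75
```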

3. Architecture

Khalifa flow diagram

Collector: runs in a delegated admin account, assumes into every member account via the cross-account role, and inventories 30+ AWS services per pass. Writes raw resource nodes to Neptune.

Policy Evaluator: resolves IAM identity + resource + boundary + SCP policies into net effective permissions per principal, and traverses cross-account trust edges up to 3 hops to surface escalation paths.

CloudTrail Analyzer: runs Athena queries against the CloudTrail S3 logs on a daily schedule, with a 90-day lookback window. Writes usage data to the AccessAnalyzerCache DynamoDB table for the rightsizer.

Risk Engine: runs Gremlin traversals against the live graph on a schedule, producing prioritized issues, attack paths, and compliance evaluations. Each rule ships with severity, scoring, and remediation guidance.

API Service: REST API fronted by ALB + Cognito OIDC. Exposes issues, attack paths, resources, compliance reports, and CIEM identity endpoints. RBAC enforced from Cognito groups.

UI: Next.js dashboard for issues, attack paths, and compliance. Renders control-level evidence, drift detection, and CSV export.
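The hop-bounded traversals mentioned above (trust edges up to 3 hops, attack paths with a maximum length) can be illustrated with an in-memory search. This is a stand-in for what a Gremlin traversal would express against Neptune, not the project's actual query. `findPaths` and the node names are hypothetical.

```typescript
// Adjacency list: node -> nodes reachable over one edge.
type Graph = Map<string, string[]>;

// Enumerates simple paths from `from` to `to` using at most `maxHops` edges.
function findPaths(graph: Graph, from: string, to: string, maxHops: number): string[][] {
  const paths: string[][] = [];
  const walk = (node: string, path: string[]): void => {
    if (node === to) {
      paths.push(path);
      return;
    }
    // path.length - 1 is the number of edges used so far.
    if (path.length - 1 >= maxHops) return;
    for (const next of graph.get(node) ?? []) {
      // Skip nodes already on the path so cycles cannot loop forever.
      if (!path.includes(next)) walk(next, [...path, next]);
    }
  };
  walk(from, [from]);
  return paths;
}

// Hypothetical graph: two routes from the internet to a bucket.
const graph: Graph = new Map([
  ["Internet", ["alb", "ec2-public"]],
  ["alb", ["ec2-app"]],
  ["ec2-app", ["role-app"]],
  ["ec2-public", ["role-data"]],
  ["role-app", ["bucket"]],
  ["role-data", ["bucket"]],
]);

console.log(findPaths(graph, "Internet", "bucket", 3));
// [ [ 'Internet', 'ec2-public', 'role-data', 'bucket' ] ]
```

Raising the bound to 4 also returns the longer route through `alb`. Capping the hop count keeps the search tractable on a large graph.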

Features

Agentless ingestion: no agents to install in member accounts; collectors assume via a single cross-account role defined in templates/SecurityGraphCollectorRole.yaml
Deterministic graph model: every resource, IAM statement, and finding becomes a typed node with explicit edges; Gremlin returns the same traversal for the same input every time
30+ AWS services collected: compute, storage, database, identity, network, security, logging, serverless, secrets, and backup
Risk + attack path + CIEM in one pass: rules, traversals, and effective-permission evaluation all run against the same live graph
CIEM with CloudTrail grounding: effective permissions are joined to actual usage from Athena over CloudTrail, with rightsizing recommendations and a configurable safety margin
Compliance built in: CIS v3.0, SOC 2 Type II, and ISO 27001:2022 evaluated by automated Gremlin queries with per-control evidence
Two deployment modes: Lambda + EventBridge for development, EKS + CronJob for production — same data model, same graph, same API
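The effective-permission resolution behind the CIEM features can be approximated as follows. This sketch is a deliberately simplified, same-account model: an explicit Deny in any layer wins, an SCP or permissions boundary (when present) must allow the action, and then an identity or resource policy must allow it. It ignores conditions, resource ARNs, `NotAction`, and session policies. All type and function names are hypothetical and this is not the Policy Evaluator's actual code.

```typescript
type Effect = "Allow" | "Deny";

interface Statement {
  effect: Effect;
  actions: string[]; // may contain IAM wildcards, e.g. "s3:Get*"
}

interface PolicyLayers {
  identity: Statement[];
  resource: Statement[];
  boundary?: Statement[]; // permissions boundary, if attached
  scp?: Statement[];      // service control policies, if any apply
}

// IAM-style match: "*" is any run of characters, "?" is one character,
// and action names are case-insensitive.
function matches(pattern: string, action: string): boolean {
  const escaped = pattern
    .replace(/[.+^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*")
    .replace(/\?/g, ".");
  return new RegExp(`^${escaped}$`, "i").test(action);
}

function layerHas(statements: Statement[], effect: Effect, action: string): boolean {
  return statements.some(
    (s) => s.effect === effect && s.actions.some((pattern) => matches(pattern, action)),
  );
}

function isActionAllowed(layers: PolicyLayers, action: string): boolean {
  const all = [layers.identity, layers.resource, layers.boundary ?? [], layers.scp ?? []];
  // 1. An explicit Deny anywhere overrides every Allow.
  if (all.some((layer) => layerHas(layer, "Deny", action))) return false;
  // 2. SCPs and boundaries only cap permissions; they must allow the action.
  if (layers.scp && !layerHas(layers.scp, "Allow", action)) return false;
  if (layers.boundary && !layerHas(layers.boundary, "Allow", action)) return false;
  // 3. Something must actually grant the action.
  return layerHas(layers.identity, "Allow", action) || layerHas(layers.resource, "Allow", action);
}

const layers: PolicyLayers = {
  identity: [{ effect: "Allow", actions: ["s3:*"] }],
  resource: [],
  boundary: [{ effect: "Allow", actions: ["s3:Get*"] }],
  scp: [
    { effect: "Allow", actions: ["*"] },
    { effect: "Deny", actions: ["s3:GetBucketAcl"] },
  ],
};

console.log(isActionAllowed(layers, "s3:GetObject"));    // true
console.log(isActionAllowed(layers, "s3:PutObject"));    // false (outside boundary)
console.log(isActionAllowed(layers, "s3:GetBucketAcl")); // false (explicit deny)
```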

4. Repo structure

Infrastructure

cdk/

CDK stacks: KhalifaStack (Lambda + EventBridge) and SecurityGraphEksStack (EKS + ALB + Cognito)

templates/

Cross-account IAM role template deployed once per member account

Collectors

lambdas/list-accounts

Lists org accounts from AWS Organizations

lambdas/collector

Per-account collector: 30+ AWS services + enhanced IAM decomposition

lambdas/graph-writer

Neptune writer for raw resource nodes

lambdas/incremental-collector

Event-driven updates via EventBridge → SQS

lambdas/policy-evaluator

CIEM engine: effective permissions, escalation paths, rightsizing

lambdas/cloudtrail-analyzer

Athena queries over CloudTrail S3 logs → DynamoDB cache

Engine

packages/risk-engine

Risk rules, attack-path traversals, scoring, compliance evaluators

Service

api-service/

REST API (Express) — issues, attack paths, resources, compliance, identity

ui/

Next.js dashboard — issues, attack paths, compliance

Deploy

eks-manifests/

Kubernetes manifests for API service, rule runner CronJob, HPA, NetworkPolicy

Docs

ARCHITECTURE.md

System architecture, data model, ingestion topology

OPERATIONAL.md

Runbooks for the rule runner, Neptune, IRSA, and incident response

CONTRIBUTING.md

Local development, workspaces, CI conventions

CHANGELOG.md

Release history

License

BSD 3-Clause. See LICENSE.
