
feat: Add AWS CloudTrail Ingestion Source (#547) #561

Open
R1sh0bh-1 wants to merge 3 commits into certego:develop from R1sh0bh-1:feat/cloudtrail-ingestion

@R1sh0bh-1
Contributor

feat: Add AWS CloudTrail Ingestion Source (#547)

Summary

This PR implements ingestion support for AWS CloudTrail logs stored in S3, enabling detection of impossible travel and anomalous logins in AWS environments. It processes successful events like ConsoleLogin and AssumeRole, extracts usernames from direct userName and sessionContext.sessionIssuer, enriches with GeoIP, and normalizes fields for the existing pipeline.

Closes #547.

Key Changes

  • New CloudTrailIngestion class (buffalogs/impossible_travel/ingestion/cloudtrail_ingestion.py):
    • Day-by-day S3 key listing with pagination (get_log_files).
    • Gzipped JSON parsing and successful event filtering (extract_logins, is_login_event).
    • Username extraction supporting IAM users and assumed roles (parse_login).
    • GeoIP enrichment for country/lat/lon (skips internal/local IPs).
    • Field normalization compatible with BuffaLogs format.
  • Factory & Base Updates:
    • Added CLOUDTRAIL to SupportedIngestionSources enum (base_ingestion.py).
    • Updated IngestionFactory to instantiate CloudTrailIngestion (ingestion_factory.py).
  • Configuration:
    • Added "cloudtrail" section to config/buffalogs/ingestion.json with placeholders for bucket, region, credentials, and GeoLite2 path.
  • Dependencies:
    • Added boto3>=1.35.0 and geoip2>=4.8.0 to django-buffalogs/setup.cfg.
  • Testing:
    • Unit tests (buffalogs/impossible_travel/tests/ingestion/test_cloudtrail_ingestion.py): 4 tests passing for S3 pagination, login parsing, IP skipping, and geo failure handling.
    • Mocked S3 integration script (test_cloudtrail_integration.py): Verifies end-to-end flow (fetch, parse, username extraction, user/logins return) — runs successfully locally.
  • Documentation:
    • New docs/ingestion/cloudtrail.md with setup, config, testing, troubleshooting, and future ideas.
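
For reviewers less familiar with the CloudTrail record layout, here is a minimal sketch of the parsing and filtering steps listed above. It is not the code in this PR: the helper names (`is_successful_login`, `extract_username`, `parse_log_file`) and the output keys are illustrative assumptions. It relies only on the documented CloudTrail format, where each gzipped file holds a top-level `Records` array.

```python
import gzip
import json

# Event names treated as logins in this sketch (assumption: mirrors the PR summary).
LOGIN_EVENT_NAMES = {"ConsoleLogin", "AssumeRole"}


def is_successful_login(event):
    """Return True for successful ConsoleLogin / AssumeRole records."""
    if event.get("eventName") not in LOGIN_EVENT_NAMES:
        return False
    # Failed API calls carry an errorCode field.
    if event.get("errorCode"):
        return False
    if event["eventName"] == "ConsoleLogin":
        # ConsoleLogin reports its outcome in responseElements.
        response = event.get("responseElements") or {}
        return response.get("ConsoleLogin") == "Success"
    return True


def extract_username(event):
    """Prefer the direct userName, then fall back to the session issuer."""
    identity = event.get("userIdentity") or {}
    if identity.get("userName"):
        return identity["userName"]
    issuer = (identity.get("sessionContext") or {}).get("sessionIssuer") or {}
    return issuer.get("userName")


def parse_log_file(raw_bytes):
    """Decompress one CloudTrail log object and return normalized logins."""
    records = json.loads(gzip.decompress(raw_bytes)).get("Records", [])
    return [
        {
            "username": extract_username(event),
            "ip": event.get("sourceIPAddress"),
            "timestamp": event.get("eventTime"),
            "event": event["eventName"],
        }
        for event in records
        if is_successful_login(event)
    ]
```

The fallback to `sessionContext.sessionIssuer.userName` is what lets assumed-role sessions map back to a role name when no direct `userName` is present.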

How to Test

  1. Local Mock (No AWS needed):
    • Run python test_cloudtrail_integration.py → Confirms S3 fetch, parsing, and username extraction work (output: Users found: ['alice', 'bob']; Alice logins: [1 login]).
  2. Real AWS:
    • Download GeoLite2-City.mmdb from MaxMind and mount it (./GeoLite2-City.mmdb:/etc/buffalogs/GeoLite2-City.mmdb:ro in docker-compose.override.yml).
    • Configure real bucket/credentials in ingestion.json.
    • Set "active_ingestion": "cloudtrail".
    • Restart services: docker compose down && docker compose up -d.
    • Trigger analysis: python manage.py impossible_travel.
    • Check logs for "Processing users..." and UI for ingested logins/anomalies.
  3. Unit Tests:
    • python manage.py test impossible_travel.tests.ingestion -v 2 → All 4 tests pass.
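
To illustrate how the local mock can exercise S3 pagination without AWS access, here is a rough sketch. It is not the contents of test_cloudtrail_integration.py: `daily_prefixes`, `list_log_keys`, and `make_fake_s3` are hypothetical names, and the account ID and region arguments are assumptions. The only real API it relies on is boto3's `get_paginator("list_objects_v2")` / `paginate(Bucket=..., Prefix=...)`, which is stubbed here with `unittest.mock`.

```python
from datetime import date, timedelta
from unittest.mock import MagicMock


def daily_prefixes(account_id, region, start, end):
    """Yield one CloudTrail key prefix per day, inclusive of both ends."""
    day = start
    while day <= end:
        # Standard CloudTrail layout: AWSLogs/<account>/CloudTrail/<region>/YYYY/MM/DD/
        yield f"AWSLogs/{account_id}/CloudTrail/{region}/{day:%Y/%m/%d}/"
        day += timedelta(days=1)


def list_log_keys(s3_client, bucket, account_id, region, start, end):
    """Collect every .json.gz key for the date range, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for prefix in daily_prefixes(account_id, region, start, end):
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            # Pages for empty prefixes have no "Contents" entry.
            for obj in page.get("Contents", []):
                if obj["Key"].endswith(".json.gz"):
                    keys.append(obj["Key"])
    return keys


def make_fake_s3(pages_by_prefix):
    """Build a stub client whose paginator replays canned pages per prefix."""
    client = MagicMock()
    paginator = client.get_paginator.return_value
    paginator.paginate.side_effect = lambda Bucket, Prefix: iter(
        pages_by_prefix.get(Prefix, [{}])
    )
    return client
```

Passing the client in as an argument keeps the listing logic testable with a stub and lets the real boto3 client be supplied only in production code.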

Verification

  • Local Testing: Unit tests pass (4/4). Mocked integration test succeeds with alice/bob users and alice's login returned.
  • Edge Cases: Verified IP skipping, no geo (warning logged, events skipped), and AssumeRole username extraction.
  • Linters: Passed black, flake8, isort on modified files.
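
As a point of reference for the IP-skipping edge case, this is one way such a check could look using only the standard library. It is a sketch, not the check implemented in this PR, and `should_skip_ip` is a hypothetical name. It also treats non-IP values as skippable, since CloudTrail's `sourceIPAddress` can hold an AWS service hostname instead of an address.

```python
import ipaddress


def should_skip_ip(value):
    """Return True when the source address cannot be geolocated meaningfully."""
    if not value:
        return True
    try:
        ip = ipaddress.ip_address(value)
    except ValueError:
        # Not an IP literal, e.g. a service name such as "signin.amazonaws.com".
        return True
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_unspecified
    )
```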

Notes

  • Geo enrichment is best-effort: events with no resolvable country are skipped (this can be made configurable if needed).
  • Focuses on management/identity events (no data events).
  • Future: CloudTrail Lake support or AS/ISP enrichment (ideas in docs).
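
The skip-on-missing-geo behaviour in the first note can be pictured roughly as follows. This is a sketch under assumptions, not the PR's code: `enrich_with_geo`, the `country`/`lat`/`lon` output keys, and the `FakeReader` stand-in are all hypothetical. The reader is expected to expose a `city(ip)` method returning an object with `country.name`, `location.latitude`, and `location.longitude`, as `geoip2.database.Reader` does. Lookup errors are caught broadly here; real code would more likely catch `geoip2.errors.AddressNotFoundError`.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def enrich_with_geo(login, reader):
    """Return the login with geo fields added, or None if it should be skipped."""
    ip = login["ip"]
    try:
        response = reader.city(ip)
    except Exception as exc:  # broad on purpose in this sketch
        logger.warning("GeoIP lookup failed for %s: %s", ip, exc)
        return None
    if not response.country.name:
        logger.warning("No country found for %s, skipping event", ip)
        return None
    return {
        **login,
        "country": response.country.name,
        "lat": response.location.latitude,
        "lon": response.location.longitude,
    }


class FakeReader:
    """Test double mimicking the small part of the reader used above."""

    def __init__(self, table):
        self._table = table  # maps ip -> (country, lat, lon)

    def city(self, ip):
        if ip not in self._table:
            raise LookupError(ip)
        country, lat, lon = self._table[ip]
        return SimpleNamespace(
            country=SimpleNamespace(name=country),
            location=SimpleNamespace(latitude=lat, longitude=lon),
        )
```

Returning `None` for unresolved addresses lets the caller drop the event with a simple filter, which is also where a configuration switch could later be added.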

Ready for review and merge. Thanks for the feedback loop — happy to iterate!

@R1sh0bh-1
Contributor Author

Hi @Lorygold,

I've added boto3>=1.35.0 and geoip2>=4.8.0 to django-buffalogs/setup.cfg, but CI is failing with ModuleNotFoundError: No module named 'boto3'. Locally it works after rebuilding the image, so it seems CI isn't picking up the setup.cfg changes.

Could you please advise on how to make CI install the new deps? Happy to add a fix if needed. Thanks!

@Lorygold
Contributor

Hi @R1sh0bh-1, for the CI errors, try adding the libraries to the requirements_opt.txt file and removing them from the setup.cfg file.


Development

Successfully merging this pull request may close these issues.

[FEATURE]: Integrating AWS CloudTrail ingestion source
