New use case - Media Asset Workbench #171

Open
cod-all wants to merge 2 commits into aws-samples:master from cod-all:master

cod-all (Contributor) commented Apr 16, 2026

Summary

  • Adds a full media asset processing pipeline demo built on Amazon S3 Files, Amazon DocumentDB, and EC2
  • Deploys a CloudFormation stack (VPC, DocumentDB cluster, EC2 worker, S3 bucket, Secrets Manager, VPC endpoints) via deploy.sh
  • Provides a local FastAPI UI that connects to DocumentDB through an SSM Session Manager port-forward tunnel, so no VPN, bastion host, or open inbound ports are required
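
For reviewers unfamiliar with SSM port forwarding, here is a minimal sketch of how such a tunnel command could be assembled. The PR does not show the exact invocation, so everything here is an assumption: the helper name `build_port_forward_command`, the use of the `AWS-StartPortForwardingSessionToRemoteHost` document, port 27017 on both ends, and the placeholder instance ID and host.

```python
import json
import shlex


def build_port_forward_command(instance_id, remote_host,
                               remote_port=27017, local_port=27017):
    """Return argv for an SSM port-forward through an EC2 instance.

    Hypothetical helper: the instance acts as the relay, and traffic to
    localhost:<local_port> is forwarded to <remote_host>:<remote_port>.
    """
    # SSM document parameters are passed as lists of strings.
    parameters = {
        "host": [remote_host],
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSessionToRemoteHost",
        "--parameters", json.dumps(parameters),
    ]


if __name__ == "__main__":
    # Placeholder values; real ones would come from the stack outputs.
    cmd = build_port_forward_command("i-0123456789abcdef0",
                                     "docdb.example.internal")
    print(shlex.join(cmd))
```

The session is opened from the local machine outward through the SSM agent on the worker, which is why the security groups need no inbound rule for the database port.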

What's included

Infrastructure (infrastructure/cloudformation.yaml)

  • VPC with two public subnets, Internet Gateway, S3 Gateway VPC endpoint
  • Amazon DocumentDB 8.0 cluster (db.t3.medium, storage-encrypted, TLS enforced), credentials auto-generated in Secrets Manager
  • EC2 worker (t3.small, Amazon Linux 2023, encrypted EBS) with IAM role scoped to least-privilege S3 and Secrets Manager actions
  • Separate security groups for worker, DocumentDB, and S3 Files NFS mount target
  • S3 bucket with versioning, public-access block, CORS restricted to localhost:8080, and a 7-day lifecycle on uploads/

Worker (worker/)

  • Polls DocumentDB for pending jobs, claims them atomically with find_one_and_update
  • Walks the S3 Files POSIX mount (/mnt/assets) using os.walk(), with zero S3 SDK calls in the processing hot path
  • Image processor: PIL-based EXIF extraction, thumbnail generation, auto-tagging (resolution, color mode)
  • Video processor: ffprobe metadata extraction, ffmpeg thumbnail generation, auto-tagging (codec, fps, duration)
  • Stale-job recovery: resets any running job older than STALE_JOB_TIMEOUT (default 300 s) back to pending
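
The claim and recovery steps above can be sketched roughly as follows. This is not the worker's actual code: the field names `worker_id`, `started_at`, and `created_at` and both function names are assumptions, and `jobs` stands for a pymongo `Collection` passed in by the caller. Only the `pending`/`running` states and the 300 s default come from the description.

```python
from datetime import datetime, timedelta, timezone

STALE_JOB_TIMEOUT = 300  # seconds; the default named in the description


def claim_next_job(jobs, worker_id, now=None):
    """Atomically move one pending job to running; return it, or None.

    find_one_and_update matches and modifies a single document in one
    server-side operation, so two workers polling at the same time
    cannot both claim the same job.
    """
    now = now or datetime.now(timezone.utc)
    # With pymongo's default return_document setting, the returned
    # document is the pre-update version (status still "pending").
    return jobs.find_one_and_update(
        {"status": "pending"},
        {"$set": {"status": "running",
                  "worker_id": worker_id,     # assumed field
                  "started_at": now}},        # assumed field
        sort=[("created_at", 1)],             # oldest first; assumed field
    )


def reset_stale_jobs(jobs, now=None, timeout=STALE_JOB_TIMEOUT):
    """Return running jobs older than `timeout` seconds to pending."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=timeout)
    result = jobs.update_many(
        {"status": "running", "started_at": {"$lt": cutoff}},
        {"$set": {"status": "pending"},
         "$unset": {"worker_id": "", "started_at": ""}},
    )
    return result.modified_count
```

A worker loop would typically call `reset_stale_jobs` periodically and `claim_next_job` on each poll, sleeping when the latter returns `None`. Because a reset job can be picked up again, processing should be safe to repeat.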

UI (ui/)

  • FastAPI serves REST API + static SPA from a single process on localhost:8080
  • Live-polling asset grid with type/status filters and DocumentDB query display
  • Asset detail panel: thumbnail preview (presigned S3 URLs, 1-hour expiry), metadata, auto-tags, raw DocumentDB document, S3 Files mount path view
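
As an illustration of the presigned thumbnail URLs mentioned above, a minimal sketch follows. The function name `thumbnail_url` and the example key are hypothetical; `s3_client` is assumed to be a boto3 S3 client created by the caller, and the 3600 s value mirrors the 1-hour expiry in the description.

```python
THUMBNAIL_URL_TTL = 3600  # seconds; the 1-hour expiry described above


def thumbnail_url(s3_client, bucket, key, expires_in=THUMBNAIL_URL_TTL):
    """Return a time-limited GET URL for an object in a private bucket.

    The URL is signed locally with the caller's credentials, so the
    bucket's public-access block can stay fully enabled.
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

With boto3 installed, a call might look like `thumbnail_url(boto3.client("s3"), bucket_name, "thumbnails/example.jpg")`, where the bucket name comes from the stack outputs and the key is a placeholder.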

Test plan

  • ./deploy.sh completes without error; config.env is populated with all stack outputs
  • ./generate-sample-data.sh + aws s3 sync uploads sample packs to the bucket
  • SSH + userdata.sh mounts S3 Files at /mnt/assets; df -h shows 8.0E size and sample-packs/ is visible
  • SSM port-forward tunnel opens successfully (aws ssm start-session ...)
  • uvicorn app:app --reload --port 8080 starts and connects to DocumentDB
  • Loading a sample pack in the UI triggers worker processing; assets appear live in the grid
  • Image assets show EXIF/dimensions; video assets show codec/fps/duration in the detail panel
  • Thumbnails load via presigned URLs; bucket remains fully private
  • ./cleanup.sh tears down all resources cleanly

cod-all and others added 2 commits April 16, 2026 12:20
tmcallaghan self-assigned this Apr 17, 2026

2 participants