A friendly Lambda benchmark. The harness pre-fills an S3 bucket with 3 000 random objects (~15 GB total) and exposes a Step Function that, for every registered contender Lambda, measures how long it takes to read every object and produce a single valid ZIP archive — in one Lambda invocation, within strict resource limits.
You bring the Lambda; the harness times it, validates its archive byte-for-byte, and ranks it against the others.
This repository is the companion code to a blog post on streaming S3 archive creation in Rust on Lambda. The reference implementation is mine.
On stack creation, a custom resource (fill-bucket) populates an S3 bucket
with 3 000 objects under the files/ prefix. Each object has a uniformly
random body sized from N(5 MB, 1 MB) clamped to [2 MB, 8 MB], and its
S3 key is the SHA256 hex of its content.
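The generation rule above can be sketched in a few lines. This is only an illustration, not the fill-bucket code: the function name make_test_object is hypothetical, 1 MB is read as the standard deviation, and "MB" is assumed to mean MiB.

```python
import hashlib
import os
import random

MB = 1024 * 1024  # assumption: MiB; the harness may use 10**6 instead


def make_test_object(rng, mean=5 * MB, stddev=1 * MB, lo=2 * MB, hi=8 * MB):
    """Return (key, body) for one synthetic source object."""
    # Draw the size from a normal distribution, then clamp it to [lo, hi].
    size = int(rng.gauss(mean, stddev))
    size = max(lo, min(hi, size))
    body = os.urandom(size)  # uniformly random bytes
    # The key is the SHA256 hex digest of the content, under the files/ prefix.
    key = "files/" + hashlib.sha256(body).hexdigest()
    return key, body
```

Because the key is a digest of the body, any reader can verify an object (or an archive entry) without a separate manifest.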
A Step Function takes a list of contender Lambda ARNs as input and, for each one in parallel:
- Derives the per-contender archive key base from the function name.
- Invokes the contender Lambda RunsPerContender times (default 10) in parallel, each run writing to its own distinct key (archives/<lang>-<dev_id>-<runIndex>.zip), timing each invocation independently.
- If any run fails (crash or timeout), the contender is reported as failed.
- Otherwise invokes the internal control Lambda on run-0's archive, which streams the produced ZIP back from S3 and validates it (flat layout, one entry per source object, entry name equals SHA256 of decompressed content).
- Deletes all per-run archives.
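To make the key derivation concrete, here is a hedged sketch of how per-run archive keys could be computed from a contender ARN. The real logic lives in the state machine definition; the function archive_keys, the zero-based run index (suggested by "run-0"), and the sample ARN in the usage are assumptions.

```python
def archive_keys(function_arn, project_name, runs_per_contender=10):
    """Return one archive key per run for a contender Lambda."""
    # The function name is the last ':'-separated field of an unqualified ARN.
    function_name = function_arn.rsplit(":", 1)[-1]
    prefix = project_name + "-"
    if not function_name.startswith(prefix):
        raise ValueError(f"{function_name!r} does not start with {prefix!r}")
    # Strip the project name to get "<lang>-<dev_id>", the archive key base.
    base = "archives/" + function_name[len(prefix):]
    return [f"{base}-{run_index}.zip" for run_index in range(runs_per_contender)]
```

For example, archive_keys("arn:aws:lambda:eu-west-1:123456789012:function:demo-s3-archiving-rust-jrodon", "demo-s3-archiving") starts with archives/rust-jrodon-0.zip.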
The reported duration_ms is the mean across the RunsPerContender runs.
duration_ms_stddev, duration_ms_min, and duration_ms_max capture the
variability. All pricing fields are derived from the mean duration.
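The aggregation can be illustrated as follows. This is a sketch under assumptions: the README does not say whether the standard deviation is the population or sample variant, nor how values are rounded; population stddev and rounding to whole milliseconds are used here, and summarize is a hypothetical name.

```python
import statistics


def summarize(durations_ms):
    """Aggregate per-run durations into the reported fields."""
    return {
        "runs_count": len(durations_ms),
        "duration_ms": round(statistics.fmean(durations_ms)),
        # Assumption: population standard deviation over all runs.
        "duration_ms_stddev": round(statistics.pstdev(durations_ms)),
        "duration_ms_min": min(durations_ms),
        "duration_ms_max": max(durations_ms),
    }
```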
The execution output is a single JSON document with two lists:
{
"success": [
{
"arn": "arn:aws:lambda:...:function:demo-s3-archiving-rust-jrodon",
"runtime": "provided.al2023",
"architecture": "arm64",
"memory_mb": 512,
"ephemeral_storage_mb": 512,
"runs_count": 10,
"duration_ms": 212631,
"duration_ms_stddev": 3241,
"duration_ms_min": 207834,
"duration_ms_max": 219102,
"gb_second_compute": 106.3155,
"gb_second_storage": 0,
"compute_rate_usd": 0.0000133334,
"storage_rate_usd": 0.0000000309,
"compute_price_usd": 0.001417537,
"storage_price_usd": 0,
"run_price_usd": 0.001417537
}
],
"failure": [
{ "arn": "arn:aws:lambda:...:function:demo-s3-archiving-python-someone", "reason": "invalid: content hash mismatch for '...': computed ..." }
]
}
Ranking is by run_price_usd ascending, i.e. by the real Lambda invocation cost,
accounting for architecture (arm64 vs x86_64) and ephemeral storage above the
512 MB free tier. Pricing constants are hardcoded in the LoadPricing state
of templates/benching.asl.json.
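The pricing fields in the sample output are consistent with the arithmetic below. It is a minimal sketch, not the LoadPricing implementation: it assumes MB are converted to GB by dividing by 1024, that only ephemeral storage above 512 MB is billed, and that per-request charges are ignored. The name run_price is hypothetical; the rates are passed in rather than hardcoded.

```python
FREE_EPHEMERAL_MB = 512  # ephemeral storage included at no extra cost


def run_price(memory_mb, ephemeral_storage_mb, duration_ms,
              compute_rate_usd, storage_rate_usd):
    """Compute the GB-second and price fields for one contender."""
    seconds = duration_ms / 1000
    gb_second_compute = memory_mb / 1024 * seconds
    # Only the storage configured above the free 512 MB is billable.
    billable_mb = max(0, ephemeral_storage_mb - FREE_EPHEMERAL_MB)
    gb_second_storage = billable_mb / 1024 * seconds
    compute_price = gb_second_compute * compute_rate_usd
    storage_price = gb_second_storage * storage_rate_usd
    return {
        "gb_second_compute": gb_second_compute,
        "gb_second_storage": gb_second_storage,
        "compute_price_usd": compute_price,
        "storage_price_usd": storage_price,
        "run_price_usd": compute_price + storage_price,
    }
```

Feeding it the sample values (512 MB memory, 512 MB storage, 212631 ms, and the two rates shown in the output) gives 106.3155 GB-seconds and a price that matches the sample to within rounding.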
The number and average size of test objects (TestFileCount, TestFileSize)
and the number of parallel runs per contender (RunsPerContender) are
CloudFormation parameters of the benching stack — override them on stack
update if you want to play with the harness, but please don't include that in
your PRs.
Two stacks are involved: a CI/CD stack (<ProjectName>-ci, deployed manually
once) and the actual benchmark stack (<ProjectName>-root, deployed by the
CI pipeline). You only ever deploy the first one yourself.
- An AWS account with permission to create CloudFormation, CodePipeline, CodeBuild, IAM, S3, Lambda, Step Functions and CloudWatch Logs resources.
- A GitHub account.
Fork this repository on GitHub. Note the resulting ID <your-username>/demo-s3-archiving;
you will need it (case-sensitive) below.
If you already have a GitHub connection, reuse it and skip to step 3.
- Open the CodePipeline console > Settings > Connections, choose GitHub as provider, name the connection, click Connect to GitHub.
- Authorize AWS to act on your GitHub account, pick the AWS-created GitHub App, click Connect.
- Copy the connection ARN.
Make sure your AWS console is set to the region you want to use throughout — the same region must be used for every subsequent step.
aws cloudformation create-stack \
--stack-name demo-s3-archiving-ci \
--template-body file://ci-template.yml \
--parameters \
ParameterKey=ProjectName,ParameterValue=demo-s3-archiving \
ParameterKey=CodeStarConnectionArn,ParameterValue=YOUR_CONNECTION_ARN \
ParameterKey=ForkedRepoId,ParameterValue=YOUR_USERNAME/demo-s3-archiving \
--capabilities CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND
Or, via the console: create a new stack from ci-template.yml, fill the
parameters above, acknowledge IAM resource creation, submit.
The pipeline kicks off automatically. It builds every Rust Lambda, packages
the templates, then deploys the demo-s3-archiving-root stack which creates
the bucket, runs fill-bucket, and exposes the Step Function. First run takes
~10–20 minutes (cold cargo build); subsequent updates take a couple of minutes
thanks to incremental caching.
Once the root stack is CREATE_COMPLETE, fetch its outputs and start an
execution:
CONTENDERS=$(aws cloudformation describe-stacks --stack-name demo-s3-archiving-root \
--query 'Stacks[0].Outputs[?OutputKey==`ContenderArns`].OutputValue' --output text)
SM=$(aws cloudformation describe-stacks --stack-name demo-s3-archiving-root \
--query 'Stacks[0].Outputs[?OutputKey==`BenchingStateMachineArn`].OutputValue' --output text)
aws stepfunctions start-execution \
--state-machine-arn "$SM" \
--input "$CONTENDERS"
ContenderArns is already a JSON object of the shape the state machine expects
({"contenders":[...]}), so it can be passed as-is. Watch the execution in
the Step Functions console and read the ranked output from its final state.
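Once you have copied the execution output, a short script can print it as a readable ranking. This is a sketch that relies only on the fields shown in the sample document above; the function name ranking is hypothetical, and the sort is defensive since the output is described as already ranked.

```python
import json


def ranking(output_json):
    """Turn the execution output into human-readable lines."""
    doc = json.loads(output_json)
    # Cheapest run first, matching the harness ranking rule.
    ranked = sorted(doc["success"], key=lambda c: c["run_price_usd"])
    lines = []
    for position, contender in enumerate(ranked, start=1):
        name = contender["arn"].rsplit(":", 1)[-1]
        lines.append(
            f"{position}. {name}  {contender['run_price_usd']:.9f} USD  "
            f"{contender['duration_ms']} ms"
        )
    for failed in doc["failure"]:
        name = failed["arn"].rsplit(":", 1)[-1]
        lines.append(f"FAILED {name}: {failed['reason']}")
    return lines
```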
The order matters — the root stack uses an IAM role created by the CI stack:
- Delete demo-s3-archiving-root first. Wait for DELETE_COMPLETE.
- Then delete demo-s3-archiving-ci.
Deleting them in parallel will fail and is annoying to unwind.
The repository ships with one reference contender:
contenders/rust/jrodon/. It is also
the copy-paste template for new ones in Rust. If you want to beat it (or just write
one in another language), the contract below is everything you need to know.
Event — your handler is invoked with one JSON object:
{
"bucket_name": "<project>-<account>-<region>",
"files_prefix": "files",
"archive_key": "archives/<lang>-<dev_id>-<runIndex>.zip"
}
Read all three fields from the event. The benching Step Function
injects the right values for every invocation (see templates/benching.asl.json).
| Field | Meaning |
|---|---|
| bucket_name | S3 bucket holding both the source objects and your output archive |
| files_prefix | Key prefix of the source objects, no trailing slash (default files) |
| archive_key | Destination key your produced ZIP must be uploaded to |
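For a Python contender, reading the event could look like the sketch below. The helper parse_event and the keys of the returned dictionary are illustrative, not part of the contract; the S3 work itself is left out because it needs an S3 client.

```python
REQUIRED_FIELDS = ("bucket_name", "files_prefix", "archive_key")


def parse_event(event):
    """Extract the three contract fields and normalise the listing prefix."""
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    if missing:
        raise KeyError(f"event is missing field(s): {missing}")
    return {
        "bucket": event["bucket_name"],
        # files_prefix has no trailing slash, so add one for listing.
        "list_prefix": event["files_prefix"].rstrip("/") + "/",
        "archive_key": event["archive_key"],
    }


def handler(event, context):
    job = parse_event(event)
    # Listing, zipping and uploading would go here (omitted in this sketch).
    return job
```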
What your Lambda must do:
- List and read every object under s3://${bucket_name}/${files_prefix}/.
- Produce a ZIP archive and upload it to s3://${bucket_name}/${archive_key}.
Archive constraints — the control Lambda rejects anything else:
- Flat layout: no / in any entry name.
- Exactly one entry per source object, no duplicates, no extras.
- Entry name == source object's S3 key basename == SHA256 hex of decompressed content.
- Bit-exact content (the control Lambda re-hashes and compares).
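As an illustration of an archive that satisfies these constraints, the sketch below builds one in memory with Python's zipfile module. It is deliberately naive: a real contender has to stream, since roughly 15 GB will not fit in Lambda memory. The function name build_archive is hypothetical, and storing entries uncompressed is a choice made here (random bytes do not compress), not a requirement of the contract.

```python
import hashlib
import io
import zipfile


def build_archive(objects):
    """objects: dict mapping S3 key -> bytes. Returns the ZIP as bytes."""
    buffer = io.BytesIO()
    # ZIP_STORED: no compression, which costs no CPU on incompressible data.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for key, body in objects.items():
            # Flat layout: keep only the basename of the S3 key.
            entry_name = key.rsplit("/", 1)[-1]
            archive.writestr(entry_name, body)
    return buffer.getvalue()
```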
Failure modes are surfaced verbatim in the state machine output:
| Cause | Reported reason |
|---|---|
| Lambda crashed | crash: <Error / Cause> |
| Lambda timed out | timeout: <Error / Cause> |
| Nested path in archive | invalid: archive contains nested path '...', flat layout required |
| Hash mismatch | invalid: content hash mismatch for '...': computed ... |
| Unknown or duplicate entry | invalid: unknown or duplicate object in archive: '...' |
| Missing source objects | invalid: archive missing N expected object(s) (sample: [...]) |
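To check an archive locally before submitting, the validation rules can be approximated as below. This is not the control Lambda (which streams the archive from S3); it is an in-memory sketch whose reason strings mimic the table above. The order of the checks and the name validate_archive are assumptions.

```python
import hashlib
import io
import zipfile


def validate_archive(zip_bytes, expected_names):
    """Return None if the archive is valid, else a reason string."""
    remaining = set(expected_names)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            name = info.filename
            if "/" in name:
                return (f"invalid: archive contains nested path '{name}', "
                        "flat layout required")
            if name not in remaining:
                return f"invalid: unknown or duplicate object in archive: '{name}'"
            # Re-hash the decompressed content in 1 MiB chunks.
            digest = hashlib.sha256()
            with archive.open(info) as entry:
                for chunk in iter(lambda: entry.read(1 << 20), b""):
                    digest.update(chunk)
            computed = digest.hexdigest()
            if computed != name:
                return (f"invalid: content hash mismatch for '{name}': "
                        f"computed {computed}")
            remaining.discard(name)
    if remaining:
        sample = sorted(remaining)[:3]
        return (f"invalid: archive missing {len(remaining)} expected object(s) "
                f"(sample: {sample})")
    return None
```

Here expected_names is the set of source key basenames, for instance collected from a listing of the files/ prefix.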
IAM — every contender shares the LambdaContenderRole defined in
templates/contenders.yml. It grants:
- s3:GetObject and s3:ListBucket on the source bucket, scoped to the configured files prefix.
- s3:PutObject, s3:AbortMultipartUpload, s3:ListMultipartUploadParts, s3:ListBucketMultipartUploads on <bucket>/archives/*.
- Standard CloudWatch Logs.
Resource limits — yours to set. The reference contender uses
provided.al2023, ARM64, 512 MB of memory, 600 s timeout. Bumping memory
or switching architecture is fair game, but the winner is
the cheapest run by Lambda pricing.
There is a strict naming scheme tying together the source directory, the Lambda function name, and (for Rust) the cargo package name. Pick:
- <lang>: short name of your language (rust, python, go, java, …). Lowercase, hyphen-safe.
- <dev_id>: your GitHub username (or any stable identifier you control). Same charset.
| Where | Value | Example |
|---|---|---|
| Source directory | contenders/<lang>/<dev_id>/ | contenders/rust/jrodon/ |
| Cargo package name (Rust only) | <dev_id> (must equal directory name) | jrodon |
| Lambda function name | ${ProjectName}-<lang>-<dev_id> | demo-s3-archiving-rust-jrodon |
| CFN logical ID prefix | <Lang><DevId> (PascalCase, no hyphens) | RustJeremieRodon |
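The mechanical rows of this table can be derived from the two identifiers, as in the sketch below. The regular expression is one possible reading of "lowercase, hyphen-safe" and is an assumption, as is the function name contender_names. The CFN logical ID is left out on purpose, since the reference contender's ID (RustJeremieRodon) is chosen by hand rather than derived from jrodon.

```python
import re

# Assumption: lowercase letters and digits, optionally separated by hyphens.
NAME_RE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def contender_names(project_name, lang, dev_id):
    """Derive the directory, function and package names for a contender."""
    for label, value in (("lang", lang), ("dev_id", dev_id)):
        if not NAME_RE.fullmatch(value):
            raise ValueError(f"{label} {value!r} is not lowercase and hyphen-safe")
    return {
        "source_dir": f"contenders/{lang}/{dev_id}/",
        "function_name": f"{project_name}-{lang}-{dev_id}",
        # Only Rust contenders have a cargo package, named after dev_id.
        "cargo_package": dev_id if lang == "rust" else None,
    }
```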
Adding a contender is three edits:
1. Drop your sources under contenders/<lang>/<dev_id>/. For Rust, the
workspace at the repo root already includes contenders/rust/*, so the
crate is picked up automatically — but its [package].name MUST equal
<dev_id> (the CI uses it to locate the compiled binary).
2. Add two resources in templates/contenders.yml,
inside the BEGIN/END CONTENDERS markers. Copy the RustJeremieRodonFunction
block and adapt logical IDs, FunctionName, CodeUri, Runtime, Handler,
Architectures. Common runtime/handler pairs:
| Language | Runtime | Handler |
|---|---|---|
| Rust / Go / any compiled language | provided.al2023 | ignored (bootstrap is executed) |
| Python 3.13 | python3.13 | index.handler |
| Node.js 22.x | nodejs22.x | index.handler |
| Java 21 | java21 | com.example.MyHandler::handleRequest |
3. Add one line in Outputs.ContenderArns of the same file, after the
INSERT YOUR CONTENDER ARN HERE marker:
- !GetAtt <Lang><DevId>Function.Arn
That's it for the registration. If your language doesn't have a build step in the CI yet, you also need to touch the buildspec; see below.
ci-config/buildspec.yml handles two languages
out of the box:
- Rust: every crate under contenders/rust/* is compiled by cargo lambda build --locked --release --arm64. The compiled bootstrap binary then replaces the source directory before packaging.
- Python: every directory under contenders/python/* is scanned for a requirements.txt; if present, deps are installed in place with pip install -r requirements.txt -t . (the target is the current directory).
For any other language, add a build step in the build phase of the
buildspec. The contract is simple: when the post_build phase runs
aws cloudformation package, the directory at contenders/<lang>/<dev_id>/
must contain exactly what the Lambda runtime expects to find — a bootstrap
binary for compiled languages, a fat JAR for Java, transpiled JS for Node, etc.
A commented-out Go example is provided as a starting point under the
# GO BUILD marker.
Important: the CI replaces the contender source directory with the build output before zipping. Don't rely on extra files (sources, configs) being present in the Lambda's runtime filesystem unless your build step explicitly keeps them.
If your implementation works, whatever its performance, please submit a PR!
- Fork the repo, branch as contender/<lang>-<dev_id>.
- Make the three edits above (plus a buildspec change if needed).
- Open the PR. Useful things to mention in the description:
  - your <lang>-<dev_id>;
  - your approach (compression level, streaming strategy, concurrency model);
  - any non-default resource setting (memory, timeout, architecture).
- Once merged, the CI redeploys and your contender shows up in the next benchmark run.
root-template.yml          # Root CF stack: nests benching, then contenders
ci-template.yml            # CI/CD: CodePipeline, CodeBuild, artifact bucket, IAM
templates/
  benching.yml             # Bucket, fill-bucket + control Lambdas, Step Function
  benching.asl.json        # Step Function definition (JSONata)
  contenders.yml           # ← contributors register their Lambda here
benching/
  fill-bucket/             # CFN custom resource: fills the bucket on stack create
  control-lambda/          # Archive validator, invoked by the Step Function
contenders/
  rust/<dev_id>/           # one Cargo crate per Rust contender (crate name == <dev_id>)
  python/<dev_id>/         # optional Python contenders
  <lang>/<dev_id>/         # add new languages by creating a new sub-directory
ci-config/
  buildspec.yml            # Builds every internal + contender Lambda; packages CF templates
nix/                       # Rust toolchain + dev shell (optional)
The internal Lambdas and all Rust contenders live in a single cargo workspace
(see the root Cargo.toml). One cargo check validates
everything.
Distributed under the GPL-3.0-only License. See LICENSE for the
full text.
Jérémie RODON (@JeremieRodon) — RustyServerless
Project link: https://github.com/RustyServerless/demo-s3-archiving