
feat(assets-sync): add assets.toml content-type overrides#64

Merged: lwshang merged 2 commits into main from lwshang/assets_toml on Jun 1, 2026

@lwshang
Collaborator

@lwshang lwshang commented May 28, 2026

Superseded by #66. The dedicated assets.toml config was scrapped in favour of folding per-glob Content-Type overrides into _headers — same feature, single familiar file, no new format to learn. See #66 for the design rationale and the replacement implementation.

Summary (what this PR would have done)

Added an opt-in assets.toml config file (passed inline via the manifest's files: field) that let users override an asset's Content-Type by glob pattern. v1 scoped to this one field; broader [[asset]] knobs (ignore, encodings, allow_raw_access) were sketched in ASSETS-TOML.md for follow-ups.

Replaced by #66, which expresses the same overrides inside _headers (e.g. a /*.did pattern line followed by an indented Content-Type: text/plain; charset=utf-8 line). The plugin extracts Content-Type from _headers and routes it into CreateAssetArguments.content_type so the canister still emits exactly one certified Content-Type per response.

🤖 Generated with Claude Code

@lwshang lwshang force-pushed the lwshang/headers_globs branch from d5262f2 to 112997c Compare June 1, 2026 19:39
lwshang and others added 2 commits June 1, 2026 15:43
New plugin-side config file (passed via `SyncExecInput.files`) lets
users override asset content-type by glob pattern:

    [[asset]]
    match = "/*.md"
    content_type = "text/markdown; charset=utf-8"

Resolution walks `[[asset]]` blocks in declaration order — first
matching `content_type` wins; falls back to `mime_guess::from_path`
otherwise. The override is applied in `prepare_asset` before
`encoders_for` runs, so a `.did` declared as `text/plain` also picks
up gzip compression and stores the right type for cert-tree drift
detection. Result feeds `CreateAssetArguments.content_type` so the
canister emits exactly one `Content-Type` in the certified response —
closes the duplicate-header pitfall of the legacy `.ic-assets.json5`
`headers.Content-Type` workaround.

Also lifts `HeaderPattern` from `headers.rs` into a shared
`KeyPattern` in `glob.rs` since both `_headers` rules and
`assets.toml` blocks match against asset keys with the same dialect.
Pure refactor; `_headers` semantics unchanged.

Strict v1 schema: `#[serde(deny_unknown_fields)]` on both the top
level and per-block — typo protection beats forward compat for an
unreleased plugin. The standard filename `assets.toml` is convention,
not enforced — the manifest's `files:` field is the authoritative
declaration of which file is the asset config.

Design rationale and parked open questions in `ASSETS-TOML.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two scenarios on a fixture that mirrors the developer-docs case:

1. `content_type_overrides_land_on_canister` — deploy with overrides
   for `/*.did`, `/*.sh`, `/llms.txt` and verify the HTTP gateway
   returns the configured `Content-Type` exactly (proves the override
   survived certification, since the gateway validates the
   `IC-Certificate` before responding).

2. `content_type_edit_propagates_on_redeploy` — flip an override and
   redeploy without changing the asset bytes; verify the new
   `Content-Type` reaches clients (proves drift detection triggers
   delete-then-recreate when content-type changes).
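
The second scenario depends on the sync step treating a changed content type as drift even when the bytes are identical. A minimal sketch of that decision, assuming a hypothetical `plan` function and `SyncAction` enum (neither name appears in this PR; only the delete-then-recreate arm is confirmed by the description above, the other arms are illustrative assumptions):

```rust
/// What the sync step decides to do for one asset key.
/// ASSUMPTION: hypothetical enum for illustration; not the plugin's real type.
#[derive(Debug, PartialEq, Eq)]
pub enum SyncAction {
    Create,
    DeleteThenRecreate,
    UpdateContent,
    Skip,
}

/// The properties compared between the canister copy and the local copy.
pub struct AssetState<'a> {
    pub content_type: &'a str,
    pub sha256: &'a str,
}

/// Decide the action for one asset.
/// Only the content-type arm is described in the PR; the rest is assumed.
pub fn plan(on_canister: Option<&AssetState<'_>>, local: &AssetState<'_>) -> SyncAction {
    match on_canister {
        // Nothing deployed under this key yet.
        None => SyncAction::Create,
        // Content type changed: recreate even if the bytes are unchanged.
        Some(remote) if remote.content_type != local.content_type => {
            SyncAction::DeleteThenRecreate
        }
        // Same type, different bytes (assumed behaviour).
        Some(remote) if remote.sha256 != local.sha256 => SyncAction::UpdateContent,
        // Nothing drifted.
        Some(_) => SyncAction::Skip,
    }
}
```

Checking the content type before the hash is what lets an override edit propagate on a redeploy that ships identical bytes.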

Also documents the feature in the top-level README under "Per-glob
content-type overrides" and refreshes the `_headers` paragraph to
mention the now-supported glob syntax (`/*.md`, mid-path wildcards).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lwshang lwshang marked this pull request as ready for review June 1, 2026 19:43
@lwshang lwshang requested a review from a team as a code owner June 1, 2026 19:43
@lwshang lwshang changed the base branch from lwshang/headers_globs to main June 1, 2026 19:43
@lwshang lwshang force-pushed the lwshang/assets_toml branch from 50b2714 to 2aee1e4 Compare June 1, 2026 19:43
@lwshang lwshang enabled auto-merge (squash) June 1, 2026 19:43
@lwshang lwshang merged commit 6452b5f into main Jun 1, 2026
11 checks passed
lwshang added a commit that referenced this pull request Jun 1, 2026
…oml (#66)

## Summary

Removes the `assets.toml` config introduced in #64 and folds the only
feature it carried — per-glob `Content-Type` overrides — back into
`_headers`. The plugin now recognises `Content-Type:` inside any
`_headers` block, parses the value as a MIME, and routes it to
`CreateAssetArguments.content_type` rather than the appended response
headers — so the canister still emits exactly one certified
`Content-Type` per response.
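
The routing described above can be sketched with the standard library only. `SplitBlock` and `split_block` are hypothetical names for illustration; the real parser lives in `headers.rs`, validates the value as a `Mime`, and is not reproduced here. This sketch keeps the value as a plain string and only checks that it is non-empty and not duplicated:

```rust
/// Result of splitting one `_headers` block.
/// ASSUMPTION: hypothetical shape; the plugin's `HeaderRule` is not shown in this PR.
#[derive(Debug, PartialEq)]
pub struct SplitBlock {
    /// Routed to the asset's content type instead of the response headers.
    pub content_type: Option<String>,
    /// Every other header, kept in declaration order.
    pub headers: Vec<(String, String)>,
}

/// Split `Name: value` lines, pulling `Content-Type` out (case-insensitive).
/// MIME validation is omitted in this sketch.
pub fn split_block(lines: &[&str]) -> Result<SplitBlock, String> {
    let mut content_type = None;
    let mut headers = Vec::new();
    for line in lines {
        let (name, value) = line
            .split_once(':')
            .ok_or_else(|| format!("missing ':' in header line: {}", line))?;
        let (name, value) = (name.trim(), value.trim());
        if name.eq_ignore_ascii_case("content-type") {
            if value.is_empty() {
                return Err("empty Content-Type value".to_string());
            }
            if content_type.is_some() {
                return Err("duplicate Content-Type in one block".to_string());
            }
            content_type = Some(value.to_string());
        } else {
            headers.push((name.to_string(), value.to_string()));
        }
    }
    Ok(SplitBlock { content_type, headers })
}
```

Because `Content-Type` never lands in `headers`, a canister that appends headers without deduplication has only one source for it.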

Net diff: −634 / +287 across 21 files. Drops a config format, a parser
module, an e2e fixture, a top-level design doc, and the `toml` workspace
dep.

## Why Content-Type belongs in `_headers`, not its own file

#64 took the position that `Content-Type` was *asset metadata*, not a
response header, and therefore deserved its own file. That framing was
driven by the canister's wire shape (it appends `_headers` entries without
dedup, so a `Content-Type:` rule would produce two `Content-Type` headers on
the wire), which is an *implementation* concern, not a user concern.

- **Users' mental model wins.** Static-host tools such as Netlify and
Cloudflare Pages let users set `Content-Type` in `_headers`. To a
user, it's just a header; splitting it across two files to satisfy our
pipeline is the kind of design that makes them ask "why is this weird?"
- **The wire problem is plumbing, not architecture.** Solving it in the
plugin (extract `Content-Type` from `_headers`, route to `content_type`,
exclude from the appended list) keeps the canister's
append-without-dedup invariant intact *and* gives the user the familiar
one-file experience.
- **The other speculative `assets.toml` fields didn't survive review.**
`ignore` belongs in the build step (don't put the file in `dist/` if it
shouldn't deploy). `encodings` has no driving use case — the current
`text/* + js/html → identity+gzip, everything else → identity` policy
covers real projects. `allow_raw_access` is being dropped on the
canister side (selective certification via the HTTP gateway spec covers
what it was for). With no remaining fields, the config file has no
reason to exist.
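
The encoding policy quoted in the `encodings` bullet can be read as a small function of the resolved content type. The sketch below is an interpretation, not the plugin's code: the `Encoding` enum is hypothetical, the signature given to `encoders_for` is a guess, and "js" is assumed to mean `application/javascript` (HTML is already covered by `text/*`):

```rust
/// Encodings the plugin may upload for one asset.
/// ASSUMPTION: hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Identity,
    Gzip,
}

/// Pick encodings from the resolved content type.
/// ASSUMPTION: the real `encoders_for` signature is not shown in this PR.
pub fn encoders_for(content_type: &str) -> Vec<Encoding> {
    // Drop parameters such as `; charset=utf-8` before comparing.
    let essence = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    // ASSUMPTION: "js" in the quoted policy means `application/javascript`.
    let compressible = essence.starts_with("text/") || essence == "application/javascript";
    if compressible {
        vec![Encoding::Identity, Encoding::Gzip]
    } else {
        vec![Encoding::Identity]
    }
}
```

Under this reading, a `.did` file overridden to `text/plain` takes the first branch and is uploaded with a gzip encoding as well.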

The pivot rests on a principle: *adding a config format later when a
real need shows up is cheaper than carrying an unused format now.*

## Other design points

- **Parser shape.** `HeaderRule` gains `content_type: Option<Mime>`.
Parsing a `Content-Type:` line is case-insensitive, validates as `Mime`,
rejects empty values, and rejects duplicates within the same block.
Other headers in the same block continue to flow through `headers` as
usual ([`assets-sync/src/headers.rs`](assets-sync/src/headers.rs)).
- **Resolver.** New `content_type_for(key, rules)` walks rules in
declaration order and returns the first matching `content_type` —
first-match-wins because `Content-Type` is single-valued (accumulation
semantics make no sense for it). Other matching rules still contribute
their non-`Content-Type` headers
([`assets-sync/src/headers.rs`](assets-sync/src/headers.rs)).
- **Pipeline.** `prepare_asset` now takes `&[HeaderRule]` (instead of
the deleted `&AssetConfig`) and applies `content_type_for` before
`encoders_for` — so a `.did` declared as `text/plain` still picks up
gzip and still triggers drift detection on re-deploy. The
`header_content_type_override_applies_to_prepare_asset` unit test pins
this end-to-end ([`assets-sync/src/sync.rs`](assets-sync/src/sync.rs)).
- **`sync()` signature.** Lost its `files: &[(String, String)]`
parameter — no consumer remains. The plugin entry no longer threads
`input.files` through; if a future feature needs inline files, it just
reads them.
- **Deletions.**
[`assets-sync/src/asset_config.rs`](assets-sync/src/asset_config.rs),
`ASSETS-TOML.md`, the `toml` workspace dep, the `assets-toml` e2e
fixture, and the assets-toml-only fields from the developer-docs
cargo-cult example (the e2e fixture now demonstrates `.did` only, the
one extension `mime_guess` genuinely can't classify).
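
The resolver semantics in the list above (first match wins for `Content-Type`, accumulation for everything else) can be sketched as follows. The `HeaderRule` fields, the `headers_for` helper, and the `glob_match` dialect are assumptions for illustration; in particular, this sketch treats `*` as matching any run of characters, including `/`, which may differ from the plugin's `KeyPattern`:

```rust
/// ASSUMPTION: simplified stand-in for the plugin's `HeaderRule`.
pub struct HeaderRule {
    pub pattern: String,
    pub content_type: Option<String>,
    pub headers: Vec<(String, String)>,
}

/// Minimal wildcard match: `*` matches any run of characters.
/// ASSUMPTION: the real glob dialect is not specified in this PR.
fn glob_match(pattern: &str, key: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let k: Vec<char> = key.chars().collect();
    let (mut pi, mut ki) = (0, 0);
    // Where to resume if a literal match fails after a `*`.
    let mut backtrack: Option<(usize, usize)> = None;
    while ki < k.len() {
        if pi < p.len() && p[pi] == '*' {
            backtrack = Some((pi + 1, ki));
            pi += 1;
        } else if pi < p.len() && p[pi] == k[ki] {
            pi += 1;
            ki += 1;
        } else if let Some((bp, bk)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            pi = bp;
            ki = bk + 1;
            backtrack = Some((bp, bk + 1));
        } else {
            return false;
        }
    }
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// First matching rule that declares a content type wins;
/// matching rules without one are skipped.
pub fn content_type_for<'a>(key: &str, rules: &'a [HeaderRule]) -> Option<&'a str> {
    rules
        .iter()
        .filter(|r| glob_match(&r.pattern, key))
        .find_map(|r| r.content_type.as_deref())
}

/// Every matching rule contributes its other headers, in declaration order.
pub fn headers_for<'a>(key: &str, rules: &'a [HeaderRule]) -> Vec<&'a (String, String)> {
    rules
        .iter()
        .filter(|r| glob_match(&r.pattern, key))
        .flat_map(|r| r.headers.iter())
        .collect()
}
```

Keeping the two lookups separate makes the asymmetry explicit: a later rule cannot change an asset's content type once an earlier rule has set it, but it can still add headers.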

## Test plan

New coverage in [`headers.rs`](assets-sync/src/headers.rs) tests:
Content-Type routes to dedicated field not headers, case-insensitive,
coexists with other headers, block-with-only-Content-Type is valid,
rejects duplicate / invalid / empty values; `content_type_for` returns
None / first-match-wins / skips rules without a `content_type`
declaration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lwshang lwshang deleted the lwshang/assets_toml branch June 1, 2026 20:38

2 participants