Merged
295 changes: 295 additions & 0 deletions AGENTS.md
@@ -0,0 +1,295 @@
# Http11Probe — AI Agent Contribution Guide

This file is designed for LLM/AI agent consumption. It contains precise, unambiguous instructions for adding a new test or a new framework to the Http11Probe platform.

## Project overview

Http11Probe is an HTTP/1.1 compliance and security tester. It sends raw TCP requests to servers and validates responses against RFC 9110/9112. The codebase is C# / .NET 10. Documentation is a Hugo + Hextra static site under `docs/`.

---

## TASK A: Add a new test

Adding a test requires changes to **4 locations** (3 when the docs URL mapping in Step 2 is automatic).

### Step 1 — Add the test case to the suite file

Choose the correct suite file based on category:

| Category | File path |
|----------|-----------|
| Compliance | `src/Http11Probe/TestCases/Suites/ComplianceSuite.cs` |
| Smuggling | `src/Http11Probe/TestCases/Suites/SmugglingSuite.cs` |
| Malformed Input | `src/Http11Probe/TestCases/Suites/MalformedInputSuite.cs` |
| Normalization | `src/Http11Probe/TestCases/Suites/NormalizationSuite.cs` |

Append a `yield return new TestCase { ... };` inside the `GetTestCases()` method. Here is the full schema:

```csharp
yield return new TestCase
{
    // REQUIRED fields
    Id = "COMP-EXAMPLE",                     // Unique ID. Prefix conventions below.
    Description = "What this test checks",   // One-line human description.
    Category = TestCategory.Compliance,      // Compliance | Smuggling | MalformedInput | Normalization
    PayloadFactory = ctx => MakeRequest(     // Builds the raw HTTP bytes to send.
        $"GET / HTTP/1.1\r\nHost: {ctx.HostHeader}\r\n\r\n"
    ),
    Expected = new ExpectedBehavior          // How to validate the response. See validation patterns below.
    {
        ExpectedStatus = StatusCodeRange.Exact(400),
        AllowConnectionClose = false,        // OPTIONAL. Set on Expected, not on TestCase.
    },

    // OPTIONAL fields
    RfcReference = "RFC 9112 §5.1",          // Use § not "Section". Omit if no RFC applies.
    Scored = true,                           // Default true. Set false for MAY/informational tests.
    FollowUpPayloadFactory = ctx => ...,     // Second request on same connection (pipeline tests).
    RequiresConnectionReuse = false,         // True if FollowUpPayloadFactory needs the same TCP connection.
    BehavioralAnalyzer = (response) => ...,  // Optional Func<HttpResponse?, string?> for analysis notes.
};
```

**Test ID prefix conventions:**

| Prefix | Suite |
|--------|-------|
| `COMP-` | Compliance |
| `SMUG-` | Smuggling |
| `MAL-` | Malformed Input |
| `NORM-` | Normalization |
| `RFC9112-X.X-` or `RFC9110-X.X-` | Compliance (maps directly to an RFC section) |

**Validation patterns — choose ONE:**

Pattern 1 — Exact status, no alternatives:
```csharp
Expected = new ExpectedBehavior
{
    ExpectedStatus = StatusCodeRange.Exact(400),
}
```
Use for strict MUST-400 requirements (e.g. SP-BEFORE-COLON, MISSING-HOST, DUPLICATE-HOST, OBS-FOLD, CR-ONLY).

Pattern 2 — Status with connection close as alternative:
```csharp
Expected = new ExpectedBehavior
{
    ExpectedStatus = StatusCodeRange.Exact(400),
    AllowConnectionClose = true,
}
```
Use when closing the connection is an acceptable alternative to returning the status code.

Pattern 3 — Custom validator (takes priority over ExpectedStatus):
```csharp
Expected = new ExpectedBehavior
{
    CustomValidator = (response, state) =>
    {
        if (state == ConnectionState.ClosedByServer && response is null) return TestVerdict.Pass;
        if (response is null) return TestVerdict.Fail;
        if (response.StatusCode == 400) return TestVerdict.Pass;
        if (response.StatusCode >= 200 && response.StatusCode < 300) return TestVerdict.Warn;
        return TestVerdict.Fail;
    },
    Description = "400 or close = pass, 2xx = warn",
}
```
Use for pass/warn/fail logic, timeout acceptance, or multi-outcome tests.

**Available StatusCodeRange factories:**
- `StatusCodeRange.Exact(int code)` — single status code
- `StatusCodeRange.Range(int start, int end)` — inclusive range
- `StatusCodeRange.Range2xx` — 200-299
- `StatusCodeRange.Range4xx` — 400-499
- `StatusCodeRange.Range4xxOr5xx` — 400-599

**Available TestVerdict values:** `Pass`, `Fail`, `Warn`, `Skip`, `Error`

**Available ConnectionState values:** `Open`, `ClosedByServer`, `TimedOut`, `Error`

**Helper method available in all suites:**
```csharp
private static byte[] MakeRequest(string request) => Encoding.ASCII.GetBytes(request);
```

**Critical rules:**
- NEVER set `AllowConnectionClose = true` for MUST-400 requirements where the RFC explicitly says "respond with 400".
- Set `Scored = false` only for MAY-level or purely informational tests.
- Always use `ctx.HostHeader` (not a hardcoded host) in payloads.
- Tests are auto-discovered — no registration step needed. The `GetTestCases()` yield return is sufficient.

### Step 2 — Add docs URL mapping (conditional)

**File:** `src/Http11Probe.Cli/Reporting/DocsUrlMap.cs`

This step is **only needed** for `COMP-*` and `RFC*` prefixed tests. The following prefixes are auto-mapped:
- `SMUG-XYZ` → `smuggling/xyz` (lowercased)
- `MAL-XYZ` → `malformed-input/xyz` (lowercased)
- `NORM-XYZ` → `normalization/xyz` (lowercased)

For compliance tests, add an entry to the `ComplianceSlugs` dictionary:
```csharp
["COMP-EXAMPLE"] = "headers/example",
```

If the doc filename doesn't match the auto-mapping convention, add to `SpecialSlugs` instead:
```csharp
["MAL-CHUNK-EXT-64K"] = "malformed-input/chunk-extension-long",
```
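The mapping rules in this step can be summarized in a short, language-neutral sketch. It is written in Python for brevity and is not the code in `DocsUrlMap.cs`. The function name `docs_slug`, the `AUTO_MAPPED` table, and the lookup order (explicit entries before the prefix convention) are assumptions made for illustration.

```python
# Hypothetical sketch of the Step 2 mapping rules; not the code in DocsUrlMap.cs.
AUTO_MAPPED = {
    "SMUG-": "smuggling",
    "MAL-": "malformed-input",
    "NORM-": "normalization",
}


def docs_slug(test_id, compliance_slugs, special_slugs):
    """Return the docs path for a test ID, or None if an explicit entry is missing."""
    # Assumption: explicit entries win over the prefix convention.
    if test_id in special_slugs:
        return special_slugs[test_id]
    if test_id in compliance_slugs:
        return compliance_slugs[test_id]
    for prefix, category in AUTO_MAPPED.items():
        if test_id.startswith(prefix):
            return f"{category}/{test_id[len(prefix):].lower()}"
    return None  # COMP-* and RFC* IDs need a ComplianceSlugs entry


compliance = {"COMP-EXAMPLE": "headers/example"}
special = {"MAL-CHUNK-EXT-64K": "malformed-input/chunk-extension-long"}

print(docs_slug("SMUG-XYZ", compliance, special))           # smuggling/xyz
print(docs_slug("MAL-CHUNK-EXT-64K", compliance, special))  # malformed-input/chunk-extension-long
print(docs_slug("COMP-EXAMPLE", compliance, special))       # headers/example
print(docs_slug("COMP-UNMAPPED", compliance, special))      # None
```

A `None` result corresponds to the case where this step is required: the test ID has no automatic mapping and needs an explicit dictionary entry.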

### Step 3 — Create the documentation page

**File:** `docs/content/docs/{category-slug}/{test-slug}.md`

Category slug mapping:

| Category | Slug |
|----------|------|
| Compliance (line endings) | `line-endings` |
| Compliance (request line) | `request-line` |
| Compliance (headers) | `headers` |
| Compliance (host header) | `host-header` |
| Compliance (content-length) | `content-length` |
| Compliance (body) | `body` |
| Compliance (upgrade) | `upgrade` |
| Smuggling | `smuggling` |
| Malformed Input | `malformed-input` |
| Normalization | `normalization` |

Use this exact template:

```markdown
---
title: "EXAMPLE"
description: "EXAMPLE test documentation"
weight: 1
---

| | |
|---|---|
| **Test ID** | `COMP-EXAMPLE` |
| **Category** | Compliance |
| **RFC** | [RFC 9112 §X.X](https://www.rfc-editor.org/rfc/rfc9112#section-X.X) |
| **Requirement** | MUST |
| **Expected** | `400` or close |

## What it sends

A request with [description of the non-conforming element].

\```http
GET / HTTP/1.1\r\n
Host: localhost:8080\r\n
[malformed element]\r\n
\r\n
\```

## What the RFC says

> "Exact quote from the RFC with the MUST/SHOULD/MAY keyword." -- RFC 9112 Section X.X

Explanation of what the quote means for this test.

## Why it matters

Security and compatibility implications. Why this matters for real-world deployments.

## Sources

- [RFC 9112 §X.X](https://www.rfc-editor.org/rfc/rfc9112#section-X.X)
```

**Requirement field values:** `MUST`, `SHOULD`, `MAY`, `"ought to"`, `Implicit MUST (grammar violation)`, `Unscored`, or a descriptive phrase like `MUST reject or replace with SP`.

### Step 4 — Add a card to the category index

**File:** `docs/content/docs/{category-slug}/_index.md`

Find the `{{< cards >}}` block and add a new card entry. Place scored tests before unscored tests.

```
{{< card link="example" title="EXAMPLE" subtitle="Short description of what the test checks." >}}
```

The `link` value is the filename without `.md`.

### Verification checklist

After making all changes:

1. `dotnet build Http11Probe.slnx -c Release` — must compile without errors.
2. The new test ID appears in the output of `dotnet run --project src/Http11Probe.Cli -- --host localhost --port 8080`.
3. Hugo docs render: `cd docs && hugo server` — the new page is accessible and linked from its category index.

---

## TASK B: Add a new framework

Adding a framework requires creating **3 files** in a new directory. No existing files need modification.

### Step 1 — Create the server directory

Create a new directory: `src/Servers/YourServer/`

### Step 2 — Implement the server

Your server MUST listen on **port 8080** and implement these endpoints:

| Endpoint | Method | Behavior |
|----------|--------|----------|
| `/` | `GET` | Return `200 OK` |
| `/` | `HEAD` | Return `200 OK` with no body |
| `/` | `POST` | Read the full request body and return it in the response body |
| `/` | `OPTIONS` | Return `200 OK` |
| `/echo` | `POST` | Return all received request headers in the response body, one per line as `Name: Value` |

The `/echo` endpoint is critical for normalization tests. It must echo back all headers the server received, preserving the names as the server internally represents them.

Example `/echo` response body:
```
Host: localhost:8080
Content-Length: 11
Content-Type: text/plain
```
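As a rough illustration of the endpoint table above, here is a minimal sketch that uses only the Python standard library. The names `ProbeHandler`, `format_headers`, and `main` are hypothetical, and the sketch makes simplifying assumptions: it answers every path the same way for `GET`, `HEAD`, and `OPTIONS`, it reads bodies by `Content-Length` only (no chunked decoding), and it does not guard against a malformed `Content-Length`. A real entry would use the framework under test.

```python
import sys
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def format_headers(headers):
    """Render (name, value) pairs one per line as `Name: Value`."""
    return "\n".join(f"{name}: {value}" for name, value in headers)


class ProbeHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep connections open for follow-up requests

    def _send(self, body=b"", include_body=True):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._send(b"OK")

    def do_HEAD(self):
        # Same headers as GET, but no body is written.
        self._send(b"OK", include_body=False)

    def do_OPTIONS(self):
        self._send()

    def do_POST(self):
        # Assumption: the body length comes from Content-Length only.
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length)
        if self.path == "/echo":
            # Echo header names as this server represents them internally.
            body = format_headers(self.headers.items()).encode("latin-1")
        self._send(body)


def main(port=8080):
    ThreadingHTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()

# To run as a script, call: main(int(sys.argv[1]) if len(sys.argv) > 1 else 8080)
```

With `main()` running, `curl -X POST -d "hello" http://localhost:8080/` should return `hello`, and a `POST` to `/echo` should return the request headers one per line.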

### Step 3 — Add a Dockerfile

Create `src/Servers/YourServer/Dockerfile` that builds and runs the server.

Key requirements:
- The container runs with `--network host`, so bind to `0.0.0.0:8080`.
- Use `ENTRYPOINT` (not `CMD`) for the server process.
- The Dockerfile build context is the repository root, so paths like `COPY src/Servers/YourServer/...` are correct.

Example:
```dockerfile
FROM python:3.12-slim
WORKDIR /app
RUN pip install --no-cache-dir flask
COPY src/Servers/YourServer/app.py .
ENTRYPOINT ["python3", "app.py", "8080"]
```

### Step 4 — Add probe.json

Create `src/Servers/YourServer/probe.json` with exactly one field:

```json
{"name": "Your Server Display Name"}
```

This name appears in the leaderboard and PR comments.

### Verification checklist

1. Build the Docker image: `docker build -f src/Servers/YourServer/Dockerfile -t yourserver .`
2. Run: `docker run --network host yourserver`
3. Verify endpoints:
   - `curl http://localhost:8080/` returns 200
   - `curl -X POST -d "hello" http://localhost:8080/` returns "hello"
   - `curl -X POST -d "test" http://localhost:8080/echo` returns headers
4. Run the probe: `dotnet run --project src/Http11Probe.Cli -- --host localhost --port 8080`

No changes to CI workflows, configs, or other files are needed. The pipeline auto-discovers servers from `src/Servers/*/probe.json`.
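
Before opening a PR, it can help to check locally what the `src/Servers/*/probe.json` glob would pick up. The following is a small sanity-check sketch, not the pipeline's actual discovery code; `discover_servers` is a hypothetical helper, and the only rule it enforces is the one stated above (a `name` field).

```python
import json
from pathlib import Path


def discover_servers(repo_root="."):
    """Return {directory name: display name} for every src/Servers/*/probe.json."""
    found = {}
    for probe_file in sorted(Path(repo_root).glob("src/Servers/*/probe.json")):
        config = json.loads(probe_file.read_text(encoding="utf-8"))
        name = config.get("name") if isinstance(config, dict) else None
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"{probe_file}: missing or empty 'name' field")
        found[probe_file.parent.name] = name
    return found
```

Calling `discover_servers()` from the repository root should list the new directory alongside the existing servers. A `ValueError` points at a `probe.json` that lacks a usable `name`.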
11 changes: 10 additions & 1 deletion CHANGELOG.md
@@ -2,9 +2,15 @@

All notable changes to Http11Probe are documented in this file.

## [Unreleased]
## [2026-02-14]

### Added
- **RFC Requirement Dashboard** — all 148 tests classified by RFC 2119 level (MUST/SHOULD/MAY/"ought to"/Unscored/N/A) with exact RFC quotes proving each classification (`docs/content/docs/rfc-requirement-dashboard.md`)
- **Add a Test guide** — step-by-step documentation for contributing new tests to the platform (`docs/content/add-a-test.md`)
- **AI Agent contribution guide** — machine-readable `AGENTS.md` at repo root with precise instructions for LLM agents to add tests or frameworks (`docs/content/add-with-ai-agent.md`)
- **Contribute menu** — top nav "Add a Framework" replaced with a "Contribute" dropdown containing Add a Framework, Add a Test, and Add with AI Agent
- **Landing page cards** — RFC Requirement Dashboard card in hero section; Add a Test and Add with AI Agent cards in Contribute section
- **Glossary card** — RFC Requirement Dashboard linked from the glossary index page
- **Server configuration pages** — per-server docs pages showing Dockerfile, source code, and config files for all 36 tested servers (`docs/content/servers/`)
- **Clickable server names** — server names in the probe results table and summary bar chart now link to their configuration page
- **Sticky first column** — server name column stays pinned to the left edge while scrolling horizontally through result tables
@@ -33,6 +39,9 @@ All notable changes to Http11Probe are documented in this file.
- **Stronger sticky shadow on mobile** — increased shadow intensity for the pinned server name column
- **Scrollable tooltips** — hover tooltips are now interactive and scrollable for large payloads (removed `pointer-events:none`, increased `max-height` to `60vh`)
- **Larger click modal** — expanded from `max-width:700px` to `90vw` and `max-height` from `80vh` to `85vh` to better accommodate large request/response data
- **Landing page section rename** — "Add Your Framework" heading renamed to "Contribute to the Project" with updated copy emphasizing community contributions
- **Uniform card sizing** — CSS rule forces all home page card grids to `repeat(2, 1fr)` so every card is the same width
- **Sidebar reordering** — RFC Requirement Dashboard at weight 2 (after Understanding HTTP), RFC Basics bumped to 3, Baseline to 4
- **Kestrel HEAD/OPTIONS support** — added explicit HEAD and OPTIONS endpoint handlers to ASP.NET Minimal server so smuggling tests evaluate correctly instead of returning 405
- **Add a Framework docs** — documented HEAD and OPTIONS as required endpoints
- Raw request capture now includes truncation metadata when payload exceeds 8,192 bytes (`TestRunner.cs`)
6 changes: 6 additions & 0 deletions docs/assets/css/custom.css
@@ -45,6 +45,12 @@ html.dark .hextra-content {
background: transparent !important;
}

/* ── Home page: uniform card grid ────────────────────────────── */

.hextra-home-body .hextra-cards {
  grid-template-columns: repeat(2, 1fr) !important;
}

/* ── Cards ───────────────────────────────────────────────────── */

.hextra-card {
11 changes: 8 additions & 3 deletions docs/content/_index.md
@@ -24,6 +24,7 @@ layout: hextra-home

{{< cards >}}
{{< card link="probe-results" title="Leaderboard" subtitle="See which frameworks pass the most tests, ranked from best to worst compliance." icon="chart-bar" >}}
{{< card link="docs/rfc-requirement-dashboard" title="RFC Requirement Dashboard" subtitle="All 148 tests classified by RFC 2119 level (MUST/SHOULD/MAY)." icon="document-search" >}}
{{< /cards >}}

<div style="height:60px"></div>
@@ -45,15 +46,19 @@ Http11Probe sends a suite of crafted HTTP requests to each server and checks whe

<div style="height:60px"></div>

<h2 style="font-size:2rem;font-weight:800;">Add Your Framework</h2>
<h2 style="font-size:2rem;font-weight:800;">Contribute to the Project</h2>

<div style="height:16px"></div>

Http11Probe is designed so anyone can contribute their HTTP server and get compliance results without touching the test infrastructure. Just add a Dockerfile, a one-line `probe.json`, and open a PR.
Http11Probe is open source and built for contributions. Add your HTTP server to the leaderboard, or write new test cases to expand coverage.

Every new framework added makes the comparison more useful for the entire community, and every new test strengthens the compliance bar for all servers on the platform. If you've found an edge case that isn't covered, or you maintain a framework that isn't listed yet, your contribution directly improves HTTP security and interoperability for everyone.

<div style="height:20px"></div>

{{< cards >}}
{{< card link="add-a-framework" title="Get Started" subtitle="Three steps to add your framework — Dockerfile, probe.json, and open a PR." icon="plus-circle" >}}
{{< card link="add-a-framework" title="Add a Framework" subtitle="Three steps to add your framework — Dockerfile, probe.json, and open a PR." icon="plus-circle" >}}
{{< card link="add-a-test" title="Add a Test" subtitle="How to define a new test case, write its documentation, and wire it into the platform." icon="beaker" >}}
{{< card link="add-with-ai-agent" title="Add with AI Agent" subtitle="Use an AI coding agent to add a test or framework using the machine-readable AGENTS.md guide." icon="chip" >}}
{{< /cards >}}
