Threat model: how should Thread Context (MDC) keys be classified (trusted structural or untrusted content)? #4132

ppkarwasz · 2026-05-30T13:06:30Z

ppkarwasz
May 30, 2026
Collaborator

Background

I am revising the common threat model (_threat-model-common.adoc) to define data sources by who controls them: configuration (operator-controlled, trusted), structural identifiers / control (developer-controlled, trusted), and content (user-controlled, untrusted).

Most inputs fit one category cleanly. Thread Context (the Log4j ThreadContext MDC/NDC, and the equivalent context maps in Log4net and Log4cxx) does not. We have discussed this in private and now want to settle it in a public venue.

The question

Thread Context values are clearly user-controlled content, hence untrusted. Thread Context keys are ambiguous:

They behave like structural identifiers: developer-chosen, usually constants such as a requestId key. That would make them trusted, and key-based injection a case of application misuse.
But a common (and discouraged) practice is to copy untrusted data wholesale into the context, for example dumping all HTTP headers into the MDC. That would make keys effectively user-controlled, hence untrusted.

How we classify keys decides whether key-based injection or validation-DoS reports are in scope, and what the correct remediation is: neutralize untrusted content, validate and reject trusted input, or document the case as application misuse.

Points to decide

Are MDC keys structural identifiers (developer-controlled, trusted) or content (user-controlled, untrusted)?
Should keys and values be classified differently?
If keys are trusted: should "do not populate Thread Context keys from untrusted input, e.g. raw HTTP header names" be a documented developer responsibility, with violations treated as out of scope?
Does the answer hold equally for Log4j, Log4net, and Log4cxx, given this is the common threat model?

Note

Everyone is welcome to contribute. As usual, opinions expressed by Logging Services PMC members are binding for the project's security policy.

In the meantime, the threat-model refactor will document Thread Context values as untrusted content and mark keys as a known open gap pointing to this discussion.

rm5248 · 2026-05-30T15:53:57Z

rm5248
May 30, 2026
Collaborator

The MDC is not something that I have used often, but I seem to recall reading at some point that the MDC could be used for something along the lines of HTTP headers.

What would the difference between trusted and untrusted values be? If they're trusted, does that mean that we can do certain operations on them (e.g. replacement)?

1 reply

FreeAndNil May 31, 2026
Collaborator

If keys are trusted, the framework may reject a malformed key by throwing rather than sanitizing it, and does not need to escape special characters in keys in structured layouts. Key-based injection would be out of scope, with "do not populate keys from untrusted input" becoming a documented developer responsibility.
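
To make the two remediations concrete, here is a minimal sketch of both policies side by side. Everything in it is an assumption for illustration: `KeyPolicyDemo`, `requireWellFormed`, `neutralize`, and the key grammar are hypothetical and do not exist in Log4j, Log4net, or Log4cxx.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration; none of these names exist in any logging framework. */
final class KeyPolicyDemo {

    // Assumed grammar for a well-formed key: letters, digits, '.', '_' and '-'.
    private static final Pattern WELL_FORMED = Pattern.compile("[A-Za-z0-9._-]+");

    /** Trusted-key policy: a malformed key is a programming error, so fail fast. */
    static String requireWellFormed(String key) {
        if (!WELL_FORMED.matcher(key).matches()) {
            throw new IllegalArgumentException("Malformed Thread Context key");
        }
        return key;
    }

    /** Untrusted-key policy: never throw; replace every disallowed character. */
    static String neutralize(String key) {
        return key.replaceAll("[^A-Za-z0-9._-]", "_");
    }

    public static void main(String[] args) {
        String hostile = "requestId\": \"x";
        System.out.println(neutralize(hostile));
        try {
            requireWellFormed(hostile);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under the trusted classification only `requireWellFormed` would be needed. If an attacker can reach it, the throw itself becomes the validation-DoS case mentioned in the opening post, which is why the classification matters.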

FreeAndNil · 2026-05-31T21:04:12Z

FreeAndNil
May 31, 2026
Collaborator

My position is that keys should be classified as untrusted content, on the same level as values.
The intended use - developer-chosen constants - is not the actual use. Dumping HTTP headers into the MDC is common, and developers doing so are often unaware of the trust implications. A threat model should reflect how the API is actually used.
The cost argument also favors this: appenders already sanitize values in structured layouts. Extending that to keys is a small and natural addition. Documenting "do not populate keys from untrusted input" as a developer responsibility would be cleaner in theory, but shifts the burden onto developers who are demonstrably already unaware of the risk.

2 replies

vy Jun 1, 2026
Collaborator

@FreeAndNil has captured it nicely. I agree with all his remarks.

ppkarwasz Jun 2, 2026
Collaborator Author

> The cost argument also favors this: appenders already sanitize values in structured layouts. Extending that to keys is a small and natural addition.

Note that doing so might require us to disclose some past bug fixes and hardenings of MDC key formatting as vulnerabilities.

I am all for this proposal, but we'll need to double check the formatting of MDC keys and give a lower bound for the validity of the threat model.

Historical context

In apache/logging-site#10 I did propose MDC keys as untrusted input as a justification of CVE-2021-45046. However, as it turns out, that CVE resulted from:

the evaluation of MDC values,
the execution of remote classes, which is something Log4j is not supposed to do, even for trusted input.

rm5248 · 2026-06-01T13:02:13Z

rm5248
Jun 1, 2026
Collaborator

For safe vs. unsafe strings, I'll take an idea from Joel Spolsky and propose something like the following:

MDC map;
map[key_from_user] = value_from_user; // unsafe
map[SafeString("foobar")] = SafeString("baz"); // both safe strings
map[SafeString("bar")] = value_from_user; // safe key, unsafe value

The idea is that all strings in the MDC are unsafe by default, but you could wrap them in a SafeString to opt in to whatever treatment safe strings receive.

Note: I'm not suggesting implementing this at the moment, as looking at Jan's comments I would tend to agree that both the keys and values should be unsafe by default. If there is a demand for safe strings in the MDC, something like the above could provide a good implementation.
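
For Log4j readers, the same idea could be sketched in Java roughly as follows. This is only an illustration of the wrapper concept: `ContextString`, `render`, and the choice of characters to neutralize are assumptions, and no such type exists in Log4j today.

```java
/** Hypothetical sketch; Log4j has no SafeString or ContextString type. */
final class SafeStringDemo {

    /** A context string that remembers whether the developer vouched for it. */
    static final class ContextString {
        final String text;
        final boolean safe;

        private ContextString(String text, boolean safe) {
            this.text = text;
            this.safe = safe;
        }

        /** Default for anything that may originate from a user. */
        static ContextString unsafe(String text) {
            return new ContextString(text, false);
        }

        /** Explicit opt-in for developer-chosen constants. */
        static ContextString safe(String text) {
            return new ContextString(text, true);
        }
    }

    /** Renders one entry as key=value; only strings not marked safe are neutralized. */
    static String render(ContextString key, ContextString value) {
        return show(key) + "=" + show(value);
    }

    private static String show(ContextString s) {
        // Assumed neutralization: replace '=' and line breaks, which would break this format.
        return s.safe ? s.text : s.text.replaceAll("[=\\r\\n]", "_");
    }

    public static void main(String[] args) {
        // Safe key, unsafe value: only the value is rewritten.
        System.out.println(render(ContextString.safe("bar"), ContextString.unsafe("a=b\nc")));
    }
}
```

The point of the design is that the unsafe path is the default and the safe path requires a visible, greppable call at the place where the string is created.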

0 replies

ramanathan1504 · 2026-06-01T17:13:12Z

ramanathan1504
Jun 1, 2026
Collaborator

To help illustrate the practical impact of treating Thread Context keys as trusted vs. untrusted, here is a concrete example of how key-based injection can occur in structured layouts when keys are populated dynamically (e.g., from HTTP headers):

Scenario: Logging Dynamic Headers

Suppose an application maps dynamic HTTP header names directly into the ThreadContext:

// Unsafe practice, but common in middleware
ThreadContext.put(headerName, headerValue);
logger.info("Processed request");

The Attack Payload

An attacker sends a request with a malicious HTTP header where the name contains JSON control characters:

Header Name (Key): transactionId" : "123", "role" : "admin", "dummy
Header Value: user-value

1. If Keys are Classified as Trusted (Unescaped)

If the layout assumes keys are safe developer constants and writes them raw, the quotes embedded in the key end the JSON string early, and the resulting log line contains attacker-chosen fields:

{
  "time": "2026-06-01T12:00:00Z",
  "level": "INFO",
  "message": "Processed request",
  "transactionId" : "123", "role" : "admin", "dummy": "user-value"
}

A log parser (e.g., Elasticsearch, Splunk) will parse "role": "admin" as a separate, valid field. This allows the attacker to inject arbitrary key-value pairs into the downstream log aggregator.

2. If Keys are Classified as Untrusted (Escaped)

If the layout treats keys as untrusted and escapes them, the attack is neutralized:

{
  "time": "2026-06-01T12:00:00Z",
  "level": "INFO",
  "message": "Processed request",
  "transactionId\" : \"123\", \"role\" : \"admin\", \"dummy": "user-value"
}

The payload remains safely trapped inside a single JSON key. The key looks odd, but the document stays well-formed and no new structural fields are injected.

This example seems to align closely with the points raised by @FreeAndNil and @vy. While copying raw HTTP headers into MDC keys is discouraged, it is a realistic scenario. Treating keys as untrusted content and sanitizing them by default provides strong defense-in-depth and protects developers from these security oversights.
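
The escaping step in the second case can be sketched in a few lines of plain Java. This is a minimal, self-contained illustration and not the code of any Log4j layout: `JsonKeyEscapeDemo`, `quote`, and `member` are hypothetical names, and the escaping covers only the quote, the backslash, and control characters.

```java
/** Hypothetical sketch of escaping a context key the same way as a value. */
final class JsonKeyEscapeDemo {

    /** Minimal JSON string escaping: quote, backslash, and control characters. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                // Control characters become a six-character hexadecimal escape.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.append('"').toString();
    }

    /** Writes one context entry as a JSON member, escaping the key as well as the value. */
    static String member(String key, String value) {
        return quote(key) + ": " + quote(value);
    }

    public static void main(String[] args) {
        String headerName = "transactionId\" : \"123\", \"role\" : \"admin\", \"dummy";
        // Prints: "transactionId\" : \"123\", \"role\" : \"admin\", \"dummy": "user-value"
        System.out.println(member(headerName, "user-value"));
    }
}
```

Calling `quote` on the key is the only difference from the first case, which supports the earlier observation that extending value escaping to keys is a small change.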

3 replies

ppkarwasz Jun 2, 2026
Collaborator Author

Whether we escape keys or not is orthogonal to whether they are trusted or not.

It only influences the classification of a formatting bug: it is either a normal bug or a vulnerability.

ramanathan1504 Jun 2, 2026
Collaborator

That is a very precise distinction. If we classify keys as untrusted, an escaping bug is a CVE; if trusted, it's just a normal formatting bug.

Classifying them as untrusted seems safer, as security teams and SIEM parsers will treat key-based JSON corruption as a vulnerability in the wild regardless.

ppkarwasz Jun 2, 2026
Collaborator Author

You would be surprised at how many SIEM systems don't use a structured layout: https://docs.cloud.google.com/logging/docs/agent/ops-agent/third-party

Most of those applications use a derivative of PatternLayout, and the Ops Agent tries to parse their output instead of recommending that users switch to a structured layout.

rgoers · 2026-06-02T13:43:14Z

rgoers
Jun 2, 2026
Collaborator

In today's world my view may be naive, but my approach would be to say that, while Log4j supports the use of the MDC/ThreadContext, all responsibility for its content belongs to the user of Log4j. While we may include context entries in logs, or use them to manipulate how logging is performed, Log4j simply cannot validate the content to the extent necessary. Since the user determines which keys should be present and what the appropriate values are, it is up to them to perform proper validation. If we can make that easier by providing integrations with validation frameworks, where the user can define keys and validation rules for them, we should do that.
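
As a rough picture of what such user-defined rules could look like, here is a minimal sketch in which the application declares its keys together with a rule for each value. It is hypothetical throughout: `ValidatedContext`, `allow`, and `put` are invented names, a plain `Map` stands in for the real ThreadContext, and no particular validation framework is implied.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch; a plain Map stands in for the ThreadContext. */
final class ValidatedContext {

    private final Map<String, Predicate<String>> rules = new HashMap<>();
    private final Map<String, String> context = new HashMap<>();

    /** The application declares each key it intends to use, with a rule for its values. */
    ValidatedContext allow(String key, Predicate<String> valueRule) {
        rules.put(key, valueRule);
        return this;
    }

    /** Stores the entry only if the key was declared and the value passes its rule. */
    boolean put(String key, String value) {
        Predicate<String> rule = rules.get(key);
        if (rule == null || value == null || !rule.test(value)) {
            return false; // undeclared key or invalid value: drop it instead of logging it
        }
        context.put(key, value);
        return true;
    }

    /** Returns a copy of the accepted entries. */
    Map<String, String> snapshot() {
        return new HashMap<>(context);
    }

    public static void main(String[] args) {
        ValidatedContext ctx = new ValidatedContext()
                .allow("requestId", v -> v.matches("[0-9a-f-]{1,36}"));
        System.out.println(ctx.put("requestId", "3f2a")); // declared key, valid value
        System.out.println(ctx.put("role", "admin"));     // undeclared key
        System.out.println(ctx.snapshot());
    }
}
```

With this shape the set of keys stays developer-controlled by construction, whatever the source of the incoming data.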

0 replies
