Skip to content

fix: MapTypeAdapterFactory rejects duplicate keys on write (mirror of #3006 read-side)#3036

Closed
andrewstellman wants to merge 2 commits into
google:mainfrom
andrewstellman:qpb/bug-002-map-duplicate-keys-write
Closed

fix: MapTypeAdapterFactory rejects duplicate keys on write (mirror of #3006 read-side)#3036
andrewstellman wants to merge 2 commits into
google:mainfrom
andrewstellman:qpb/bug-002-map-duplicate-keys-write

Conversation

@andrewstellman

Copy link
Copy Markdown
Contributor

Summary

MapTypeAdapterFactory write path silently emits JSON objects with duplicate member names when two distinct Map keys share a String.valueOf form. The read path already rejects duplicate keys (MapTypeAdapterFactory.java:191-193,204-206) — so Gson refuses to read what it is willing to write. This PR mirrors the read-side check on the write side: track emitted names in a HashSet and throw JsonSyntaxException on the second occurrence.

This is the sibling-bug counterpart to #3006 (which fixed the read-side dup-key detection when the first value is null).

Reproduction

Map<Object, String> map = new LinkedHashMap<>();
map.put(new Object() { public String toString() { return "k"; } }, "a");
map.put(new Object() { public String toString() { return "k"; } }, "b");

Gson gson = new Gson();
String json = gson.toJson(map);
// json == {"k":"a","k":"b"}   <- two distinct entries, one JSON name; invalid per RFC 8259

// Same Gson refuses to read what it just wrote:
gson.fromJson(json, new TypeToken<Map<String, String>>(){}.getType());
// throws JsonSyntaxException("duplicate key: k")

A subtler real-world version: keys of different types that share a toString form (e.g. BigDecimal("1") and a custom key whose toString returns "1").

Root cause

The non-complex map serialization path iterates map.entrySet() and emits out.name(String.valueOf(entry.getKey())) without tracking what's been emitted (MapTypeAdapterFactory.java:221-228). The read path already enforces uniqueness — it loops calling in.nextName() and rejects a collision against a Set of seen keys. The write path was never given the parallel check.

RFC 8259 §4: object member names SHOULD be unique; strict parsers reject duplicates. Gson's read path enforces this; its write path didn't.

Fix

internal/bind/MapTypeAdapterFactory.java:222-228 and :259-266 — both write paths (non-complex and complex-array-fallback) now track emitted names in a local HashSet<String> and throw JsonSyntaxException("duplicate key: " + name) on the second occurrence. Behavior matches the read-side rejection so write/read are symmetric.

Set<String> emittedNames = new HashSet<>();
for (Map.Entry<K, V> entry : map.entrySet()) {
  String name = String.valueOf(entry.getKey());
  if (!emittedNames.add(name)) {
    throw new JsonSyntaxException("duplicate key: " + name);
  }
  out.name(name);
  valueTypeAdapter.write(out, entry.getValue());
}

Testing

New regression test Bug002RegressionTest in gson/src/test/java/com/google/gson/regression/:

  • Red (before fix): two-keys-same-stringform map serializes silently to {"k":"a","k":"b"}; the round-trip read throws. Assertion that the write should also reject the duplicate fails.
  • Green (after fix): write throws JsonSyntaxException("duplicate key: k") immediately; symmetry restored.

Existing test suites pass: MapTest (54 tests), JsonTreeWriterTest (26 tests). No test in the suite exercises the duplicate-key emission case explicitly — this is novel behavior the fix introduces, but it aligns with the existing read-side contract.

Backwards compatibility note

Code that intentionally relied on the silent duplicate emission would break. We believe such code is rare-to-nonexistent because:

  1. The output is not valid JSON per RFC 8259.
  2. The same Gson instance can't read it back.
  3. The behavior is the asymmetric half of an obvious symmetry contract.

If you want to preserve the legacy behavior behind a serializeNulls-style opt-in flag, that can be added — but defaulting to the symmetric check matches what #3006 did on the read side.

Provenance

Discovered by Quality Playbook v1.5.8. The regression test was TDD-verified before this PR was authored. The bug is a write-surface parallel-implementation hazard — same class as #3006.

Related

Two distinct Map keys whose String.valueOf forms collide silently produce
a JSON object with duplicate member names — which the read path
rejects with 'duplicate key' on the next deserialize. The write/read
asymmetry violates the symmetry contract.

This commit adds the failing regression test demonstrating the
write-side dup-key emission.
The map read path already throws JsonSyntaxException('duplicate key:
...') on collisions. The write path didn't track emitted names — so
Map.entrySet iteration of two keys whose String.valueOf form collides
silently produced a JSON object with duplicate member names that the
same Gson instance would refuse to read.

Mirror the read path's check: track emitted name strings in a HashSet
and throw JsonSyntaxException on the second occurrence. Applies to both
the non-complex object path and the complex-array-fallback object path.

This is a sibling-bug counterpart to the read-side dup-key fix in
f4d371d (google#3006). Existing MapTest, JsonTreeWriterTest suites pass.
@andrewstellman andrewstellman force-pushed the qpb/bug-002-map-duplicate-keys-write branch from 950fcb5 to d782fd5 Compare June 6, 2026 05:53

@Marcono1234 Marcono1234 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

The following is just my personal opinion; I am not a direct project member. Please do not close your PR based on it; the maintainers might think differently about this.

I am not completely convinced by this; this will cause overhead for every map serialization and the case it tries to account for seems like a rare corner case.

For map deserialization it is a bit different:

  • it benefits from the currently constructed map as a way to check for duplicate keys (no extra Set needed)
  • it has a potential security impact because differences between handling of duplicate keys in different JSON libraries might be exploitable;
    therefore rejecting duplicate keys might help to some degree with this

Comment on lines +40 to +56
try {
json = gson.toJson(map, new TypeToken<Map<Object, String>>() {}.getType());
} catch (JsonSyntaxException expected) {
// Acceptable fix: throw JsonSyntaxException naming the duplicate key. Narrow match —
// message must reference "duplicate key:" followed by the literal colliding name "k",
// not merely contain the word "key" from any unrelated JSON syntax error.
assertThat(expected).hasMessageThat().containsMatch("duplicate key:\\s*k(?:\\W|$)");
return;
}

// Otherwise: gson emitted JSON without throwing. Pin the object-form serialization (not the
// complex-key array-form) so the indexOf "k": substring check is exercising the bug's locus.
assertThat(json).startsWith("{");
int first = json.indexOf("\"k\":");
assertThat(first).isAtLeast(0);
int second = json.indexOf("\"k\":", first + 1);
assertThat(second).isEqualTo(-1);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If throwing an exception is the desired and expected behavior now, then the test should probably reflect that and use assertThrows?

/** Regression test: map serialization must not silently emit duplicate JSON member names. */
public class MapTypeAdapterDuplicateKeyTest {
@Test
public void mapWriteDoesNotEmitDuplicateNames() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to also cover the complex map key scenario which you changed as well:

  • a case where complex map keys are enabled, the keys serialize to a complex value (possibly identical for all keys, e.g. ["k"]), toString() is identical (e.g. "k") but the duplicate keys are not rejected
  • a case where complex map keys are enabled, but all keys serialize to a non-complex value, and duplicate keys are rejected

What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are 100% right – I hyperfocused on a corner case. Thank you! Closing this PR.

@andrewstellman

Copy link
Copy Markdown
Contributor Author

@Marcono1234 had a good point - I'm hyperfocusing on a corner case, to the detriment of the base case. Closing this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants