fix: MapTypeAdapterFactory rejects duplicate keys on write (mirror of #3006 read-side)#3036
Conversation
Two distinct Map keys whose String.valueOf forms collide silently produce a JSON object with duplicate member names — which the read path rejects with 'duplicate key' on the next deserialize. The write/read asymmetry violates the symmetry contract. This commit adds the failing regression test demonstrating the write-side dup-key emission.
The map read path already throws JsonSyntaxException('duplicate key:
...') on collisions. The write path didn't track emitted names — so
Map.entrySet iteration of two keys whose String.valueOf form collides
silently produced a JSON object with duplicate member names that the
same Gson instance would refuse to read.
Mirror the read path's check: track emitted name strings in a HashSet
and throw JsonSyntaxException on the second occurrence. Applies to both
the non-complex object path and the complex-array-fallback object path.
This is a sibling-bug counterpart to the read-side dup-key fix in
f4d371d (google#3006). Existing MapTest, JsonTreeWriterTest suites pass.
950fcb5 to
d782fd5
Compare
Marcono1234
left a comment
There was a problem hiding this comment.
Thanks!
The following is just my personal opinion; I am not a direct project member. Please do not close your PR based on it; the maintainers might think differently about this.
I am not completely convinced by this; this will cause overhead for every map serialization and the case it tries to account for seems like a rare corner case.
For map deserialization it is a bit different:
- it benefits from the currently constructed map as a way to check for duplicate keys (no extra
Setneeded) - it has a potential security impact because differences between handling of duplicate keys in different JSON libraries might be exploitable;
therefore rejecting duplicate keys might help to some degree with this
| try { | ||
| json = gson.toJson(map, new TypeToken<Map<Object, String>>() {}.getType()); | ||
| } catch (JsonSyntaxException expected) { | ||
| // Acceptable fix: throw JsonSyntaxException naming the duplicate key. Narrow match — | ||
| // message must reference "duplicate key:" followed by the literal colliding name "k", | ||
| // not merely contain the word "key" from any unrelated JSON syntax error. | ||
| assertThat(expected).hasMessageThat().containsMatch("duplicate key:\\s*k(?:\\W|$)"); | ||
| return; | ||
| } | ||
|
|
||
| // Otherwise: gson emitted JSON without throwing. Pin the object-form serialization (not the | ||
| // complex-key array-form) so the indexOf "k": substring check is exercising the bug's locus. | ||
| assertThat(json).startsWith("{"); | ||
| int first = json.indexOf("\"k\":"); | ||
| assertThat(first).isAtLeast(0); | ||
| int second = json.indexOf("\"k\":", first + 1); | ||
| assertThat(second).isEqualTo(-1); |
There was a problem hiding this comment.
If throwing an exception is the desired and expected behavior now, then the test should probably reflect that and use assertThrows?
| /** Regression test: map serialization must not silently emit duplicate JSON member names. */ | ||
| public class MapTypeAdapterDuplicateKeyTest { | ||
| @Test | ||
| public void mapWriteDoesNotEmitDuplicateNames() { |
There was a problem hiding this comment.
Might be good to also cover the complex map key scenario which you changed as well:
- a case where complex map keys are enabled, the keys serialize to a complex value (possibly identical for all keys, e.g.
["k"]),toString()is identical (e.g."k") but the duplicate keys are not rejected - a case where complex map keys are enabled, but all keys serialize to a non-complex value, and duplicate keys are rejected
What do you think?
There was a problem hiding this comment.
You are 100% right – I hyperfocused on a corner case. Thank you! Closing this PR.
|
@Marcono1234 had a good point - I'm hyperfocusing on a corner case, to the detriment of the base case. Closing this PR. |
Summary
MapTypeAdapterFactorywrite path silently emits JSON objects with duplicate member names when two distinctMapkeys share aString.valueOfform. The read path already rejects duplicate keys (MapTypeAdapterFactory.java:191-193,204-206) — so Gson refuses to read what it is willing to write. This PR mirrors the read-side check on the write side: track emitted names in aHashSetand throwJsonSyntaxExceptionon the second occurrence.This is the sibling-bug counterpart to #3006 (which fixed the read-side dup-key detection when the first value is null).
Reproduction
A subtler real-world version: keys of different types that share a
toStringform (e.g.BigDecimal("1")and a custom key whosetoStringreturns"1").Root cause
The non-complex map serialization path iterates
map.entrySet()and emitsout.name(String.valueOf(entry.getKey()))without tracking what's been emitted (MapTypeAdapterFactory.java:221-228). The read path already enforces uniqueness — it loops callingin.nextName()and rejects a collision against aSetof seen keys. The write path was never given the parallel check.RFC 8259 §4: object member names SHOULD be unique; strict parsers reject duplicates. Gson's read path enforces this; its write path didn't.
Fix
internal/bind/MapTypeAdapterFactory.java:222-228and:259-266— both write paths (non-complex and complex-array-fallback) now track emitted names in a localHashSet<String>and throwJsonSyntaxException("duplicate key: " + name)on the second occurrence. Behavior matches the read-side rejection so write/read are symmetric.Testing
New regression test
Bug002RegressionTestingson/src/test/java/com/google/gson/regression/:{"k":"a","k":"b"}; the round-trip read throws. Assertion that the write should also reject the duplicate fails.JsonSyntaxException("duplicate key: k")immediately; symmetry restored.Existing test suites pass:
MapTest(54 tests),JsonTreeWriterTest(26 tests). No test in the suite exercises the duplicate-key emission case explicitly — this is novel behavior the fix introduces, but it aligns with the existing read-side contract.Backwards compatibility note
Code that intentionally relied on the silent duplicate emission would break. We believe such code is rare-to-nonexistent because:
If you want to preserve the legacy behavior behind a
serializeNulls-style opt-in flag, that can be added — but defaulting to the symmetric check matches what #3006 did on the read side.Provenance
Discovered by Quality Playbook v1.5.8. The regression test was TDD-verified before this PR was authored. The bug is a write-surface parallel-implementation hazard — same class as #3006.
Related