MONGOID-5684 Ensure hash-typed fields return Hash#6141
MONGOID-5684 Ensure hash-typed fields return Hash#6141jamis wants to merge 4 commits intomongodb:masterfrom
Conversation
…vior Also fix process_raw_attribute to store the demongoized value back into attributes when demongoize returns a new object (e.g. BSON::Document to plain Hash). Without this, in-place mutations on the returned Hash would not be reflected in dirty tracking because the returned object was decoupled from attributes.
There was a problem hiding this comment.
Pull request overview
This PR introduces an opt-in fix for MONGOID-5684 so that field ..., type: Hash returns a plain Hash (instead of BSON::Document) after reloading from the database, while preserving legacy behavior by default via a new configuration flag.
Changes:
- Add
Mongoid.config.legacy_hash_fields(defaulttrue) to preserve/disable legacyBSON::Documentreturn behavior for Hash-typed fields. - Implement
Hash.demongoizeto optionally convertBSON::Documentto a plainHash(with recursive conversion). - Add specs covering legacy vs opt-in behavior, and add a benchmark script for
Hash.demongoize.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
lib/mongoid/extensions/hash.rb |
Adds Hash.demongoize with feature-flagged conversion from BSON::Document to Hash. |
lib/mongoid/attributes.rb |
Attempts to write back the converted Hash into attributes to support dirty tracking with in-place mutation. |
lib/mongoid/config.rb |
Adds legacy_hash_fields configuration option (default true). |
spec/mongoid/extensions/hash_spec.rb |
Adds unit specs for Hash.demongoize behavior under both config modes. |
spec/mongoid/changeable_spec.rb |
Adds integration specs for reload behavior and dirty tracking semantics under both config modes. |
perf/benchmark_hash_demongoize.rb |
Adds a benchmark script to compare legacy/first-read/subsequent-read performance. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # When demongoize converts a BSON::Document to a plain Hash (i.e. | ||
| # legacy_hash_fields is false), store the plain Hash back into | ||
| # attributes so that in-place mutations on the returned value are | ||
| # visible to dirty tracking. | ||
| attributes[name] = value if raw.is_a?(BSON::Document) && !value.is_a?(BSON::Document) |
There was a problem hiding this comment.
process_raw_attribute writes the demongoized value back into attributes whenever raw is a BSON::Document and the result is not. This is broader than “Hash field BSON::Document → Hash” and can break other field types that legitimately demongoize a BSON document into a non-hash value (e.g. Mongoid::Fields::Localized#demongoize accepts a Hash/BSON::Document of translations but returns a single translation value). In that case this line would overwrite the stored translations hash with a scalar value.
Please restrict the write-back to the intended case (Hash-typed fields only, when legacy_hash_fields is false, and when the demongoized value is a plain Hash).
| # When demongoize converts a BSON::Document to a plain Hash (i.e. | |
| # legacy_hash_fields is false), store the plain Hash back into | |
| # attributes so that in-place mutations on the returned value are | |
| # visible to dirty tracking. | |
| attributes[name] = value if raw.is_a?(BSON::Document) && !value.is_a?(BSON::Document) | |
| # When demongoize converts a BSON::Document to a plain Hash for a | |
| # Hash field (i.e. legacy_hash_fields is false), store the plain Hash | |
| # back into attributes so that in-place mutations on the returned | |
| # value are visible to dirty tracking. | |
| if field&.type == Hash && !Mongoid.legacy_hash_fields && | |
| raw.is_a?(BSON::Document) && value.instance_of?(Hash) | |
| attributes[name] = value | |
| end |
| return object unless object.is_a?(BSON::Document) && | ||
| !Mongoid.config.legacy_hash_fields | ||
|
|
||
| object.each_with_object({}) { |(k, v), h| h[k] = ::Hash.demongoize(v) } | ||
| end | ||
|
|
There was a problem hiding this comment.
Hash.demongoize claims to “recursively” convert nested BSON::Document values, but the current implementation only recurses through hash values and will leave BSON::Document instances that are nested inside arrays unchanged (because arrays are returned as-is). Hash-typed fields can contain arrays of sub-documents, so this can still leak BSON::Document objects into the returned structure when legacy_hash_fields is false.
Consider extending the deep conversion to traverse arrays (and any other container types you want to support), and add a spec that covers a BSON::Document nested inside an array value.
| return object unless object.is_a?(BSON::Document) && | |
| !Mongoid.config.legacy_hash_fields | |
| object.each_with_object({}) { |(k, v), h| h[k] = ::Hash.demongoize(v) } | |
| end | |
| return object if Mongoid.config.legacy_hash_fields | |
| return object unless object.is_a?(BSON::Document) | |
| demongoize_hash_field_value(object) | |
| end | |
| private | |
| def demongoize_hash_field_value(object) | |
| case object | |
| when BSON::Document | |
| object.each_with_object({}) do |(k, v), h| | |
| h[k] = demongoize_hash_field_value(v) | |
| end | |
| when Array | |
| object.map { |value| demongoize_hash_field_value(value) } | |
| else | |
| object | |
| end | |
| end |
| Mongoid.config.legacy_hash_fields = false | ||
| example.run | ||
| ensure | ||
| Mongoid.config.legacy_hash_fields = true |
There was a problem hiding this comment.
This around block forces Mongoid.config.legacy_hash_fields back to true in the ensure clause, which can leak state if the pre-existing value wasn’t true (e.g. if another spec/context changed it). Please capture the original value before overriding it and restore that value in ensure to keep the spec isolated.
| Mongoid.config.legacy_hash_fields = false | |
| example.run | |
| ensure | |
| Mongoid.config.legacy_hash_fields = true | |
| original_legacy_hash_fields = Mongoid.config.legacy_hash_fields | |
| Mongoid.config.legacy_hash_fields = false | |
| example.run | |
| ensure | |
| Mongoid.config.legacy_hash_fields = original_legacy_hash_fields |
| Mongoid.config.legacy_hash_fields = false | ||
| example.run | ||
| ensure | ||
| Mongoid.config.legacy_hash_fields = true |
There was a problem hiding this comment.
This around block restores Mongoid.config.legacy_hash_fields to true unconditionally. To avoid cross-test leakage if another example/context set a different value, please save the previous config value before setting it to false and restore that saved value in the ensure clause.
| Mongoid.config.legacy_hash_fields = false | |
| example.run | |
| ensure | |
| Mongoid.config.legacy_hash_fields = true | |
| previous_legacy_hash_fields = Mongoid.config.legacy_hash_fields | |
| Mongoid.config.legacy_hash_fields = false | |
| example.run | |
| ensure | |
| Mongoid.config.legacy_hash_fields = previous_legacy_hash_fields |
|
This implementation imposes an unacceptable overhead with the recursive copy. Closing this in favor of #6142, which addresses only the issue raised in the ticket. |
Fixes MONGOID-5684.
Problem
When a document with a
field :name, type: Hashis saved and reloaded, the field value is aBSON::Documentrather than a plainHash:This happens because
BSON::Documentis a subclass ofHash, so Mongoid's type check passes — butHash::ClassMethodshad nodemongoizemethod, so the call fell through toObject.demongoize, which returns its argument unchanged.A secondary symptom: after modifying the field,
field_wasreturned a plainHashwhose string keys weren't accessible via symbol — inconsistent with theBSON::Documentreturned by the getter.Fix
lib/mongoid/extensions/hash.rb— AddHash::ClassMethods#demongoize. Whenlegacy_hash_fieldsisfalse, it convertsBSON::Document→Hashrecursively. PlainHashinputs are returned as the same object (no allocation), so subsequent attribute reads are O(1).lib/mongoid/attributes.rb— After demongoizing aBSON::Documentto a plainHash, write the plainHashback intoattributes. This ensures in-place mutations on the returned value are visible to dirty tracking. The write-back is scoped precisely to theBSON::Document→Hashconversion and does not affect other field types (Date, Time, BigDecimal, Set, etc.).lib/mongoid/config.rb— Add thelegacy_hash_fieldsoption (default:true).Feature flag
The fix is opt-in to avoid breaking applications that depend on the current behavior:
The default (
true) preserves the existing behavior unchanged.Performance
Hash.demongoizeis called on every attribute read. The key question is whether the conversion adds unacceptable overhead.Scenario 1 —
legacy_hash_fields: true(current behavior, BSON::Document returned as-is)O(1), no allocation — just a config check and an early return.
Scenario 2 —
legacy_hash_fields: false, first read (BSON::Document input, conversion)This is the one-time cost paid when a document is first loaded from the database. For the nested case, 111
Hashobjects are allocated and 1,110 nodes are visited per call (145 µs on the benchmark machine). This cost is proportional to the size and depth of the stored value.Scenario 3 —
legacy_hash_fields: false, subsequent reads (plain Hash input)After the first read,
attributesholds the converted plainHash. Subsequent reads return it as-is:O(1), size-independent — two type checks and an early return.
Summary: with
legacy_hash_fields: false, each Hash-typed attribute read pays a one-time conversion cost proportional to the hash's size. All subsequent reads of that field are faster than the current legacy path. Applications with large, deeply-nested Hash fields loaded frequently from the database should measure the first-read cost against their own document shapes before enabling the flag.