Fix a bug that Gem::YAMLSerializer.load ignores quotation#9552
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes Gem::YAMLSerializer.load incorrectly treating quoted scalars containing : as mappings (notably within requirements entries in .gemspec YAML), so metadata in .gem files is deserialized correctly.
Changes:
- Add handling in the YAMLSerializer parser intended to recognize single/double-quoted scalars and parse them as scalars instead of mappings.
- Add a regression test ensuring quoted
requirementsentries load as an array of strings.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
lib/rubygems/yaml_serializer.rb |
Adjusts node-type detection to treat quoted lines as scalars to avoid : triggering mapping parsing. |
test/rubygems/test_gem_safe_yaml.rb |
Adds a test case for quoted requirements values deserializing to strings. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if stripped.start_with?("- ") || stripped == "-" | ||
| parse_sequence(indent, anchor) | ||
| elsif stripped.start_with?("\"") and stripped.end_with?("\"") | ||
| parse_plain_scalar(indent, anchor) | ||
| elsif stripped.start_with?("'") and stripped.end_with?("'") | ||
| parse_plain_scalar(indent, anchor) | ||
| elsif stripped =~ MAPPING_KEY_RE && !stripped.start_with?("!ruby/object:") |
| elsif stripped.start_with?("\"") and stripped.end_with?("\"") | ||
| parse_plain_scalar(indent, anchor) | ||
| elsif stripped.start_with?("'") and stripped.end_with?("'") |
| def test_requirement_quote | ||
| yaml = <<~YAML | ||
| requirements: | ||
| - "system: arrow-glib>=25.0.0: amazon_linux: arrow-glib-devel" | ||
| - 'system: arrow-glib>=25.0.0: fedora: libarrow-glib-devel' | ||
| YAML | ||
|
|
||
| assert_equal([ | ||
| "system: arrow-glib>=25.0.0: amazon_linux: arrow-glib-devel", | ||
| "system: arrow-glib>=25.0.0: fedora: libarrow-glib-devel", | ||
| ], | ||
| yaml_load(yaml)["requirements"]) | ||
| end |
|
Thanks, Should we cover This change introduce |
|
Oh, I didn't notice the case. We may need to restructure the current |
I checked the current implementation. We don't need to care about this case because:
rubygems/lib/rubygems/yaml_serializer.rb Lines 150 to 170 in 5c535b0
BTW, users can specify mapping only in |
`"a: b"` must be processed as a string value (`a: b`) not a map
value (`{"a" => "b"}`).
|
I've added a comment for the note. |
| elsif stripped.start_with?("\"") && stripped.end_with?("\"") | ||
| # We don't need to care about the following case here: | ||
| # 1. "value with comment" # ... | ||
| # 2. "key": "value" | ||
| # | ||
| # 1. must not happen because YAMLSerializer doesn't emit any | ||
| # comment. YAMLSerializer parses only YAML that is generated | ||
| # by YAMLSerializer. | ||
| # | ||
| # 2. must not happen because #parse_node isn't used non | ||
| # top-level mapping. Non top-level mapping always uses | ||
| # #parse_mapping. Top-level mapping never use the '"key": | ||
| # "value"' form because all top-level keys | ||
| # ("!ruby/object:Gem::Specification"'s keys) are known and | ||
| # #emit_specification doesn't quote anything. | ||
| parse_plain_scalar(indent, anchor) | ||
| elsif stripped.start_with?("'") && stripped.end_with?("'") | ||
| # See also the above note for double quotation. | ||
| parse_plain_scalar(indent, anchor) | ||
| elsif stripped =~ MAPPING_KEY_RE && !stripped.start_with?("!ruby/object:") | ||
| parse_mapping(indent, anchor) |
| # 2. must not happen because #parse_node isn't used non | ||
| # top-level mapping. Non top-level mapping always uses | ||
| # #parse_mapping. Top-level mapping never use the '"key": | ||
| # "value"' form because all top-level keys | ||
| # ("!ruby/object:Gem::Specification"'s keys) are known and | ||
| # #emit_specification doesn't quote anything. |
Fix #9551
What was the end-user or developer problem that led to this PR?
.gemspecis serialized to YAML (metadata.gz) in.gem.metadata.gzmay use double quote for string value like:Gem::YAMLSerializer.loadignores the quotation and parses it as a map:What is your fix for the problem, implemented in this PR?
Recognize quotation and parse quoted value as a string value.
Make sure the following tasks are checked