fix(skills): parse YAML block scalars in SKILL.md frontmatter#1908
Conversation
There was a problem hiding this comment.
Code Review
This pull request updates the skill frontmatter parser to support YAML block scalars, including folded ('>') and literal ('|') styles along with chomping indicators. It also adds several unit tests to validate these features. Review feedback highlights two issues regarding YAML specification compliance: the loss of relative indentation due to excessive trimming and the incorrect implementation of the 'clip' chomping mode, which should ensure a single trailing newline for non-empty blocks.
| let mut block_lines: Vec<&str> = Vec::new(); | ||
| i += 1; | ||
| while i < lines.len() { | ||
| let raw_line = lines[i]; | ||
| if raw_line.trim().is_empty() { | ||
| // Empty lines are part of the block | ||
| block_lines.push(""); | ||
| i += 1; | ||
| continue; | ||
| } | ||
| let line_indent = raw_line.len() - raw_line.trim_start().len(); | ||
| if line_indent > base_indent { | ||
| block_lines.push(raw_line.trim()); | ||
| i += 1; | ||
| } else { | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
The current implementation of block scalar parsing loses relative indentation within the block because it calls raw_line.trim() on every line. According to the YAML specification, a block scalar has a determined indentation level (usually from the first non-empty line), and only that amount of leading whitespace should be stripped from each line. This ensures that nested structures like lists or indented code blocks within the description are preserved.
| let mut block_lines: Vec<&str> = Vec::new(); | |
| i += 1; | |
| while i < lines.len() { | |
| let raw_line = lines[i]; | |
| if raw_line.trim().is_empty() { | |
| // Empty lines are part of the block | |
| block_lines.push(""); | |
| i += 1; | |
| continue; | |
| } | |
| let line_indent = raw_line.len() - raw_line.trim_start().len(); | |
| if line_indent > base_indent { | |
| block_lines.push(raw_line.trim()); | |
| i += 1; | |
| } else { | |
| break; | |
| } | |
| } | |
| let mut block_lines: Vec<&str> = Vec::new(); | |
| let mut block_indent = None; | |
| i += 1; | |
| while i < lines.len() { | |
| let raw_line = lines[i]; | |
| let trimmed = raw_line.trim_start(); | |
| if trimmed.is_empty() { | |
| block_lines.push(""); | |
| i += 1; | |
| continue; | |
| } | |
| let line_indent = raw_line.len() - trimmed.len(); | |
| if line_indent > base_indent { | |
| let b_indent = *block_indent.get_or_insert(line_indent); | |
| block_lines.push(raw_line.get(b_indent..).unwrap_or(trimmed).trim_end()); | |
| i += 1; | |
| } else { | |
| break; | |
| } | |
| } |
| // clip: keep at most one trailing empty line | ||
| let mut lines = block_lines; | ||
| while lines.len() >= 2 | ||
| && lines[lines.len() - 1].is_empty() | ||
| && lines[lines.len() - 2].is_empty() | ||
| { | ||
| lines.pop(); | ||
| } | ||
| lines | ||
| }; |
There was a problem hiding this comment.
The clip chomping mode (the default for > and |) should ensure the resulting string ends with exactly one newline character if the block is not empty. The current implementation only preserves a trailing newline if the input block already ended with one or more empty lines. If the block ends with a non-empty line, the current code produces a string without a trailing newline, which deviates from the YAML specification.
1fd6f32 to
32fcfc4
Compare
32fcfc4 to
6041a60
Compare
|
Maybe use a yaml parser is better. but for now, is ok. |
|
Stewardship note: linked #1907 into v0.8.44 and left it open until this lands. Checks are green right now, and the review focus should be parser scope/backward compatibility rather than UI text only, since the parsed description feeds /skills, load_skill, and model prompt context. Thanks @zlh124 for the clear repro and tests. |
谢谢你,鲸鱼兄弟! |
- Update @reidliu41 entry to add user-message transcript highlight (#1995) - Update @zlh124 entry to add SKILL.md YAML block-scalar parser (#1908) - Update @h3c-hexin entry to add file_search cancellation report (#2044) - Add @lpeng1711694086-lang for user-message TUI contrast feature request (#1672)
- Update @reidliu41 entry to add user-message transcript highlight (#1995) - Update @zlh124 entry to add SKILL.md YAML block-scalar parser (#1908) - Update @h3c-hexin entry to add file_search cancellation report (#2044) - Add @lpeng1711694086-lang for user-message TUI contrast feature request (#1672)
Summary
Fixes YAML block scalar parsing in SKILL.md frontmatter descriptions. The parser previously only handled single-line quoted/unquoted values — multi-line descriptions using
>or|were silently parsed as the literal indicator string.Changes
crates/tui/src/skills/mod.rs— extendedSkillRegistry::parse_skill:>,>-,>+,|,|-,|+indicatorsstrip(-): removes all trailing blank linesclip(default): keeps at most one trailing newlinekeep(+): preserves all trailing blank lines>): non-empty lines joined with spaces; empty lines become paragraph breaks (\n)|): lines joined with newlinesTests
13 unit tests added covering all six block-scalar variants plus chomping edge cases:
parse_skill_folded_block_scalar>folded block scalarparse_skill_folded_strip_block_scalar>-folded + strip via render pipelineparse_skill_literal_block_scalar|literal block scalarparse_skill_direct_folded_result>assertion onSkillstructparse_skill_strip_chomp_strips_trailing_empties>-strips trailing emptiesparse_skill_keep_chomp_preserves_trailing_empties>+preserves trailing emptiesparse_skill_clip_chomp_clips_excess_trailing_empties>clip with 3 trailing emptiesparse_skill_clip_chomp_no_trailing_empties>clip with 0 trailing emptiesparse_skill_clip_chomp_one_trailing_empty>clip with 1 trailing emptyparse_skill_strip_vs_keep_trailing>-vs>+comparisonparse_skill_literal_strip_strips_trailing_newlines|-literal stripparse_skill_literal_keep_preserves_trailing_newlines|+literal keepparse_skill_single_line_description_still_worksAll pass:
Related issue
Fixes: #1907