fix: add UTF-8 encoding validation before Feishu document import #86

Open
Luoqiu1 wants to merge 1 commit into riba2534:main from Luoqiu1:main

Luoqiu1 (Contributor) commented Apr 3, 2026

Summary

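  • Adds a defensive UTF-8 encoding check to the pre-import flow of the feishu-cli-import and feishu-cli-write skills
  • Detects two kinds of encoding anomalies: U+FFFD replacement characters (corrupted bytes that something upstream already replaced) and invalid UTF-8 byte sequences (raw truncated bytes)
  • Prevents encoding-corrupted Markdown from being written to Feishu documents as-is, where it would show up as garbled text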

Background

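In real-world use, some Chinese characters in large Markdown files ended up truncated in the middle of a UTF-8 multi-byte sequence, and the corrupted characters were imported straight into Feishu documents, where they rendered as garbled text. Because feishu-cli doc import does not validate encoding itself, a pre-check at the skill layer is needed as a defensive fallback.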
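The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch, not code from the PR: the helper name `is_valid_utf8` is hypothetical, and the character "表" is chosen because its first two UTF-8 bytes are the same `\xe8\xa1` used in Test 2 below.

```python
# "表" (U+8868) encodes to three bytes in UTF-8: e8 a1 a8.
encoded = "表".encode("utf-8")

# Cutting the stream after two bytes leaves an incomplete sequence,
# which is what a mid-character truncation looks like on disk.
truncated = encoded[:2]


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lenient upstream decoder hides the damage by substituting U+FFFD.
lossy = truncated.decode("utf-8", errors="replace")
reencoded = lossy.encode("utf-8")

print(is_valid_utf8(truncated))    # raw truncated bytes fail strict decoding
print(is_valid_utf8(reencoded))    # the "repaired" bytes decode cleanly...
print("\ufffd" in lossy)           # ...but still carry the replacement character
```

The second case is why a strict decode alone is not enough: once something upstream has replaced the bad bytes, the file is valid UTF-8 again and only a search for U+FFFD reveals the damage.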

Changes

  • skills/feishu-cli-import/SKILL.md: adds encoding validation to the "validate file" step
  • skills/feishu-cli-write/SKILL.md: adds encoding validation after the "generate Markdown" step

Validation command:

python3 -c "d=open('<file.md>','rb').read(); assert b'\xef\xbf\xbd' not in d, 'U+FFFD found'; d.decode('utf-8')"
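For readers who find the one-liner dense, the same two checks could be spelled out as a small function. This is a sketch under assumptions, not part of the PR: the name `find_encoding_problem` and the message wording are hypothetical, and reporting a byte offset is an addition the one-liner does not make.

```python
from typing import Optional

# UTF-8 encoding of U+FFFD, the same bytes the one-liner searches for.
REPLACEMENT = "\ufffd".encode("utf-8")  # b'\xef\xbf\xbd'


def find_encoding_problem(data: bytes) -> Optional[str]:
    """Return a description of the first problem found, or None if clean.

    Hypothetical helper mirroring the one-liner: check for U+FFFD first,
    then attempt a strict UTF-8 decode.
    """
    offset = data.find(REPLACEMENT)
    if offset != -1:
        return f"U+FFFD found at byte offset {offset}"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"invalid UTF-8 at byte offset {exc.start}: {exc.reason}"
    return None


# In-memory demo; a real caller would pass open(path, "rb").read().
for sample in ("电子表格模块\n".encode("utf-8"), b"\xe8\xa1", b"ab\xef\xbf\xbd"):
    print(find_encoding_problem(sample))
```

Reading the file in binary mode matters here: opening it in text mode would either raise before the U+FFFD check runs or, with a lenient error handler, mask the invalid bytes.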

Test results

All three tests pass:

Test 1: file containing U+FFFD replacement characters

$ printf '# 标题\n电子\xef\xbf\xbd\xef\xbf\xbd格模块\n' > test_ufffd.md
$ python3 -c "d=open('test_ufffd.md','rb').read(); assert b'\xef\xbf\xbd' not in d, 'U+FFFD found'; d.decode('utf-8')"
AssertionError: U+FFFD found  # exit 1 ✅ detected

Test 2: file containing truncated UTF-8 bytes

$ printf '\xe8\xa1' > test_truncated.md
$ python3 -c "d=open('test_truncated.md','rb').read(); assert b'\xef\xbf\xbd' not in d, 'U+FFFD found'; d.decode('utf-8')"
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 0-1: unexpected end of data  # exit 1 ✅ detected

Test 3: valid UTF-8 file

$ printf '# 标题\n电子表格模块\n中文内容正常\n' > test_normal.md
$ python3 -c "d=open('test_normal.md','rb').read(); assert b'\xef\xbf\xbd' not in d, 'U+FFFD found'; d.decode('utf-8')"
# exit 0 ✅ passes

🤖 Generated with Claude Code

Add a defensive encoding check to feishu-cli-import and feishu-cli-write
skills. The check detects both U+FFFD replacement characters and invalid
UTF-8 byte sequences in Markdown files before importing to Feishu,
preventing corrupted characters from flowing into documents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>