fix: enforce domain format validation (lowercase + hyphens only) #180
Merged
mingcha-dev merged 1 commit into MLT-OSS:main on Apr 25, 2026
- Add regex pattern ^[a-z0-9]+(-[a-z0-9]+)*$ to domains items in schema
- Fix 99 existing files with spaces in domains (e.g. 'land market' → 'land-market')
- Fix special characters in domains (& → and, remove parentheses)
- make validate now rejects domains with spaces/special chars at schema level
mingcha-dev (Collaborator) approved these changes on Apr 25, 2026
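明察 QA Review: PR #180 APPROVED ✅
Schema hardening
The domains field gains a regex, ^[a-z0-9]+(-[a-z0-9]+)*$, which blocks format problems outright at the CI level 👍
Batch fix
- Spot checks confirm that the space-to-hyphen fix to domains in 99 files is correct:
- structural biology → structural-biology ✅
- drug discovery → drug-discovery ✅
- pharmaceutical sciences → pharmaceutical-sciences ✅
Value
This PR fixes the domain format problem at its root: CI will now block violations automatically, so reviewers no longer have to catch them by hand.
Merge as is ✅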
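The batch fix checked in this review (spaces to hyphens, & to and, parentheses removed) can be sketched as a small normalizer. This is a minimal sketch, not the script used in the PR: `normalize_domain` is a hypothetical name, lowercasing is an assumption, and so is the choice to drop only the parenthesis characters while keeping the text inside them.

```python
import re


def normalize_domain(raw: str) -> str:
    """Hypothetical helper; the PR does not show how the 99 files were fixed."""
    text = raw.strip().lower()          # assumption: uppercase input is lowercased
    text = text.replace("&", " and ")   # '&' becomes the word 'and'
    # Any run of characters outside a-z / 0-9 (spaces, parentheses,
    # underscores, ...) collapses into a single hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")              # no leading or trailing hyphen


print(normalize_domain("structural biology"))   # structural-biology
print(normalize_domain("science & research"))   # science-and-research
```

Collapsing every run of disallowed characters into one hyphen means the output always satisfies ^[a-z0-9]+(-[a-z0-9]+)*$ whenever the input contains at least one letter or digit.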
mingcha-dev (Collaborator) approved these changes on Apr 25, 2026
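🔍 明察 QA Review: PR #180 APPROVED ✅
Schema regex validation: ^[a-z0-9]+(-[a-z0-9]+)*$ correctly blocks spaces, uppercase letters, and underscores ✅
Regex tests:
- ✅ Pass: economy, real-estate, land-use, e-commerce
- ❌ Block: land market, Light Industry, crime_justice, AI
Domains corrected in 99 files: spaces replaced with hyphens, format unified ✅
This is the root fix for Issue #102 (unifying the domains format): a hard block at the schema level that no longer relies on prompts or review. 👍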
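The pass/block list in this review can be reproduced with a few lines of Python. This is a sketch for illustration only; `is_valid_domain` is a hypothetical name, and the project's own check runs through make validate, not through this code.

```python
import re

# Pattern copied from the PR; everything else here is illustrative.
DOMAIN_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_domain(value: str) -> bool:
    # fullmatch is used because Python's `$` would otherwise also
    # accept a value that ends with a newline.
    return DOMAIN_RE.fullmatch(value) is not None


should_pass = ["economy", "real-estate", "land-use", "e-commerce"]
should_block = ["land market", "Light Industry", "crime_justice", "AI"]

for value in should_pass + should_block:
    print(f"{value!r}: {'pass' if is_valid_domain(value) else 'block'}")
```

The pattern also rejects leading, trailing, and doubled hyphens, since every hyphen must sit between two runs of letters or digits.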
Domain Format Enforcement
Problem
domains field values in source JSON files contained spaces and special characters (e.g. land market, science & research, defi (decentralized finance)). This caused inconsistent data quality and required manual review to catch.
Solution
- Added regex pattern ^[a-z0-9]+(-[a-z0-9]+)*$ to domains items in datasource-schema.json
- Replaced & with and, removed parentheses
- make validate now hard-blocks domains with spaces/special characters
Verification
- make validate ✅ All 540 files pass
- make check-ids ✅ No duplicate IDs
Impact
make validate
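To illustrate how a pattern on domains items rejects a record at the schema level, here is a minimal sketch. The shape of `SCHEMA_FRAGMENT` is an assumption (only the pattern string comes from this PR, and the real datasource-schema.json is not shown here), and `invalid_domains` is a hypothetical helper, not what make validate actually runs.

```python
import re

# Assumed layout of the relevant part of datasource-schema.json.
SCHEMA_FRAGMENT = {
    "properties": {
        "domains": {
            "type": "array",
            "items": {
                "type": "string",
                "pattern": "^[a-z0-9]+(-[a-z0-9]+)*$",
            },
        }
    }
}


def invalid_domains(record: dict, schema: dict = SCHEMA_FRAGMENT) -> list:
    """Return the domains entries in `record` that violate the item pattern."""
    pattern = schema["properties"]["domains"]["items"]["pattern"]
    # The pattern is anchored, so fullmatch performs the same whole-string
    # check while avoiding Python's `$` matching before a trailing newline.
    return [
        d for d in record.get("domains", [])
        if not isinstance(d, str) or re.fullmatch(pattern, d) is None
    ]


record = {"domains": ["land market", "economy", "science & research"]}
print(invalid_domains(record))  # ['land market', 'science & research']
```

A validator built on this idea would fail the file whenever the returned list is non-empty, which matches the hard-block behaviour described under Solution.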