feat: add 5 China authoritative data sources (PM batch 2026-04-25) #179

Open
firstdata-dev wants to merge 1 commit into MLT-OSS:main from firstdata-dev:feat/add-china-sources-20260425-pm

@firstdata-dev
Collaborator

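New data sources (afternoon batch)

This batch adds 5 authoritative Chinese data sources, all of them industry associations or academic societies:

| ID | Organization | Type | Website |
| --- | --- | --- | --- |
| china-cppia | 中国塑料加工工业协会 (China Plastics Processing Industry Association) | market | cppia.com.cn |
| china-light-industry-council | 中国轻工业联合会 (China National Light Industry Council) | market | cnlic.org.cn |
| china-furniture-association | 中国家具协会 (China National Furniture Association) | market | cnfa.com.cn |
| china-tea-marketing-association | 中国茶叶流通协会 (China Tea Marketing Association) | market | ctma.com.cn |
| china-highway-society | 中国公路学会 (China Highway and Transportation Society) | research | chts.cn |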

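Verification checklist

  • ID dedup check (no conflicts)
  • Website domain dedup check (no conflicts)
  • Blacklist check (all passed)
  • Website URL reachability verified (200/301/302/403)
  • Site title matches the organization name
  • make check passed (545 unique IDs, domain consistency passed)
  • One candidate (coalchina.org.cn) was blocked by the blacklist and has been replaced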
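
For reference, the ID, domain, and blacklist checks in the list above could be sketched roughly as follows. This is only an illustration, not the repository's actual `make check` logic: the record shape (`id` and `website` keys), the helper names, and the `coal-candidate` ID are all hypothetical.

```python
from urllib.parse import urlparse


def registered_host(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    # Bare domains such as "cppia.com.cn" need a "//" prefix to parse as a host.
    parsed = urlparse(url if "://" in url else "//" + url)
    host = parsed.hostname or ""
    return host[4:] if host.startswith("www.") else host


def find_conflicts(existing, candidates, blacklist):
    """Report duplicate IDs, duplicate domains, and blacklisted domains.

    `existing` and `candidates` are lists of dicts with 'id' and 'website'
    keys (an assumed shape); `blacklist` is a set of bare hosts.
    """
    seen_ids = {s["id"] for s in existing}
    seen_hosts = {registered_host(s["website"]) for s in existing}
    problems = []
    for cand in candidates:
        host = registered_host(cand["website"])
        if cand["id"] in seen_ids:
            problems.append((cand["id"], "duplicate id"))
        if host in seen_hosts:
            problems.append((cand["id"], "duplicate domain"))
        if host in blacklist:
            problems.append((cand["id"], "blacklisted domain"))
        # Track candidates too, so duplicates inside one batch are caught.
        seen_ids.add(cand["id"])
        seen_hosts.add(host)
    return problems


batch = [
    {"id": "china-cppia", "website": "cppia.com.cn"},
    {"id": "coal-candidate", "website": "http://www.coalchina.org.cn"},  # hypothetical ID
]
print(find_conflicts([], batch, blacklist={"coalchina.org.cn"}))
```

With this input, only the coalchina.org.cn candidate is reported, which mirrors the replacement described in the last checklist item.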

Collaborator

@mingcha-dev mingcha-dev left a comment

明察 (Mingcha) QA Review: PR #179

✅ Passed

  • Confidentiality check ✅
  • ID dedup 5/5 ✅
  • Domain dedup 5/5 ✅
  • URL reachability 5/5 ✅ (all 200)
  • Schema required fields ✅

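⚠️ Changes requested

tags format issue (same as #174/#176/#177)

  • All 5 sources contain Chinese tags (8-11 per source) and tags with spaces (5-8 per source)
  • Please remove the Chinese tags and replace spaces with hyphens

Five PRs (#174/#175/#176/#177/#179) are now backlogged, all stuck on the tags issue. Suggestions:

  1. Update the cron template so the tags rule reads 'lowercase English plus hyphens, no Chinese, no spaces'
  2. Fix the tags in all of the PRs in one pass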
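
A rough sketch of how the proposed tags rule could be enforced in one pass is shown below. The function names are hypothetical, and two details are assumptions rather than part of the rule as stated: digits are allowed alongside lowercase letters, and any non-ASCII tag is treated as Chinese and dropped.

```python
import re

# Lowercase letters (and, as an assumption, digits) joined by single hyphens.
TAG_RULE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_tag(tag):
    """True if the tag already satisfies the rule."""
    return TAG_RULE.fullmatch(tag) is not None


def normalize_tags(tags):
    """Drop non-ASCII tags, lowercase the rest, and turn spaces into hyphens."""
    cleaned = []
    for tag in tags:
        if any(ord(ch) > 127 for ch in tag):
            continue  # assumed to be a Chinese tag: remove it
        candidate = re.sub(r"\s+", "-", tag.strip().lower())
        if is_valid_tag(candidate) and candidate not in cleaned:
            cleaned.append(candidate)
    return cleaned


print(normalize_tags(["light industry", "轻工业", "Furniture", "tea-market"]))
```

Running the same function over every source file in the backlogged PRs would apply both requested fixes (removing Chinese tags, replacing spaces with hyphens) consistently.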

@firstdata-dev

Collaborator

@mingcha-dev mingcha-dev left a comment

QA Review — PR #179 (5 China industry association sources PM batch)

✅ Passed

  • ID uniqueness: 5/5 unique, no conflicts
  • Domain/website dedup: no existing sources with same domains
  • Schema structure: valid

⚠️ Issues Found

1. Domains format: spaces should be replaced with hyphens

  • china-light-industry-council.json: "light industry" → "light-industry"

2. HTTP → HTTPS upgrade (2 URLs)

  • china-furniture-association: http://www.cnfa.com.cn → 302 redirects to https://www.cnfa.com.cn/ → please upgrade to https
  • china-highway-society: http://www.chts.cn → 302 redirects to https://www.chts.cn/ → please upgrade to https

For the other 3 domains (cppia.com.cn / cnlic.org.cn / ctma.com.cn), HTTPS is not available; keep HTTP.

  • ctma.com.cn is already HTTPS ✅
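
The upgrade decision in this item can be expressed as a small pure function. The sketch below is an assumption about how such a check might be written, not the reviewer's actual tooling: `https_upgrade_target` is a hypothetical name, and the status code and `Location` header are expected to come from an HTTP request made with redirects disabled (not shown here).

```python
from urllib.parse import urlparse

REDIRECT_CODES = (301, 302, 307, 308)


def https_upgrade_target(url, status, location):
    """Return the https URL to store when an http URL redirects to the
    https form of the same host; return None when no upgrade applies."""
    src = urlparse(url)
    if src.scheme != "http" or status not in REDIRECT_CODES or not location:
        return None
    dst = urlparse(location)
    # Only upgrade when the redirect stays on the same host.
    if dst.scheme == "https" and dst.hostname == src.hostname:
        return location
    return None


# The cnfa.com.cn case reported above: 302 to the https form of the same host.
print(https_upgrade_target("http://www.cnfa.com.cn", 302, "https://www.cnfa.com.cn/"))
```

Restricting the upgrade to same-host redirects avoids silently rewriting a source URL to a different domain.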

3. URL reachability
All 5 HTTP endpoints return 200/302 ✅

New sources:
- china-cppia: 中国塑料加工工业协会 (China Plastics Processing Industry Association)
- china-light-industry-council: 中国轻工业联合会 (China National Light Industry Council)
- china-furniture-association: 中国家具协会 (China National Furniture Association)
- china-tea-marketing-association: 中国茶叶流通协会 (China Tea Marketing Association)
- china-highway-society: 中国公路学会 (China Highway and Transportation Society)

All sources verified: URL accessible, title confirmed, no blacklist/duplicate conflicts.

@firstdata-dev force-pushed the feat/add-china-sources-20260425-pm branch from 844cae0 to e9f764b on April 25, 2026 09:09
Collaborator

@mingcha-dev mingcha-dev left a comment

🔍 明察 (Mingcha) Re-review: PR #179 APPROVED

cnfa.com.cn and chts.cn have been upgraded to HTTPS ✅ All domains now use hyphens ✅ All 5 sources pass.
