Summary
The external_references field in the component_basic scoring category always scores as missing (not present), even when the generated BOM contains a fully populated externalReferences array with multiple entries. This causes every model to lose ~2.9 points out of 100 regardless of how complete their metadata is.
The root cause is two independent bugs in how the field is detected:
- Incorrect jsonpath in field_registry.json: The path is $.component.externalReferences (singular component) but the CycloneDX BOM structure uses $.components[0].externalReferences (plural components with array index).
- Snake_case/camelCase mismatch in fallback checker: The fallback check_field_in_aibom() checks if "external_references" in component but the CycloneDX key is "externalReferences" (camelCase).
Environment
- aibom-generator version: v1.0.2 (commit 67829cb)
- Python: 3.12
- OS: Linux (Ubuntu)
Steps to Reproduce
- Run the AIBOM generator against any HuggingFace model that has external references:
python3 -m src.cli "rockCO78/crosswalk-v7c" --output /tmp/test_aibom.json --verbose
- Observe that the output shows Component Basic: 17.1/20 instead of 20/20.
- Inspect the generated BOM to confirm externalReferences IS present:
import json
bom = json.load(open("/tmp/test_aibom.json"))
comp = bom["components"][0]
# The field EXISTS in the BOM under the correct CycloneDX key:
print("externalReferences" in comp) # True
print(len(comp["externalReferences"])) # 5 entries
# But the scorer looks for the wrong key:
print("external_references" in comp) # False
- Verify the BOM top-level structure uses components (plural), not component:
print("component" in bom) # False
print("components" in bom) # True
Root Cause Analysis
Bug 1: Incorrect jsonpath in field_registry.json
File: src/models/field_registry.json
The external_references field definition uses:
{
  "external_references": {
    "category": "component_basic",
    "jsonpath": "$.component.externalReferences",
    ...
  }
}
The jsonpath $.component.externalReferences navigates to bom["component"]["externalReferences"], but the CycloneDX BOM structure (both 1.6 and 1.7) uses bom["components"][0]["externalReferences"] -- note the plural components with an array index.
For comparison, the other component_basic fields all use the correct plural path:
| Field | jsonpath |
| --- | --- |
| name | $.components[0].name |
| type | $.components[0].type |
| component_version | $.components[0].version |
| purl | $.components[0].purl |
| description | $.components[0].description |
| licenses | $.components[0].licenses |
| external_references | $.component.externalReferences (incorrect) |
The inconsistency is clear: external_references is the only field using singular $.component instead of plural $.components[0].
This causes FieldRegistryManager.detect_field_presence() -> _get_nested_value() to fail at lines 258-260 of src/models/registry.py:
if isinstance(current, dict) and part in current:
    current = current[part]
else:
    return False, None  # <-- hits this because bom["component"] doesn't exist
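To make the failure mode concrete, here is a minimal sketch of a dotted-path walker. It is not the project's _get_nested_value(); the resolve() helper and the sample BOM are hypothetical and only mimic the dict-key check quoted above, with list indexing added so the correct path can be compared against the incorrect one.

```python
import re


def resolve(doc, path):
    """Walk a simple '$.a.b[0].c' style path and return (found, value)."""
    current = doc
    # Tokenize the path into '.name' steps and '[index]' steps.
    for name, index in re.findall(r"\.([A-Za-z_]\w*)|\[(\d+)\]", path):
        if name:
            # Same shape as the check quoted above: a missing key means "not found".
            if isinstance(current, dict) and name in current:
                current = current[name]
            else:
                return False, None
        else:
            position = int(index)
            if isinstance(current, list) and position < len(current):
                current = current[position]
            else:
                return False, None
    return True, current


# Hypothetical BOM fragment with the same top-level shape as the generated file.
bom = {
    "components": [
        {
            "name": "demo-model",
            "externalReferences": [{"type": "website", "url": "https://example.com"}],
        }
    ]
}

wrong = resolve(bom, "$.component.externalReferences")
right = resolve(bom, "$.components[0].externalReferences")
print(wrong)                    # lookup stops at the missing "component" key
print(right[0], len(right[1]))  # the plural path with an index reaches the array
```

The singular path fails on its very first step, so the contents of externalReferences never matter.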
Bug 2: Snake_case field name vs camelCase BOM key in fallback checker
File: src/models/scoring.py, lines 93-98
When the jsonpath-based detection fails (bug 1), the scorer falls back to check_field_in_aibom(). The relevant code:
# Lines 93-98 of scoring.py
components = aibom.get("components", [])
if components:
    component = components[0]
    if field in component:  # field = "external_references"
        return True  # BOM key = "externalReferences" -- no match
The field registry names this field external_references (snake_case), but CycloneDX uses externalReferences (camelCase). The if field in component check performs a literal key lookup, so "external_references" in {"externalReferences": [...]} returns False.
Other fields avoid this problem because their registry names match their BOM keys exactly (e.g., name, type, purl, description, licenses). The component_version field also has a name mismatch (component_version vs version), but it is rescued by the jsonpath-based detection (bug 1 doesn't affect it because its jsonpath $.components[0].version is correct).
Suggested Fix
Fix 1 (field_registry.json): Change the jsonpath from singular to plural:
- "jsonpath": "$.component.externalReferences",
+ "jsonpath": "$.components[0].externalReferences",
This alone should fix the scoring because the enhanced checker (check_field_with_enhanced_results) tries the jsonpath-based detection first (lines 154-158 of scoring.py), and if that succeeds, it never reaches the fallback.
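A consistency check along these lines might catch the same class of typo in future registry edits. This is only a sketch: find_inconsistent_paths() is a hypothetical helper, and the flat {field: {"category", "jsonpath"}} layout is assumed from the snippet quoted in this report, so the real field_registry.json may nest things differently.

```python
def find_inconsistent_paths(registry, category, expected_prefix):
    """Return {field: jsonpath} for fields in `category` whose path lacks the prefix."""
    return {
        name: spec.get("jsonpath")
        for name, spec in registry.items()
        if spec.get("category") == category
        and not str(spec.get("jsonpath", "")).startswith(expected_prefix)
    }


# Trimmed, hypothetical registry mirroring the paths listed in this report.
registry = {
    "name": {"category": "component_basic", "jsonpath": "$.components[0].name"},
    "purl": {"category": "component_basic", "jsonpath": "$.components[0].purl"},
    "licenses": {"category": "component_basic", "jsonpath": "$.components[0].licenses"},
    "external_references": {
        "category": "component_basic",
        "jsonpath": "$.component.externalReferences",
    },
}

bad = find_inconsistent_paths(registry, "component_basic", "$.components[0].")
print(bad)  # only the external_references entry is reported
```

Run as a unit test against the real registry, a non-empty result would flag the entry before it reaches scoring.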
Fix 2 (scoring.py, defense-in-depth): Add a camelCase alias check in the fallback, or normalize field names before lookup:
# Option A: explicit alias map
FIELD_ALIASES = {
    "external_references": "externalReferences",
    "component_version": "version",
}
# In check_field_in_aibom(), line 97:
field_key = FIELD_ALIASES.get(field, field)
if field_key in component:
    return True
Fix 1 is sufficient on its own. Fix 2 provides defense-in-depth against similar issues in future field additions.
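For the "normalize field names before lookup" alternative, a rough sketch could look like the following. Both snake_to_camel() and field_in_component() are hypothetical names, not functions from the aibom-generator codebase.

```python
def snake_to_camel(name):
    """Convert 'external_references' to 'externalReferences'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def field_in_component(field, component):
    """Check the literal registry name first, then its camelCase form."""
    return field in component or snake_to_camel(field) in component


# Hypothetical component dict using CycloneDX-style camelCase keys.
component = {"name": "demo-model", "version": "1.0", "externalReferences": []}

print(field_in_component("external_references", component))  # camelCase form matches
print(field_in_component("component_version", component))    # still no match
```

Normalization handles external_references but not component_version, whose BOM key (version) is a different word rather than a different casing, so an alias map would still be needed for that field.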
Impact
- Every model scored by the AIBOM generator loses ~2.86 points (1/7 * 20) in the component_basic category, even when the BOM correctly contains external references.
- This makes it impossible to achieve 100/100 completeness.
- The issue affects both CycloneDX 1.6 and 1.7 output since both use the components (plural) array structure.
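The numbers above can be reproduced with a few lines of arithmetic, assuming the seven component_basic fields listed in this report share the category's 20 points equally (which is what the 1/7 * 20 figure implies):

```python
category_max = 20  # points available in component_basic
num_fields = 7     # fields listed in the comparison table in this report

per_field = category_max / num_fields
category_score = category_max - per_field  # one field always scored as missing
total_score = 100 - per_field              # every other category at full marks

print(round(per_field, 2))       # 2.86
print(round(category_score, 1))  # 17.1
print(round(total_score, 1))     # 97.1
```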
Additional Context
I discovered this while publishing a model (rockCO78/crosswalk-v7c) and maximizing the AIBOM completeness score. The model card covers all 35 non-GGUF fields in the registry, achieving 97.1/100 -- with the remaining 2.9 points lost entirely to this bug.
Scoring output:
Completeness Score: 97.1/100
Section Breakdown:
- Required Fields: 20/20
- Metadata: 20/20
- Component Basic: 17.1/20 <-- should be 20/20
- Component Model Card: 30/30
- External References: 10/10