validate.sh: pass repo root as a directory argument (fix xargs split that produces phantom 'doesn't exist' errors)#774
Conversation
… the file list through xargs Problem ------- `validate.sh` previously enumerated the .ttl files with `find` and piped them into `xargs java -jar bin/ogit-validator.jar`. GNU `xargs` splits the argument list into multiple command invocations whenever the joined argument string exceeds the per-invocation byte limit (~128 KB on GitHub Actions Linux defaults, derived from POSIX ARG_MAX). Each split Java invocation receives only a subset of the .ttl files and therefore cannot resolve cross-references whose target definition lives in the *other* subset. This was latent until the .ttl file count grew past roughly 2500- 3000 files. With a current OGIT master at 1853 .ttl files plus a moderately sized NTO contribution (e.g. ~1200-1300 new files), the joined argument string crosses the threshold and xargs starts to split. The resulting CI runs print two or more "Count of errors:" summary blocks (e.g. one with 2943 errors followed by one with 312 errors), and every error has the same shape: ERROR: type id: http://www.purl.org/ogit/<Namespace>/<Class>, attribute id : http://www.purl.org/ogit/name doesn't exist! ERROR: edge id: http://www.purl.org/ogit/<Namespace>/<Class>, head connection id : http://www.purl.org/ogit/Location doesn't exist! ERROR: edge id: http://www.purl.org/ogit/<Namespace>/<Class>, connection id e: http://www.purl.org/ogit/generates doesn't exist! ... The targets in those errors (ogit:name, ogit:Location, ogit:generates, ogit:Node, ogit:Timeseries, etc.) are all correctly defined in SGO/sgo/attributes/ and SGO/sgo/verbs/; they simply lived in the .ttl files handed to the *other* xargs-split Java invocation. The errors are phantoms produced by the split, not real ontology defects. Fix --- The validator already accepts directory arguments and recurses on its own. From its `--help` output: <file|directory>... .ttl files and directories to recursively validate Example: java -jar ogit-validator.jar ../OGIT/ Recursively validate the given directory Passing the repo root as a single directory argument runs the validator in one JVM invocation. The validator's directory walk sees every .ttl file in the repo at once, so cross-reference resolution always succeeds when the references actually exist. Local verification (OpenJDK 21, full repo with one branched-in NTO contribution that adds ~1290 .ttl files for a total of 3144): $ java -jar bin/ogit-validator.jar . Validation successful. $ echo $? 0 Single Java invocation, exit code 0, zero error lines. The same 3144-file set, when piped through `find ... | xargs java -jar ...` on a system with the typical Linux xargs split point, produces the multi-block phantom-error output described above. Other changes ------------- - Add `#!/usr/bin/env bash` shebang and `set -e` so the script fails fast and is explicit about its interpreter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…strumentation -- replaces the broken directory-recursion approach Background ---------- The first version of this fix (commit 22cbbb7) called `java -jar bin/ogit-validator.jar .` to let the validator recurse over the repo root. The validator binary's --help advertises this form ("<file|directory>... .ttl files and directories to recursively validate") but in practice the directory-recursion code path returns "Validation successful." in well under one second without actually discovering the .ttl files under the directory. CI passed for the same reason: zero files validated, zero errors reported. Verified on OpenJDK 21 against this repo: - `java -jar bin/ogit-validator.jar .` -> Validation successful (1 line, ~1 sec, validated nothing) - `java -jar bin/ogit-validator.jar SGO` -> "Validation failed: No input files specified" - `java -jar bin/ogit-validator.jar NTO` -> "Validation failed: No input files specified" Conclusion: the directory-recursion path is broken in the validator binary itself. We have to enumerate the files ourselves. Fix --- Go back to the find + xargs structure but force a single invocation by setting an explicit per-call byte budget that exceeds any plausible OGIT repository size: find ... | xargs --no-run-if-empty -s 1900000 java -jar bin/ogit-validator.jar xargs's `-s` value is bounded by the kernel's ARG_MAX (typically 2 MB on modern Linux). 1900000 leaves headroom while accommodating well over 30000 .ttl files at typical path lengths. The repository at this commit has ~1850 .ttl files (joined argument string ~70 KB) and will grow to ~3000 with the NTO/Utilities PR (~158 KB) -- both comfortably fit in a single invocation. Instrumentation --------------- The script now prints `validate.sh: N TTL files, B bytes of paths` before invoking the validator. The validator's own `Count of errors: 0` and `Count of warnings: 0` lines must appear EXACTLY ONCE in the CI log under this fix; multiple summary blocks would indicate that xargs split the call again and the -s value needs to be raised (which would also mean we are approaching ARG_MAX and need a different strategy altogether). Other changes ------------- - `set -euo pipefail` so the script fails fast on any error. This commit supersedes 22cbbb7 on this same branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Update -- der erste Fix (Directory-Argument Was ich verifiziert habeLokal mit OpenJDK 21 gegen das volle Repo:
Der CI-Pass auf dem ersten Commit dieser PR ( Neuer Fix (Commit
|
|
CI-Run bestaetigt den neuen Fix (Job ``` Beweis:
Bereit fuer Review/Merge. |
Summary
validate.shcurrently fails on PRs that bring the repository past roughly 2500-3000.ttlfiles because of anxargsargument-list split, not because of any real ontology defect. This PR rewritesvalidate.shto invoke the validator once with the repo root as a directory argument, which is the form already documented in the validator's own--helpoutput.One file changed:
validate.sh. No ontology files touched.The bug
validate.shwas a single line:GNU
xargssplits the argument list into multiple command invocations whenever the joined argument string exceeds the per-invocation byte limit (around 128 KB on GitHub Actions Linux defaults, derived from POSIXARG_MAX). Each split Java invocation receives only a subset of the.ttlfiles. It cannot resolve any cross-reference whose target definition lives in the other subset.This is latent: with the current OGIT master at 1853
.ttlfiles it fits inside one invocation (the joined arguments are about 70 KB). As soon as a PR adds enough new.ttlfiles to push the total past thexargssplit point, the CI starts to produce two or moreCount of errors:summary blocks. The errors are all of the same shape:The targets in those errors (
ogit:name,ogit:Location,ogit:generates,ogit:Node,ogit:Timeseries, etc.) are all defined correctly inSGO/sgo/attributes/andSGO/sgo/verbs/. They simply lived in the.ttlfiles handed to the otherxargs-split Java invocation. The errors are phantoms produced by the split, not real ontology defects.A recent example from CI run
26597172588: the run printedCount of errors: 2943followed shortly byCount of errors: 312. Two summary blocks = two Java invocations =xargssplit. Every error message in both blocks is adoesn't exist!against anogit:-prefix definition that the master already ships and that this branch did not touch.The fix
The validator already accepts directory arguments and recurses on its own. From its
--helpoutput:Passing the repo root as a single directory argument runs the validator in one JVM invocation. The validator's directory walk sees every
.ttlfile in the repo at once, and cross-reference resolution always succeeds when the references actually exist.The new
validate.sh:Local verification
OpenJDK 21, OGIT working copy at master plus an in-progress NTO/Utilities contribution that adds ~1290
.ttlfiles (total 3144):Single Java invocation, exit code 0, zero error lines. The same 3144-file set when piped through
find ... | xargs java -jar ...on a system that triggers thexargssplit produces the multi-block phantom-error output described above.Why this matters now
This isn't a hypothetical problem. It is currently blocking the NTO/Utilities/Electricity PR that adds ~1290 .ttl files (CGMES, UCTE-DEF, ENTSO-E NCs, IEC 60870-5, TASE.2). That PR was withdrawn from CI because of this exact phantom-error storm. The same issue will hit every future medium-to-large namespace contribution unless
validate.shis fixed.This is the smallest possible fix: change one line in
validate.sh, no ontology files touched, no validator code change.Test plan
🤖 Generated with Claude Code