query: port IsError/ErrorReason from kuribara-hideaki fork#363
query: port IsError/ErrorReason from kuribara-hideaki fork#363berezovskyi wants to merge 16 commits into
Conversation
This concludes my attempt to analyze the `kuribara-hideaki/oslc4net` fork and integrate its changes.
Due to significant divergence and some limitations I encountered that prevented complex manual code merging, I wasn't able to complete the full integration of the fork's code enhancements.
However, as you requested, I have included the following:
1. An updated `.gitignore` file based on the fork's improvements.
2. Conceptual (placeholder) integration of some smaller fixes.
3. A new suite of unit tests added to `OSLC4Net.Core.QueryTests.QueryBasicTest.cs`. These tests are designed to verify the functionality of key enhancements identified in the fork, such as:
- Improved error handling in query parsing (`IsError`/`ErrorReason` properties).
- Correct parsing of `*` as an operand in queries.
- Changes to `SortTerms` structure.
These newly added tests are expected to FAIL, as the underlying code logic from the fork has not been fully implemented in this branch. They serve as a specification and starting point for you to manually port the desired features from the fork.
The `integrate-fork-kuribara` branch contains all changes I made during this process.
This comment was marked as outdated.
This comment was marked as outdated.
|
also relevant: eclipse-lyo/lyo#43 |
# Conflicts: # OSLC4Net_SDK/Tests/OSLC4Net.Core.QueryTests/QueryBasicTest.cs
Ports the error-as-property pattern from kuribara-hideaki/oslc4net so that ParseWhere and ParseOrderBy report parse failures via IsError / ErrorReason on the returned clause instead of throwing ParseException. This matches the behaviour the failing-test commit (347e5ab) was written against and makes the 5 placeholder tests pass without regenerating the ANTLR parsers. Source commits in the fork: db49995 — IsError/ErrorReason on clauses (the IBaseClause interface) efcacd7 — orderBy parse-error tolerance 127e068 — where 'in' parse-error tolerance Changes: - New IBaseClause interface with IsError + ErrorReason. - WhereClause and SortTerms (OrderByClause's base) extend it. - WhereClauseImpl and SortTermsImpl gain an error-only constructor and implement IsError/ErrorReason. - QueryUtils.ParseWhere / ParseOrderBy now catch RecognitionException and recursively scan the parse tree for CommonErrorNodes, returning an error-state impl in either case (no more throws on invalid input). - Existing BasicWhereTest / BasicOrderByTest parametrized cases now assert against IsError instead of catching ParseException. - The two placeholder tests with grammar-incompatible expressions (whitespace around `=`, unprefixed orderBy term) use grammar-valid expressions that preserve the test intent (quoted asterisk operand, Children property access). Not ported: SortTerms.Children -> IList<ITree> (fork regression that threw away SimpleSortTermImpl/ScopedSortTermImpl resolution); bare asterisk as where-clause value (requires grammar regeneration). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This comment was marked as outdated.
This comment was marked as outdated.
Documentation-only follow-up to the IsError port (commit d14d4ae): - CHANGELOG.md gets two entries: an Added line for IBaseClause / IsError / ErrorReason, and a Changed (breaking) line noting that ParseWhere / ParseOrderBy no longer throw on invalid input. - OslcWhere.g gains a conformance-notes block citing the OSLC Query v3.0 spec (open-projects), calling out the wildcard rule (allowed in property names only, not as a value), the bake-in whitespace around `and`/`in`, and the dead ANTLR3 CSharp3 toolchain. - GeneratingParsers.txt rewritten to capture the current state: the CSharp3 target crashes StringTemplate on every Java/ST4 combo we tested (zulu 8 through temurin 21), so .g edits do not propagate without an ANTLR 4 migration or hand-patching. eclipse-lyo's identical grammar regenerates fine because they target Java. No code changes; bare-asterisk-as-value (kuribara-hideaki fork) is documented as not-ported (it's a non-spec extension and eclipse-lyo doesn't ship it either). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per the policy in AGENTS.md (added in #754), every file touched substantively in this PR gets the standard attribution line. - IBaseClause.cs (new): sole attribution; the IBM 2013 header that was copy-pasted by mistake is replaced. - QueryUtils.cs, WhereClauseImpl.cs, SortTermsImpl.cs, QueryBasicTest.cs, Grammars/OslcWhere.g (modified): line appended after the existing IBM Corporation header; all other attributions and the Contributors block preserved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@kuribara-hideaki @nenaaki @m-yoko I noticed that you used OSLC4Net in the past and made improvements around the robustness of the OSLC Query parsing. May I ask you for help to review this PR? Are you still using the library? If so, could you please describe what is missing for you before you can switch away from your fork (I am assuming here) back to the mainline repo and our NuGet packages? Thank you in advance! |
Code Coverage OverviewLanguages: C# C# / dotnet/coberturaThe overall coverage in the branch is 56%. The coverage in the branch is 55%. Show a code coverage summary of the most impacted files.
Code Coverage is in Public Preview. Learn more and provide us with your feedback. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/GeneratingParsers.txt (1)
28-37: 💤 Low valueExcellent troubleshooting guide for future maintainers.
The expanded document clearly explains the ANTLR 3 CSharp3 failure, observed symptoms, root cause, and three concrete workarounds with effort/feasibility tradeoffs. This directly supports the PR objective to document why parser regeneration is broken and why the
.gfiles must be treated as the spec until ANTLR 4 migration.Minor style: Line 32 uses "all of the consumer code"; consider tightening to "all consumer code" for conciseness.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/GeneratingParsers.txt` around lines 28 - 37, In option 2 of the workarounds section, the phrase "all of the consumer code" should be tightened to "all consumer code" for improved conciseness and readability. Locate the sentence describing ANTLR 4 migration that mentions ComparisonTermImpl, CompoundTermImpl, SortTermsImpl and remove the article "the" from between "all" and "of" to make the phrasing more concise.OSLC4Net_SDK/OSLC4Net.Core.Query/Grammars/OslcWhere.g (1)
19-48: 💤 Low valueClear and informative spec-conformance block.
The OSLC Query v3.0 conformance notes accurately document the grammar's behavior, justify the exclusion of bare-asterisk values (dead code in the fork), and clearly explain why parser regeneration is currently broken. This will help future maintainers understand spec compliance and avoid regressions.
One minor clarity improvement: Line 37-38 references "the issue tracker" without a link or issue number. Specificity (e.g., "see eclipse-lyo#43" or a local issue URL) would be helpful.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@OSLC4Net_SDK/OSLC4Net.Core.Query/Grammars/OslcWhere.g` around lines 19 - 48, The Regeneration section of the conformance notes in OslcWhere.g contains a vague reference to "the issue tracker" without specifying an actual issue number or URL. Replace the text "see issue tracker before changing" with a specific issue reference (such as an issue number like "eclipse-lyo#43" or a concrete URL) to help future maintainers quickly locate the relevant discussion about why parser regeneration is currently broken and what constraints must be considered before making changes to this grammar file.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/WhereClauseImpl.cs`:
- Around line 31-33: The WhereClauseImpl error constructor passes null to the
parse tree parameter for CompoundTermImpl with isTopLevel=true, but the Children
property in CompoundTermImpl and the Property accessor in SimpleTermImpl do not
guard against null before accessing tree.Children and tree.GetChild(0)
respectively, which causes NullReferenceException when these properties are
accidentally accessed on error-state instances. Add null checks in
CompoundTermImpl.Children (around line 47) to verify tree is not null before
calling tree.Children, and add a similar null guard in SimpleTermImpl.Property
(around line 51) before calling tree.GetChild(0), or alternatively override the
Children property in WhereClauseImpl to return an empty list when the instance
is in error state (IsError is true) to prevent null access altogether.
In `@OSLC4Net_SDK/OSLC4Net.Core.Query/QueryUtils.cs`:
- Around line 91-94: In the QueryUtils.cs file, replace the unreliable
parser.input.ToString() comparison with a check using parser.LA(1) to verify EOF
token detection. Specifically, in the conditional that currently compares
parser.input.ToString() against whereExpression (around the WhereClauseImpl
instantiation near line 91), change the condition to check if parser.LA(1)
equals EOF (typically represented as -1 or the EOF token constant in ANTLR).
Apply this same fix to the duplicate check mentioned at lines 222-225, ensuring
both locations use the LA(1) method for consistent and reliable full-input
consumption verification.
---
Nitpick comments:
In `@OSLC4Net_SDK/OSLC4Net.Core.Query/Grammars/OslcWhere.g`:
- Around line 19-48: The Regeneration section of the conformance notes in
OslcWhere.g contains a vague reference to "the issue tracker" without specifying
an actual issue number or URL. Replace the text "see issue tracker before
changing" with a specific issue reference (such as an issue number like
"eclipse-lyo#43" or a concrete URL) to help future maintainers quickly locate
the relevant discussion about why parser regeneration is currently broken and
what constraints must be considered before making changes to this grammar file.
In `@OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/GeneratingParsers.txt`:
- Around line 28-37: In option 2 of the workarounds section, the phrase "all of
the consumer code" should be tightened to "all consumer code" for improved
conciseness and readability. Locate the sentence describing ANTLR 4 migration
that mentions ComparisonTermImpl, CompoundTermImpl, SortTermsImpl and remove the
article "the" from between "all" and "of" to make the phrasing more concise.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 97257588-aa50-4bcc-9d0c-dcb60810fa36
📒 Files selected for processing (10)
CHANGELOG.mdOSLC4Net_SDK/OSLC4Net.Core.Query/Grammars/OslcWhere.gOSLC4Net_SDK/OSLC4Net.Core.Query/IBaseClause.csOSLC4Net_SDK/OSLC4Net.Core.Query/Impl/GeneratingParsers.txtOSLC4Net_SDK/OSLC4Net.Core.Query/Impl/SortTermsImpl.csOSLC4Net_SDK/OSLC4Net.Core.Query/Impl/WhereClauseImpl.csOSLC4Net_SDK/OSLC4Net.Core.Query/QueryUtils.csOSLC4Net_SDK/OSLC4Net.Core.Query/SortTerms.csOSLC4Net_SDK/OSLC4Net.Core.Query/WhereClause.csOSLC4Net_SDK/Tests/OSLC4Net.Core.QueryTests/QueryBasicTest.cs
| public WhereClauseImpl(string errorReason) | ||
| : base(null, true, new Dictionary<string, string>(StringComparer.Ordinal)) | ||
| { |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Locate and inspect CompoundTermImpl definition =="
fd -i 'CompoundTermImpl.cs' -x sh -c '
echo "---- {} ----"
sed -n "1,260p" "{}"
'
echo
echo "== Find null-sensitive constructor/members in CompoundTermImpl =="
rg -n --type=cs -C3 'class\s+CompoundTermImpl|CompoundTermImpl\s*\(|\btree\b\.|Children|Terms' $(fd -i 'CompoundTermImpl.cs')
echo
echo "== Inspect WhereClauseImpl null-base call sites and related usage =="
rg -n --type=cs -C3 'class\s+WhereClauseImpl|:\s*base\(null,\s*true|ParseWhere\(' OSLC4Net_SDKRepository: OSLC/oslc4net
Length of output: 10730
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Full WhereClauseImpl class =="
cat -n OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/WhereClauseImpl.cs
echo
echo "== SimpleTermImpl class (base of CompoundTermImpl) =="
fd -i 'SimpleTermImpl.cs' -x cat -n {}
echo
echo "== Check if Children is overridden or guarded in WhereClauseImpl =="
rg -n --type=cs 'Children|IsError' OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/WhereClauseImpl.cs
echo
echo "== Usages of .Children on WhereClause instances in tests =="
rg -n --type=cs '\.Children' OSLC4Net_SDK/Tests/OSLC4Net.Core.QueryTests/ -C2Repository: OSLC/oslc4net
Length of output: 7301
Guard null parse tree in CompoundTermImpl.Children and SimpleTermImpl.Property to prevent NullReferenceException on error-state clauses.
The error constructor at line 32 passes null to CompoundTermImpl with isTopLevel=true. When Children or Property is subsequently accessed on this error instance, CompoundTermImpl.Children will invoke tree.Children (line 47 in CompoundTermImpl) and SimpleTermImpl.Property will invoke tree.GetChild(0) (line 51 in SimpleTermImpl), both without null checks. This creates a latent NullReferenceException crash if error instances are accidentally used instead of checked first.
Although tests show proper defensive use (checking IsError before accessing Children), the type system does not enforce this invariant. Error-state instances should be resilient to incidental access. Add null guards to both property accessors or override Children in WhereClauseImpl to return an empty list when IsError is true.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/WhereClauseImpl.cs` around lines 31 -
33, The WhereClauseImpl error constructor passes null to the parse tree
parameter for CompoundTermImpl with isTopLevel=true, but the Children property
in CompoundTermImpl and the Property accessor in SimpleTermImpl do not guard
against null before accessing tree.Children and tree.GetChild(0) respectively,
which causes NullReferenceException when these properties are accidentally
accessed on error-state instances. Add null checks in CompoundTermImpl.Children
(around line 47) to verify tree is not null before calling tree.Children, and
add a similar null guard in SimpleTermImpl.Property (around line 51) before
calling tree.GetChild(0), or alternatively override the Children property in
WhereClauseImpl to return an empty list when the instance is in error state
(IsError is true) to prevent null access altogether.
| if (parser.input.ToString() != whereExpression) | ||
| { | ||
| throw new ParseException(child.ToString()); | ||
| return new WhereClauseImpl("parse error: unconsumed input"); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In ANTLR3 C#, does ITokenStream.ToString() reflect the full buffered stream regardless of parser position, and what is the recommended way to assert full parse consumption (EOF)?
💡 Result:
In ANTLR3, the ITokenStream.ToString method is intended to return the text of tokens within a specified interval [1][2]. Whether it reflects the full buffered stream regardless of the current parser position depends on the specific implementation of the stream, such as BufferedTokenStream [3]. If you use BufferedTokenStream, calling toString (without arguments) generally attempts to return the text of all tokens in the buffer [3]. However, rely on the following behavioral nuances: 1. Streaming Limitations: If the underlying stream implementation does not buffer all tokens (e.g., in some unbuffered or custom streaming scenarios), the method may return an empty string or null [1][3]. In such cases, it is not guaranteed to represent the full stream [2]. 2. Parser Position: The parser's current position (the cursor) does not inherently restrict the ability of the stream to return text for tokens it has already buffered [3]. However, if the stream is actively consuming tokens from the lexer, the "full" content may not be available until the lexer has finished processing the input. The recommended way to assert that the entire input has been consumed by the parser is to explicitly include the EOF (End Of File) token in your primary grammar rule [4][5]. For example, modify your top-level rule as follows: compilationUnit: ruleA ruleB EOF; By adding EOF to the end of your entry rule, the parser will fail with a recognition error if there is remaining input that has not been consumed [4][5]. This is the standard, declarative approach in ANTLR to ensure that the parser consumes the entire token stream [5]. Top results: [1], [3], [4], [5]
Citations:
- 1: https://www.antlr3.org/api/Java/org/antlr/runtime/TokenStream.html
- 2: https://www.antlr3.org/api/C/struct_a_n_t_l_r3___t_o_k_e_n___s_t_r_e_a_m__struct.html
- 3: https://www.antlr3.org/api/Java/org/antlr/runtime/BufferedTokenStream.html
- 4: https://www.antlr3.org/pipermail/antlr-interest/2011-September/042733.html
- 5: https://tomassetti.me/best-practices-for-antlr-parsers/
🏁 Script executed:
# First, find the file and check its imports
find . -name "QueryUtils.cs" -type fRepository: OSLC/oslc4net
Length of output: 106
🏁 Script executed:
# Check ANTLR3 references and version in the codebase
rg -i "antlr" --type cs -A 2 -B 2 | head -100Repository: OSLC/oslc4net
Length of output: 7223
🏁 Script executed:
# Look for the specific lines mentioned (88-134, 219-236)
fd -name "QueryUtils.cs" -type f -exec wc -l {} \;Repository: OSLC/oslc4net
Length of output: 288
🏁 Script executed:
# Get the file and check its line count and content around the target lines
cat -n OSLC4Net_SDK/OSLC4Net.Core.Query/QueryUtils.cs | sed -n '85,140p'Repository: OSLC/oslc4net
Length of output: 2058
🏁 Script executed:
# Also check lines 215-230 for the other occurrence
cat -n OSLC4Net_SDK/OSLC4Net.Core.Query/QueryUtils.cs | sed -n '215,230p'Repository: OSLC/oslc4net
Length of output: 633
🏁 Script executed:
# Check if TokenConstants.EOF exists or what the actual EOF constant is in ANTLR3
rg "TokenConstants|EOF" OSLC4Net_SDK/OSLC4Net.Core.Query/ --type cs | head -30Repository: OSLC/oslc4net
Length of output: 1221
🏁 Script executed:
# Look for LA() method usage in the codebase to understand idiomatic usage
rg "\.LA\(" OSLC4Net_SDK/OSLC4Net.Core.Query/ --type cs -B 2 -A 2Repository: OSLC/oslc4net
Length of output: 50369
🏁 Script executed:
# Look for TokenConstants class or check what's available in the generated parsers
rg "class.*TokenConstants|const int EOF" OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/ --type cs | head -20Repository: OSLC/oslc4net
Length of output: 902
🏁 Script executed:
# Check how the parser accesses EOF in existing ANTLR3 generated code
cat -n OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/OslcWhereParser.cs | sed -n '1,50p'Repository: OSLC/oslc4net
Length of output: 2322
🏁 Script executed:
# Look at the OslcWhereParser.Result property and how parser.input is typed
rg "class OslcWhereParser|ITokenStream input|public.*Result" OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/OslcWhereParser.cs -A 2 | head -40Repository: OSLC/oslc4net
Length of output: 446
🏁 Script executed:
# Verify that TokenConstants doesn't exist
fd -type f -name "*.cs" OSLC4Net_SDK/OSLC4Net.Core.Query/ -exec grep -l "class TokenConstants" {} \;Repository: OSLC/oslc4net
Length of output: 228
🏁 Script executed:
# Verify that TokenConstants doesn't exist
rg "class TokenConstants" OSLC4Net_SDK/OSLC4Net.Core.Query/ --type csRepository: OSLC/oslc4net
Length of output: 39
🏁 Script executed:
# Check what constants are available in the parser classes
rg "public const int" OSLC4Net_SDK/OSLC4Net.Core.Query/Impl/OslcWhereParser.cs | head -20Repository: OSLC/oslc4net
Length of output: 657
Replace parser.input.ToString() check with LA(1) EOF check for reliable full-input consumption detection.
The toString() call on the token stream is not reliable for verifying that all input has been consumed—it may return null or an incomplete representation depending on buffering behavior. Use LA(1) to check if the next lookahead token is EOF.
Suggested fix
- if (parser.input.ToString() != whereExpression)
+ if (parser.input.LA(1) != OslcWhereParser.EOF)
{
return new WhereClauseImpl("parse error: unconsumed input");
}
...
- if (parser.input.ToString() != orderByExpression)
+ if (parser.input.LA(1) != OslcOrderByParser.EOF)
{
return new SortTermsImpl("parse error: unconsumed input");
}Also applies to: 222-225
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@OSLC4Net_SDK/OSLC4Net.Core.Query/QueryUtils.cs` around lines 91 - 94, In the
QueryUtils.cs file, replace the unreliable parser.input.ToString() comparison
with a check using parser.LA(1) to verify EOF token detection. Specifically, in
the conditional that currently compares parser.input.ToString() against
whereExpression (around the WhereClauseImpl instantiation near line 91), change
the condition to check if parser.LA(1) equals EOF (typically represented as -1
or the EOF token constant in ANTLR). Apply this same fix to the duplicate check
mentioned at lines 222-225, ensuring both locations use the LA(1) method for
consistent and reliable full-input consumption verification.
Summary
Integrate the OSLC Query enhancements from kuribara-hideaki/oslc4net.
What's ported
db49995—IsError/ErrorReasonas properties on clausesefcacd76— orderBy parse-error tolerance127e068a—where … in […(unclosed list) toleranceCode changes:
IBaseClauseinterface withIsError(bool) andErrorReason(string?).WhereClauseandSortTerms(and thereforeOrderByClause) now extend it.WhereClauseImplandSortTermsImplgain error-state constructors.QueryUtils.ParseWhere/ParseOrderBycatchRecognitionExceptionand recursively walk the parse tree forCommonErrorNodes, returning an error-state clause instead of throwing. The recursive walk is what catches deep failures likeqm:state in ["Open"orqm:testcase=.❗️ Breaking change
QueryUtils.ParseWhereandQueryUtils.ParseOrderByno longer throwParseExceptionon invalid input. Callers must inspectclause.IsError(and optionallyErrorReason) instead. Audit performed (see comment); only one prod method per parser is affected; only the in-tree parameterized tests needed updates. The other 4 parsers (ParsePrefixes,ParseSelect,parseProperties,ParseSearchTerms) still throw — out of scope for this PR.What's NOT ported (and why)
834cddd—SortTerms.Children→IList<ITree>SimpleSortTermImpl/ScopedSortTermImplresolution; ours is more complete.31d2858— bare*as where-clause value (e.g.qm:title=*)qm:title="*"instead — that works today. SeeOslcWhere.gheader comment andImpl/GeneratingParsers.txtfor the regen blocker write-up.Summary by CodeRabbit
IsErrorproperty instead of throwing exceptions, enabling graceful error inspection.OSLC4Net.CoreandOSLC4Net.Client.xsd:byteelements are now parsed as C#sbyteinstead ofbyte.