Include token in subgraph even if not covered by segmentation node#334
Conversation
|
I added another test to close the (pre-existing) gap in test coverage. The remaining CI failure seems to be a permissions issue. |
| let right_seg = *covering_segmentation_nodes | ||
| .last() | ||
| .ok_or(GraphAnnisError::NoCoveredTokenForSubgraph)?; | ||
| let (left_seg, right_seg) = match covering_segmentation_nodes[..] { |
There was a problem hiding this comment.
I was wondering if it was better to keep the error at part of the code and handle it in the calling function, thus falling back to the covered token borders in any case when there is an error. The method name could otherwise be a little bit misleading because one could expect it to only work when the segmentation coverage is given.
Since get_left_right_token_with_offset_with_segmentation is only part of the internal API of subgraph.rs and not (crate) public this probably does not make a lot of a difference where the fallback is implemented.
Also, the fallback at this position makes sense, since the segmentation parameter is only used to extend the context to the left and right, not to reduce the already existing context given by the covered token by throwing an error.
There was a problem hiding this comment.
I had similar thoughts when deciding where to put the fallback, and concluded on the reasoning you're giving in your last paragraph. The with_segmentation in the method name could be understood as "extend the context according to the segmentation if there is segmentation coverage" (and otherwise don't extend it).
Handling it at the call site by applying the fallback in case of any error can potentially obscure other, unrelated, errors, so I thought keeping it here is the safer option.
|
Thanks for the fix! The failing workflow is actually just the part where the verify.sh output is added as a comment to the PR. I might have to adapt this to check if the permission before trying to add the comment. I added a commit with a changelog entry, which would also probably run the CI again with the correct permissions. |
This fixes a regression I introduced in #289: When
subgraphis called with a token and a segmentation, and the token is not covered by any segmentation node, then the token is not included in the subgraph.In #289 I had filtered out matched tokens to ensure that tokens are sorted, claiming that
new_overlapped_nodes_iteratoralready included the matched tokens. However, if a segmentation is specified, this is only true for tokens that are covered by a segmentation node.The fix changes
get_left_right_token_with_offset_with_segmentationso that instead of failing, it returns just the covered tokens without context in case none of them is covered by a segmentation node.I hope this doesn't break any other invariants. At least the existing tests are still green. 🙂