Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
I don't like how the return type is Switched it to wrapping the non-tiled result as a single-element vector. The main use case should be comparing full layouts, and that should really be done tiled anyway for performance reasons, but at least this way you can directly swap to the non-tiled version for testing if you're worried about edge artifacts. Also just use the exact tile size in the keyword argument, don't try to be clever and fit it to the layout. |
|
I'm having trouble identifying what exactly changed in the code that was moved to clipping.jl. Giving it a quick skim, it looks like it wasn't just a copy-paste from polygons.jl, for example some |
| function is_sliver(p::Polygon{T}; atol=DeviceLayout.onenanometer(T)) where {T} | ||
| return area(p) / perimeter(p) < atol | ||
| end |
There was a problem hiding this comment.
Is this a heuristic, or does this formula come from somewhere? Also it looks like it isn't used anywhere other than for testing, would it be cleaner to just define this in the test files and leave it out of the API docs?
There was a problem hiding this comment.
It's a heuristic, but pretty common, I think. What we really want is something like "you could move the boundary by less than atol and zero out the area", and this is basically that with an "on average". I'm missing a factor of 2 compared with the version here, which I've now added in. The idea is that a rectangle with length l and width atol has 2*area/perimeter approximately atol. I think this would be of practical use in the API for filtering geometry diffs.
The other sliver definition I've found is in ArcGIS, which uses a "thinness parameter" T = 4pi A / P^2 (1 for a circle, 0 for a line), but I like having a tolerance with an interpretable dimension.
There was a problem hiding this comment.
I see ok, yeah I can see how this holds for very small rectangles, but my guess is that it's a bad approximation for arbitrary polygons with side lengths >> atol. For "round trip XORs" like the ones you did here all residual polygons after the XOR will have this property (side lengths < atol) so this seems like a useful filtering function. But how often are users going to be doing these kinds of round-trip checks? This seems more like something we'd only use during internal testing. This package's API is already huge so IMO we should start raising the bar for what gets added to it.
There was a problem hiding this comment.
I think it’s surprisingly hard to get false positives, but it’s possible for “flag” polygons with a long sub-nm sliver sticking out from one corner to be counted as “on average” slivers this way. I’ve seen those generated by layout programs (not ours) that don’t make keyhole polygons correctly, so it’s not totally exotic, although a millimeter-long sliver on a square micron of real area would be pretty extreme.
I’ll move it to the test setup block, since it’s used in multiple test items across the suite.
Sorry about that. I tried moving clipping back inside polygons.jl to improve the diff but it didn't help, too much had been moved. The first commit is a clean move of code from polygons.jl to clipping.jl, so if you look at this diff |
|
Rebased and added handling of empty entity lists to clip_tiled. The diff after the code move is now here. |
| union2d(p::AbstractGeometry{T}) where {T} = union2d(p, Polygon{T}[]) | ||
| union2d(p::AbstractArray{<:AbstractGeometry{T}}) where {T} = union2d(p, Polygon{T}[]) | ||
| union2d(p::AbstractArray) = union2d(p, Polygon{DeviceLayout.coordinatetype(p)}[]) | ||
| union2d(p::Pair{<:AbstractGeometry{T}}) where {T} = union2d(p, Polygon{T}[]) |
There was a problem hiding this comment.
Why was it necessary to add this for union2d but not the other boolean ops?
There was a problem hiding this comment.
Only union2d has single-argument methods. (You can call the others with an empty second argument but you'd only do it on purpose for union.)
| end) | ||
| end | ||
|
|
||
| function clip_layerwise( |
There was a problem hiding this comment.
I'm not convinced that we need to have a brand new interface for layerwise clipping. IIUC current booleans union2d, difference2d, etc. can currently take arguments that are any combination of
GeometryEntityArray{<:GeometryEntity}GeometryStructureGeometryReferencePair{<:Union{GeometryStructure, GeometryReference}}
where options (3, 4) do a flattening operation and then take the elements, and option (5) grabs a specific layer and flattens. So they already have rich behavior, and therefore adding an additional layerwise keyword would seem quite reasonable to me. Basically a boolean flag that affects the behavior of options (3, 4). That way, tiled clipping could also be an option there; right now it's only applied to the layerwise methods and I'm not sure why if the performance benefits can be significant.
There was a problem hiding this comment.
The main reason I didn't do that is that the return type would then depend on the value of a keyword argument, which seems questionable for both interface and type-stability reasons. The current booleans always return a single ClippedPolygon. Tiling should return an iterable over ClippedPolygon (lazy iterator if it's going to be useful), and layerwise operations have to return a Dict{Meta}. I already have non-tiled layerwise operations wrapping Dict values as a vector with a single ClippedPolygon per layer for interface consistency with the tiled version.
I didn't add union2d_tiled etc because I think in practice big (>1000 polygon?) boolean operations tend to be layerwise anyway, but that would be as simple as adding union2d_tiled(...) = clip_tiled(Clipper.ClipTypeUnion, ...) etc if you think it's worth it. I'm reluctant to add it as a keyword option for the same reason as above.
There was a problem hiding this comment.
Ok I guess I can see the argument both ways when it comes to complicating the interface so happy to concede on this one. Regarding union2d_tiled etc, that makes sense, right now I can't think of any situations when I've asked for massive bools that are not layerwise so let's just leave that out of scope here and add it later if the need arises?
| tool::GeometryStructure; | ||
| only_layers=[], | ||
| ignore_layers=[], | ||
| tile_size=nothing, |
There was a problem hiding this comment.
Would it be possible to automatically compute the tile size and hide this from the user (instead only offering a tiled::Bool keyword to turn it on or off)? This would be a huge simplification to the interface, I suspect the laziness barrier will keep a lot of people from ever accessing this functionality if they have to manually figure out and pass a tile size that is good.
There was a problem hiding this comment.
Sure, added a tiled keyword that heuristically aims for 100 entities per tile. Left tile_size in as a manual override.
| findbox, | ||
| footprint, | ||
| halo, | ||
| mbr_spatial_index, |
There was a problem hiding this comment.
Can we keep these off the public API for now? I don't see the direct value to users, and I also don't like that you have to manually pass an index to findbox if you don't want it to re-index every single time. This feels like a use case for caching (which we can implement later) since based on my cursory reading of the Wikipedia entry on R-Trees, building the trees can be quite expensive.
There was a problem hiding this comment.
Yeah, makes sense, removed those from exports and HTML docs. Agree that we should avoid giving users a way to accidentally re-index in a hot loop, e.g. looks like findbox for all tiles combined has a cost comparable to building one tree in the tests here.
| tiled=false, | ||
| tile_size=nothing, | ||
| depth=-1, | ||
| pfs=Clipper.PolyFillTypePositive, |
There was a problem hiding this comment.
I noticed that both this and clip_tiled default to PolyFillTypePositive, but the original clip defaults to PolyFillTypeEvenOdd. This is usually masked by the fact that the user-facing methods like union2d actually pass PolyFillTypePositive to clip by default, but if anyone were to ever use the bare clip or clip_* methods they'd get different behavior.
There was a problem hiding this comment.
Hmm, looks like the clip default is a holdover from Devices.jl. It's basically never what we want, and internally clip is used directly in one place, where it does use the default but the polyfill type doesn't matter. I'd lean towards changing it to Positive, but if you think that's too risky (clip is exported) I could change the clip_tiled default to EvenOdd for consistency, or leave it as is.
There was a problem hiding this comment.
I know that for our internal users it wouldn't be a risk (99% sure at least) since I've never seen code that uses clip directly; for external users I don't know. Your call but if it were up to me I'd just change the clip default to Positive (implicitly assuming that that's already the behavior that pretty much everyone's been using through union2d, difference2d, etc.)
By the way, related to this - this is why I have a bias towards not exporting functions unless there's a pretty good reason to. It makes it harder to introduce non-breaking changes down the line.
There was a problem hiding this comment.
OK, default changed. Fair point on exporting functions. Really clip shouldn't be public at all now that we have union2d etc, that's just another Devices.jl legacy.
| Because entities touching more than one tile will be included in multiple operations, | ||
| resulting polygons may be duplicated or incorrect. This is often acceptable, as in the case of `xor2d_layerwise` when the goal | ||
| is to find small differences between layouts: layouts are never incorrectly identified as identical, | ||
| and false positives are rare. In other cases, you may need to do some postprocessing or of the results. |
There was a problem hiding this comment.
I missed this on my first pass, but this sounds quite concerning. I agree this sounds OK for XOR but for difference2d you could easily get errors. Why aren't we clipping entities to their containing tiles to avoid this altogether?
There was a problem hiding this comment.
I think that should work, it’s just an extra intersection operation now that #196 lets us skip the intermediate interiorcuts. I’ll benchmark to see how expensive it is. If it’s significant I can try only intersecting entities found touching tile edges with findbox, or add a keyword option.
There was a problem hiding this comment.
OK, intersecting everything with the whole tile doubles the time, but if we only intersect entities touching a tile edge, there's no measurable difference in the QPU17 case. Added logic to do that (always, no keyword option) and added a test covering it.
There was a problem hiding this comment.
Nice! Thanks. Yeah this would have been a correctness problem in some cases but I'm pretty sure with this fix it should be fine now.
Move clipping to its own file
Layerwise clipping Fix accidental reversion of 1D entity change Add tests with area, is_sliver Add more docstrings and CHANGELOG Add findbox and boolean interface tests Use exact tile size, wrap non-tiled result in vector Make sure mbr always uses float coordinates Fix bugged format Fix h_edges, remove unused code Re-format and add docstrings after rebase Add factor of 2 to sliver formula Move is_sliver definition to test setup
|
|
||
| tiles, edges = tiles_and_edges(bnds_dl, tile_size) # DeviceLayout Rectangles | ||
| # Get single vector of all entity indices touching any edge | ||
| edge_touching_idx1 = mapreduce(vcat, edges, init=Int[]) do edge |
There was a problem hiding this comment.
It seems like there can be a lot of repeated indices here, which will translate to a more expensive membership check in line 2536 below. I tried wrapping the output of mapreduce with Set and it halved the runtime of xor2d_layerwise using a contrived example I constructed (both in the case of auto tile size and manual tile size). It also reduced memory allocation significantly (>2x). Try it on your end and see if you can see a similar improvement.
There was a problem hiding this comment.
Good call, change applied.

Update of #41 using a lazy iterator over tiles (giving up on healing). Adds layerwise booleans
xor2d_layerwiseetc applied across pairs of GeometryStructures.Closes #100 and #30.