Safe Access to Contiguous Storage#2307
Conversation
Co-authored-by: Alex Martini <amartini@apple.com>
- this is redundant, since it is the same as `String.UTF8View`
|
|
||
| ```swift | ||
| extension Span where Element: ~Copyable & ~Escapable { | ||
| func withUnsafeBufferPointer<E: Error, Result: ~Copyable & ~Escapable>( |
There was a problem hiding this comment.
I don't think Result can be ~Escapable here without the lifetimes proposal.
I have a meta-question about the design here. Span is already non-escaping itself, and these operations that produce an unsafe buffer pointer obviously have to break that requirement. Given that, it doesn't feel like using the closure here is providing extra safety over (e.g.) a property that produces the UnsafeBufferPointer, but it's a lot less ergonomic. Could we consider instead to have:
var unsafeBufferPointer: UnsafeBufferPointer<Element> { get }There was a problem hiding this comment.
Span.withUnsafeBufferPointer ensures that the Span's lifetime, and whatever that span depends on, extend across all valid uses of the pointer. The rules for pointer validity in closure taking methods are clear. There's nothing new here. As long as programmers follow the usual rules, they can't introduce any new form of unsafety.
The compiler should be able to diagnose pointer escapes from the closure, but that's a separate discussion!
| func withUnsafeBytes<E: Error, Result: ~Copyable & ~Escapable>( | ||
| _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result | ||
| ) throws(E) -> Result | ||
| } | ||
|
|
||
| extension RawSpan { | ||
| func withUnsafeBytes<E: Error, Result: ~Copyable & ~Escapable>( | ||
| _ body: (_ buffer: UnsafeRawBufferPointer) throws(E) -> Result | ||
| ) throws(E) -> Result | ||
| } |
There was a problem hiding this comment.
Same suggestion about having a property here:
var unsafeTypes: UnsafeRawBufferPointer { get }| } | ||
| ``` | ||
|
|
||
| ##### Accessing subranges of elements: |
There was a problem hiding this comment.
As I noted earlier on, all of the extracting functions depend on lifetime dependencies. Perhaps provide a closure-based withSubspan?
| /// - Parameters: | ||
| /// - type: The type as which to view the bytes of this span. | ||
| /// - Returns: A typed span viewing these bytes as instances of `T`. | ||
| public func unsafeView<T: BitwiseCopyable>(as type: T.Type) -> Span<T> |
There was a problem hiding this comment.
IIUC, this requires lifetime dependencies. Change it to a closure-taking withUnsafeView?
| /// The closure's parameter is valid only for the duration of | ||
| /// its execution. | ||
| /// - Returns: The return value of the `body` closure parameter. | ||
| func withUnsafeBytes<E: Error, Result: ~Copyable & ~Escapable>( |
There was a problem hiding this comment.
As noted earlier, can this be a property?
| ```swift | ||
| extension RawSpan { | ||
|
|
||
| public func extracting(_ bounds: Range<Int>) -> Self |
There was a problem hiding this comment.
Same issue as earlier; this is based on lifetime dependencies.
|
|
||
| Some types store their internal representation in a piecewise-contiguous manner, such as trees and ropes. Some operations naturally return information in a piecewise-contiguous manner, such as network operations. These could supply results by iterating through a list of contiguous chunks of memory. | ||
|
|
||
| #### <a name="MutableSpan"></a>Safe mutations of memory with `MutableSpan<T>` |
There was a problem hiding this comment.
Looking forward to this proposal! ;)
| public func extracting(droppingFirst k: Int) -> Self | ||
| /// - Parameters: | ||
| /// - indices: a range of indices to validate | ||
| public func boundsPrecondition(_ indices: Range<Int>) |
There was a problem hiding this comment.
span.boundsPrecondition(range) is just precondition(span.boundsContain(range)), right? Do we really need/want sugar for this?
Re: naming, I think things get somewhat easier if we drop the precondition func, then we just have boundsContain, whose naming isn't great because it's kind of oblique. The question we're really asking is "are the indices in this range valid for this span?". Something like span.indicesContain(range), though that then raises the question of "why isn't this just span.indices.contains(range)?" (probably performance?)
There was a problem hiding this comment.
If we have these, do we also need ClosedRange versions?
There was a problem hiding this comment.
Yes, boundsPrecondition's implementation is precondition(boundsContain(range)). There's maybe an argument to be made that the sugar hurts here, since it doesn't capture the correct line of code in the defaulted arguments of precondition().
I haven't needed the ClosedRange versions because the RangeExpression-taking functions end up forwarding to the usual Range-taking functions. Maybe we would want them. We could take out the sugared preconditions and add ClosedRange bounds-checking functions.
DougGregor
left a comment
There was a problem hiding this comment.
This is looking great to me, thank you
| public func extracting(droppingFirst k: Int) -> Self | ||
| /// - Parameters: | ||
| /// - indices: a range of indices to validate | ||
| public func boundsPrecondition(_ indices: Range<Int>) |
There was a problem hiding this comment.
If we have these, do we also need ClosedRange versions?
| public func withSpan<E: Error, Result: ~Copyable>( | ||
| _ body: (_ elements: Span<Element>) throws(E) -> Result | ||
| ) throws(E) -> Result |
There was a problem hiding this comment.
I'm one of those people who would like this API :)
There was a problem hiding this comment.
The complex semantics of this makes it undesirable, as exhibited by the overcomplicated signature. What we actually need is a storage property that directly returns a Span, tied to the lifetime of self.
| ```swift | ||
| @frozen | ||
| public struct Span<Element: ~Copyable & ~Escapable>: Copyable, ~Escapable { | ||
| internal var _start: UnsafePointer<Element> |
There was a problem hiding this comment.
Span has the wrong pointer type: UnsafePointer<Element>. The underlying pointer should be UnsafeRawPointer. An important goal of Span is that it can be used to reinterpret memory as different BitwiseCopyable elements. It needs to be possible to convert Span<T: BitwiseCopyable> to Span<U: BitwiseCopyable>. That should require an explicit unsafe API but should still be well-defined. If you do this conversion with an underlying typed pointer, you get undefined behavior. We want multiple read-only spans to safely view the same underlying memory as different types. We also want to be able to view slices of a RawSpan as a particular element type. This is the same problem. Note that RawSpan does not provide any additional functionality; it is only a convenience for buffers of heterogenous elements, where declaring an element type might be misleading at the API level.
We do not want to describe strict aliasing rules and undefined behavior in the Span API!
There was a problem hiding this comment.
Hm; unrelated to the above: didn't we intend to use buffer-pointer layout for this? The start pointer ought to be optional.
There was a problem hiding this comment.
I've updated this to be optional, with an explanation (hadn't noticed that a push failed on Friday.) We won't use buffer-pointer layout after all, because we do want the pointer to be raw.
| extension RawSpan { | ||
| public init<T: BitwiseCopyable>(_ span: Span<T>) | ||
|
|
||
| public func unsafeView<T: BitwiseCopyable>(as type: T.Type) -> Span<T> |
There was a problem hiding this comment.
Since you're mentioning RawSpan.unsafeView<T: BitwiseCopyable>(as type: T.Type) -> Span<T> here, we may also want to explain that this is a safe API as long as the memory contains a value bit pattern for T at each position. Span<T> does not impose any strict aliasing demands on the programmer; they can use multiple read-only spans to safely view the same underlying memory as different types.
| public var count: Int { get } | ||
| public var isEmpty: Bool { get } | ||
|
|
||
| public subscript(_ position: Int) -> Element { _read } |
There was a problem hiding this comment.
public subscript(_ position: Int) -> Element { _read }
This is not, in fact, the API that we want long term; it won't be compatible with lifetime-dependent values that are derived from the span.
We really want some new spelling for ~Copyable projections, like:
public subscript(_ position: Int) -> Element { borrow }
For now, we could say:
public subscript(_ position: Int) -> Element { unsafeAddress }
Either way, we end up with an extra mangling for this API that may become vestigial in the future. But with unsafeAddress, at least we have the full functionality of some future, safe "borrow accessor".
@jckarter what do you think?
There was a problem hiding this comment.
The future directions attempts to explain that the _read accessor is a placeholder. I expect the implementation of such placeholders to be non-ABI until there is a final spelling. I could add a reference to the future directions item after this.
There was a problem hiding this comment.
Ok. I realize the plan is to initially implement Span.subscript via _read. I'll just point out that it will limit usability. unsafeAddress will at least support the same use cases that a borrow accessor would support.
- add `boundsContain(_ bounds: ClosedRange<Int>)`
| ```swift | ||
| @frozen | ||
| public struct Span<Element: ~Copyable & ~Escapable>: Copyable, ~Escapable { | ||
| internal var _start: UnsafePointer<Element> |
There was a problem hiding this comment.
Hm; unrelated to the above: didn't we intend to use buffer-pointer layout for this? The start pointer ought to be optional.
| /// - Parameters: | ||
| /// - index: an index to validate | ||
| /// - Returns: true if `index` is a valid index | ||
| public func boundsContain(_ index: Int) -> Bool |
There was a problem hiding this comment.
This name needs to be reconsidered. isValidIndex(_:)
| /// Parameters: | ||
| /// - span: a span of the same type as `self` | ||
| /// Returns: whether `span` is a subrange of `self` | ||
| public func contains(_ span: borrowing Self) -> Bool |
There was a problem hiding this comment.
Unfortunately, we have blown the contains name to mean elementwise containment, not structural containment. This usage is not compatible with that, so we need to use a different name for it.
isSuperspan(of:)? isSubspan(of:)? isWider(than:)? 😕
There was a problem hiding this comment.
[Notes from in person discussions]
-
span.isWithin(other)appears to be a workable spelling for this. -
We also need to have an
isIdentical(to:)method, to check equivalence. (a.isIdentical(to: b) == a.isWithin(b) && b.isWithin(a), but we wouldn't want to spell/implement it that way.) -
If the elements are bitwise copyable, then this operation needs to check alignment, which is a surprising complication, explaining why we'd want to ship this functionality in the stdlib. (This also needs to be called out within the docs.)
This may also turn into a solid argument against using the
isWithinname, especially when considering the equivalent APIs onRangein point 5 below. (If so, perhaps something likeisSubspanis the way to go.) -
We should add equivalent members to the
UnsafeBufferPointertypes, to keep the stdlib coherent with itself, and to resolve some long standing omissions of basic functionality.extension Unsafe[Mutable]BufferPointer { func isIdentical(to other: UnsafeBufferPointer<Element>) -> Bool func isIdentical(to other: UnsafeMutableBufferPointer<Element>) -> Bool func isWithin(_ other: UnsafeBufferPointer<Element>) -> Bool func isWithin(_ other: UnsafeMutableBufferPointer<Element>) -> Bool }
-
isWithinought to also be added to the range types:extension Range { func isWithin(other: Range) -> Bool func isWithin(other: ClosedRange) -> Bool } extension ClosedRange { func isWithin(other: Range) -> Bool func isWithin(other: ClosedRange) -> Bool }
(These types already provide the functionality of
isIdenticalin their==operation.)The notion of unaligned bounds makes this somewhat dangerous --
Range<UnsafePointer<T>>.isWithin(_:)would sometimes return different results than anisWithincall onUnsafeBufferPointerorSpanvalues over the same region of memory. -
The preexisting
[Closed]Range.overlapshints at a need to provide[Raw]Span.overlapsandUnsafe[Mutable][Raw]BufferPointer.overlapsoperations. (As well asintersectandunion/merge/combineoperations.) This does not seem to be an urgent need though.
| /// This is an unsafe operation. Failure to meet the preconditions | ||
| /// above may produce an invalid value of `T`. |
There was a problem hiding this comment.
When the address has incorrect alignment, are we really going to return an invalid value? (Rather than, say, sometimes causing a fault?)
There was a problem hiding this comment.
There exist systems where a load instruction simply masks off the low bits and always loads whatever it finds at the resulting aligned address (AltiVec loads behave this way on PPC, for example).
There was a problem hiding this comment.
Unaligned loads can fault in embedded world as well, especially when targeting memory that isn't DRAM. In a sense, the worst outcome isn't necessarily an invalid value of T.
There was a problem hiding this comment.
The point of this is that there is no trap other than bounds checking, and the T instance's memory could be in a state that does not respect T's invariants.
| public func contains(_ span: borrowing Self) -> Bool | ||
|
|
||
| public func offsets(of span: borrowing Self) -> Range<Int> |
There was a problem hiding this comment.
Same problems as with Span -- contains implements the wrong semantics for that name, and offsets(of:) has an awkward name/shape.
There was a problem hiding this comment.
public func isWithin(_ other: borrowing Self) -> Bool // Perhaps removed
public func byteOffsets(of subspan: borrowing Self) -> Range<Int>?
| public func withSpan<E: Error, Result: ~Copyable>( | ||
| _ body: (_ elements: Span<Element>) throws(E) -> Result | ||
| ) throws(E) -> Result |
There was a problem hiding this comment.
The complex semantics of this makes it undesirable, as exhibited by the overcomplicated signature. What we actually need is a storage property that directly returns a Span, tied to the lifetime of self.
|
|
||
| ```swift | ||
| @frozen | ||
| public struct Span<Element: ~Copyable & ~Escapable>: Copyable, ~Escapable { |
There was a problem hiding this comment.
Why are we proposing Span with support for nonescapable elements?
This is a proposal for safe access to borrowed memory, adding
Span(formerly known as "BufferView") andRawSpan.This proposal is a companion to (and depends on) Non-Escapable Types and on Lifetime Dependency Annotations for Non-escapable Types.