From 449f447e5d7c376da4993b4c04a904c79a6cc5ba Mon Sep 17 00:00:00 2001
From: Benoit Vey
Date: Fri, 27 Jan 2017 05:16:28 +0100
Subject: [PATCH 1/4] Stateful exceptions draft

---
 text/0000-stateful-exceptions.md | 70 ++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 text/0000-stateful-exceptions.md

diff --git a/text/0000-stateful-exceptions.md b/text/0000-stateful-exceptions.md
new file mode 100644
index 00000000..fffb388a
--- /dev/null
+++ b/text/0000-stateful-exceptions.md
@@ -0,0 +1,70 @@
+- Feature Name: stateful-exceptions
+- Start Date: 2017-01-27
+- RFC PR:
+- Pony Issue:
+
+# Summary
+
+Allow exceptions to carry a value from the error site to the handling site, while keeping the exception handling static. The value or its type aren't involved in destination checking. Exceptions will still land at the first handler encountered, with negligible additional runtime cost.
+
+# Motivation
+
+We currently have two very distinct idioms to handle errors in the language and standard library.
+
+1. Exceptions. They are used most of the time but since they are valueless, they can't be used to propagate the reason of an error to a caller function.
+2. Union types of the normal result and the error reason(s). This is used when the reason of an error is needed by a caller function (e.g. the constructors of `File` in the standard library).
+
+Number 2 has several drawbacks. In particular
+
+- The type of the result must be asserted via pattern matching, which introduces a runtime cost even in the non-erroring cases
+- The error condition must be propagated manually through every calling function, unlike a "fire and forget" exception
+
+In addition, having two different ways of doing almost the same thing isn't good for the overall consistency of the language and libraries.
+
+Having this feature would also make the exception system a lot more versatile. By default it will still permit fast and static exception handling, while allowing programmers to manually implement dynamic handling systems akin to "traditional" languages like Java.
+
+# Detailed design
+
+## The raising part
+
+`error` will now take an optional expression as its right-hand side. This expression will be the value passed to the exception handler (i.e. the `else` of a `try` expression). The error value must be a subtype of `Any val`. The reasons for this are:
+
+- The value must have a type on the handling side. `Any` is a natural choice here (because exception specifications are still static) and `val` is a compromise between the broad and not-so-useful `tag` and the very restrictive `iso`. This will allow erroring with primitives, `String`s, etc.
+- This can't violate any capability boundary and avoids additional heavy checks.
+
+An `error` with no value implicitly defaults to `error None`.
+
+## The handling part
+
+A new special value, `current_error`, will be accessible in the `else` branch of `try` expressions. This `current_error` will be an alias of the `error`ed value. Its type is `Any val`, the real type can be established through pattern matching to use the original value. `current_error` always references the value handled by the closest `else` block (i.e. the most nested one).
+
+This new mechanism doesn't change anything to the actual exception handling. Exceptions still stop at the first handler encountered.
+
+## Implementation and performance concerns
+
+This change can be implemented with very little overhead. The cost roughly is an additional argument to a runtime function call, an additional write to memory (when raising) and an additional read from memory (when beginning handling). These operations are negligible compared to the overall cost of raising an exception.
+
+A proof-of-concept implementation can be found [here](https://github.com/Praetonus/ponyc/tree/stateful-exceptions) (untested on Windows).
+
+# How We Teach This
+
+We'll update the tutorial section on exceptions to explain how to raise an error with a value and how to process that value in the handler.
+
+We could also do a Pony Pattern explaining how to emulate a dynamic exception system on the user side through pattern matching and successive re-raising of errors.
+
+# How We Test This
+
+We'll add some type checking tests to ensure type validity on both the raising and the handling side. While we're currently lacking that functionnality in the test frameworks, having tests ensuring that the value is propagated correctly would be good. These tests would have to wait until we can run Pony code in JIT through the compiler and tests.
+
+# Drawbacks
+
+`current_error` would become a reserved identifier. Otherwise, backwards compatibility is fully maintained.
+
+# Alternatives
+
+- Implement a full-blown dynamic exception system. This would be a really important performance hit on most programs, while not having many advantages over the proposed system.
+- Keep things as is. This would leave the concerns raised in Motivation unresolved.
+
+# Unresolved questions
+
+None.

From 42ec1931f7637463b24b0a5f9daa1f5d52219b7e Mon Sep 17 00:00:00 2001
From: Benoit Vey
Date: Mon, 30 Jan 2017 21:00:37 +0100
Subject: [PATCH 2/4] Amend stateful exceptions

- `elsematch` semantics
- Conventions for the Pony Pattern
---
 text/0000-stateful-exceptions.md | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/text/0000-stateful-exceptions.md b/text/0000-stateful-exceptions.md
index fffb388a..21f21a54 100644
--- a/text/0000-stateful-exceptions.md
+++ b/text/0000-stateful-exceptions.md
@@ -9,17 +9,18 @@ Allow exceptions to carry a value from the error site to the handling site, whil
 
 # Motivation
 
-We currently have two very distinct idioms to handle errors in the language and standard library.
+We currently have several very distinct idioms to handle errors in the language and standard library.
 
 1. Exceptions. They are used most of the time but since they are valueless, they can't be used to propagate the reason of an error to a caller function.
 2. Union types of the normal result and the error reason(s). This is used when the reason of an error is needed by a caller function (e.g. the constructors of `File` in the standard library).
+3. Notifier objects that are passed-in and invoked for any errors encountered. This idiom is mostly used in actor-based asynchronous code and won't be covered here, as this RFC covers synchronous error handling.
 
 Number 2 has several drawbacks. In particular
 
 - The type of the result must be asserted via pattern matching, which introduces a runtime cost even in the non-erroring cases
 - The error condition must be propagated manually through every calling function, unlike a "fire and forget" exception
 
-In addition, having two different ways of doing almost the same thing isn't good for the overall consistency of the language and libraries.
+In addition, having several different ways of doing almost the same thing isn't good for the overall consistency of the language and libraries.
 
 Having this feature would also make the exception system a lot more versatile. By default it will still permit fast and static exception handling, while allowing programmers to manually implement dynamic handling systems akin to "traditional" languages like Java.
 
@@ -36,7 +37,21 @@ An `error` with no value implicitly defaults to `error None`.
 
 ## The handling part
 
-A new special value, `current_error`, will be accessible in the `else` branch of `try` expressions. This `current_error` will be an alias of the `error`ed value. Its type is `Any val`, the real type can be established through pattern matching to use the original value. `current_error` always references the value handled by the closest `else` block (i.e. the most nested one).
+A new type of `else` clause would be available to `try` expressions, the `elsematch` clause. An `elsematch` is syntactically equivalent to a standalone `match` but has no match expression and instead implicitly matches on the `error`ed value with an `Any val` type. The cases and `else` clause behave exactly like a standalone `match`. In addition, the `elsematch` can have an `elseerror` clause instead of an `else` clause. This clause "re-raises" the `error`ed value if no case matched.
+
+The `try` expression itself can also have an `elseerror` clause instead of an `else` or `elsematch` clause. This is useful if the `try` expression also has a `then` clause, to do some cleanup but delay the actual error handling.
+
+Example of the new syntax:
+
+```pony
+try
+  partial_function()
+elsematch
+| ErrorType1 => foo()
+| ErrorType2 => bar()
+elseerror
+end
+```
 
 This new mechanism doesn't change anything to the actual exception handling. Exceptions still stop at the first handler encountered.
 
@@ -50,7 +65,10 @@ A proof-of-concept implementation can be found [here](https://github.com/Praeton
 
 We'll update the tutorial section on exceptions to explain how to raise an error with a value and how to process that value in the handler.
 
-We could also do a Pony Pattern explaining how to emulate a dynamic exception system on the user side through pattern matching and successive re-raising of errors.
+We could also do a Pony Pattern explaining how to emulate a dynamic exception system on the user side through pattern matching and successive re-raising of errors. It would also advocate for some conventions regarding the structure of the error value itself. Two conventions would be explained.
+
+- For simple cases where the error condition can be fully described by a simple type, the error value should be a primitive. This avoids dynamic allocation and code bloat due to object initialisation. This idiom should be used through most of the standard library as most functionalities have only one way of `error`ing.
+- For more complex cases, a custom `class` should be used. That `class` should contain a `SourceLoc` field initialised to `__loc` by the constructor, as well as any additional field needed to carry the error information. The `SourceLoc` field would be useful to get precise information about the error source, e.g. for logging or debugging.
 
 # How We Test This
 
@@ -58,7 +76,7 @@ We'll add some type checking tests to ensure type validity on both the raising a
 
 # Drawbacks
 
-`current_error` would become a reserved identifier. Otherwise, backwards compatibility is fully maintained.
+`elsematch` and `elseerror` would become reserved keywords. Otherwise, backwards compatibility is fully maintained.
 
 # Alternatives
 

From 6169dc95b262e56e66fa3d4f2b86cb26f64f492e Mon Sep 17 00:00:00 2001
From: Benoit Vey
Date: Wed, 22 Feb 2017 01:47:02 +0100
Subject: [PATCH 3/4] Amend stateful exceptions

---
 text/0000-stateful-exceptions.md | 50 ++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/text/0000-stateful-exceptions.md b/text/0000-stateful-exceptions.md
index 21f21a54..c156c9af 100644
--- a/text/0000-stateful-exceptions.md
+++ b/text/0000-stateful-exceptions.md
@@ -13,31 +13,51 @@ We currently have several very distinct idioms to handle errors in the language
 
 1. Exceptions. They are used most of the time but since they are valueless, they can't be used to propagate the reason of an error to a caller function.
 2. Union types of the normal result and the error reason(s). This is used when the reason of an error is needed by a caller function (e.g. the constructors of `File` in the standard library).
-3. Notifier objects that are passed-in and invoked for any errors encountered. This idiom is mostly used in actor-based asynchronous code and won't be covered here, as this RFC covers synchronous error handling.
+3. Notifier objects that are passed-in and invoked for any errors encountered. This is mostly used in asynchronous code but can also be used for local error handling recovery, for example to retry a failed operation with alternative arguments.
 
 Number 2 has several drawbacks. In particular
 
 - The type of the result must be asserted via pattern matching, which introduces a runtime cost even in the non-erroring cases
 - The error condition must be propagated manually through every calling function, unlike a "fire and forget" exception
 
-In addition, having several different ways of doing almost the same thing isn't good for the overall consistency of the language and libraries.
+Number 3 doesn't have these issues but is difficult to use when the error is logically unrecoverable. For example, if one wants to open file `A` and use `stdout` if `A` isn't available, the notifier pattern can be used to default to `stdout` if opening `A` fails: the error is fully recoverable. But if the only option is to use file `A`, the error will be unrecoverable at some point of the call stack. An exception (or an error code, with the drawbacks above) must be used to skip that failed part.
 
-Having this feature would also make the exception system a lot more versatile. By default it will still permit fast and static exception handling, while allowing programmers to manually implement dynamic handling systems akin to "traditional" languages like Java.
+Having this feature would also make the exception system a lot more versatile. By default it will still permit fast and static exception handling, while allowing programmers to manually implement dynamic handling systems akin to "traditional" languages like Java. That said, this last pattern shouldn't be the default. We should still advocate for functions with "one way to fail", and handling as close as possible to the error site. "One way to fail" here means that the method being called should intrinsically describe why it would error. The type of an error should only provide details on the reason of an error, and not carry all of the information itself. For example, a function opening a file would error when failing to open the file, with an error type describing why the file couldn't be opened.
 
 # Detailed design
 
-## The raising part
+## Exception specifications
+
+The signature of an erroring method will now specify the type that the method can possibly error with after the `?` symbol. For example:
+
+```pony
+fun foo(): ReturnType ? ErrorType
+```
+
+`ErrorType` is optional and defaults to `None` (which doesn't mean that the method cannot error, but that it can only error with type `None`).
+
+A method that wants to error with different types can use a type union as its `ErrorType`. The only constraint on `ErrorType` is that it must be a subtype of `Any val`. This is to avoid complex issues with reference capabilities, for example with automatic receiver recovery on method calls. In subtyping relationships, error types are covariant (i.e. `{() ? A}` is a subtype of `{() ? B}` if `A` is a subtype of `B`).
+
+### "Checked exception hell" concerns
 
-`error` will now take an optional expression as its right-hand side. This expression will be the value passed to the exception handler (i.e. the `else` of a `try` expression). The error value must be a subtype of `Any val`. The reasons for this are:
+It can be argued that exception specifications introduce a lot of complexity. Namely, the following arguments are often raised:
 
-- The value must have a type on the handling side. `Any` is a natural choice here (because exception specifications are still static) and `val` is a compromise between the broad and not-so-useful `tag` and the very restrictive `iso`. This will allow erroring with primitives, `String`s, etc.
-- This can't violate any capability boundary and avoids additional heavy checks.
+- Extending the exception specification of a method with a new type can cause a massive refactoring, where a new handling clause must be added everywhere the method is called.
+- Exception specifications reinforce coupling. When an exception can propagate up the call stack, the associated exception specification must also be present on every method up to the handling point.
+
+While these concerns are in some part intrinsic to checked exceptions, we'll argue here that they are amplified by bad API design and that better design, namely strict conformance to "one way to fail", can greatly reduce the burden.
+
+Having "one way to fail" means that there is always a default action to take, regardless of the details of the error. For example, if opening a file fails, printing a generic error message is a valid response whether the file didn't exist or the user didn't have permission on it. Therefore, extending an exception signature means adding detail about the error reason instead of adding new error reasons. This doesn't render existing handlers incorrect (unless they don't have a default case) and modifying them shouldn't be fundamentally necessary.
+
+## The raising part
+
+`error` will now take an optional expression as its right-hand side. This expression will be the value passed to the exception handler (i.e. the `else` of a `try` expression). The error value must be a subtype of the method's error type.
 
 An `error` with no value implicitly defaults to `error None`.
 
 ## The handling part
 
-A new type of `else` clause would be available to `try` expressions, the `elsematch` clause. An `elsematch` is syntactically equivalent to a standalone `match` but has no match expression and instead implicitly matches on the `error`ed value with an `Any val` type. The cases and `else` clause behave exactly like a standalone `match`. In addition, the `elsematch` can have an `elseerror` clause instead of an `else` clause. This clause "re-raises" the `error`ed value if no case matched.
+A new type of `else` clause would be available to `try` expressions, the `elsematch` clause. An `elsematch` is syntactically equivalent to a standalone `match` but has no match expression and instead implicitly matches on the `error`ed value, with the match type being the union of all the types that can be errored with within the `try`. The cases and `else` clause behave exactly like a standalone `match`. In addition, the `elsematch` can have an `elseerror` clause instead of an `else` clause. This clause "re-raises" the `error`ed value if no case matched.
 
 The `try` expression itself can also have an `elseerror` clause instead of an `else` or `elsematch` clause. This is useful if the `try` expression also has a `then` clause, to do some cleanup but delay the actual error handling.
 
@@ -53,26 +73,32 @@ elseerror
 end
 ```
 
+Once exhaustive pattern matching is implemented, `elsematch` should cause a compile error when an exception type isn't handled, or when a handler is unreachable.
+
 This new mechanism doesn't change anything to the actual exception handling. Exceptions still stop at the first handler encountered.
 
 ## Implementation and performance concerns
 
 This change can be implemented with very little overhead. The cost roughly is an additional argument to a runtime function call, an additional write to memory (when raising) and an additional read from memory (when beginning handling). These operations are negligible compared to the overall cost of raising an exception.
 
-A proof-of-concept implementation can be found [here](https://github.com/Praetonus/ponyc/tree/stateful-exceptions) (untested on Windows).
+A proof-of-concept implementation can be found [here](https://github.com/Praetonus/ponyc/tree/stateful-exceptions) (untested on Windows). As a demonstration of the system, the `files` package was modified to use stateful exceptions. These changes aren't part of the RFC.
 
 # How We Teach This
 
 We'll update the tutorial section on exceptions to explain how to raise an error with a value and how to process that value in the handler.
 
-We could also do a Pony Pattern explaining how to emulate a dynamic exception system on the user side through pattern matching and successive re-raising of errors. It would also advocate for some conventions regarding the structure of the error value itself. Two conventions would be explained.
+Several conventions would have to be refined and explained.
+
+For libraries and small applications, we should encourage users to follow "one way to fail". It is also important that we enforce strict compliance with "one way to fail" in the standard library in order to keep it modular and easy to use.
+
+For complex applications that could need such a system, we could do a Pony Pattern explaining how to emulate a dynamic exception system on the user side through pattern matching and successive re-raising of errors. It would also advocate for some conventions regarding the structure of the error value itself. Two conventions would be explained.
 
-- For simple cases where the error condition can be fully described by a simple type, the error value should be a primitive. This avoids dynamic allocation and code bloat due to object initialisation. This idiom should be used through most of the standard library as most functionalities have only one way of `error`ing.
+- For simple cases where the error condition can be fully described by a simple type, the error value should be a primitive. This avoids dynamic allocation and code bloat due to object initialisation.
 - For more complex cases, a custom `class` should be used. That `class` should contain a `SourceLoc` field initialised to `__loc` by the constructor, as well as any additional field needed to carry the error information. The `SourceLoc` field would be useful to get precise information about the error source, e.g. for logging or debugging.
 
 # How We Test This
 
-We'll add some type checking tests to ensure type validity on both the raising and the handling side. While we're currently lacking that functionnality in the test frameworks, having tests ensuring that the value is propagated correctly would be good. These tests would have to wait until we can run Pony code in JIT through the compiler and tests.
+We'll add some type checking tests to ensure type validity on both the raising and the handling side, as well as tests ensuring that the value is propagated correctly (with JIT tests).
 
 # Drawbacks
 

From afcb54ea9035847131301ea9f5e7815bd7865e8e Mon Sep 17 00:00:00 2001
From: Benoit Vey
Date: Wed, 22 Feb 2017 21:04:22 +0100
Subject: [PATCH 4/4] Amend stateful exceptions

---
 text/0000-stateful-exceptions.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/text/0000-stateful-exceptions.md b/text/0000-stateful-exceptions.md
index c156c9af..55e7a48d 100644
--- a/text/0000-stateful-exceptions.md
+++ b/text/0000-stateful-exceptions.md
@@ -77,6 +77,8 @@ Once exhaustive pattern matching is implemented, `elsematch` should cause a comp
 
 This new mechanism doesn't change anything to the actual exception handling. Exceptions still stop at the first handler encountered.
 
+Also, the `then` clause of a `try` expression won't be able to raise exceptions anymore. This is to prevent existing exceptions from being silently discarded when doing cleanup.
+
 ## Implementation and performance concerns
 
 This change can be implemented with very little overhead. The cost roughly is an additional argument to a runtime function call, an additional write to memory (when raising) and an additional read from memory (when beginning handling). These operations are negligible compared to the overall cost of raising an exception.