Work in progress

1. Abstract

Like many other languages, Julia currently produces short, ambiguous error messages that are prone to confusing new and experienced users alike. We can, and should, do better.

With the new StyledString capabilities, we are better placed than ever to take inspiration from languages that stand out in producing helpful error messages. What’s needed now is less a major technical change than a major, focused effort to actually improve Julia’s error messages.

2. Status Quo

Julia’s error messages effectively communicate that an error has occurred, and its general nature, but fail to do much beyond that.

If you look in any of the Julia spaces with new users, you can find an ample supply of messages which demonstrate this. Here are some examples of error messages mentioned in # New to Julia on Discourse that confused people:

#
julia> for i in [1:3;], for j in [4:6;]
          println(i, " ", j)
       end
ERROR: ParseError:
# Error @ REPL[1]:3:4
   println(i, " ", j)
end
#  └ ── invalid iteration spec: expected one of `=` `in` or `∈`
#
(@v1.6) pkg> activate .
  Activating environment at `C:\Users\aal\scp_traj_opt\scp_new_problem\Project.toml`
(scp_new_problem) pkg> dev ../scp_traj_opt/solvers/
ERROR: Unable to parse `../scp_traj_opt/solvers/` as a package.
#
ERROR: ArgumentError: Invalid Status
#
ERROR: syntax: “\” is not a unary operator
#
julia> sum(1,2)
ERROR: MethodError: objects of type Int64 are not callable
#
ERROR: UndefVarError: `Object1` not defined

In each of these examples, enough information is given that somebody already familiar with the type of error might know where to start looking, but not enough for a newcomer. Even in the “highly experienced user” case, more information would help identify and resolve the root cause faster.

There are already other areas of the Julia error system being eyed up for improvement. I’m particularly interested in the talk of a condition-system style approach mentioned in #7026 julep: “chain of custody” error handling, which mentions David Moon’s thoughts in Lunar, and the more recent conversation about asynchronous exceptions in #52291. I’d love to see these thoughts translated into improvements in Julia’s error handling, but regardless of how they proceed, an improvement in the display of error messages should be able to occur independently.

3. An interesting paper on errors as classifiers

An important question to ask about error messages is: what do we want them to achieve? They indicate that an error has occurred, but you can get that just from a non-zero exit code. They might tell you that a specific kind of error has occurred, but why is that useful?

This question is given some serious thought in the paper Error Messages Are Classifiers: A Process to Design and Evaluate Error Messages by John Wrenn and Shriram Krishnamurthi, which provides an intriguing (and, I found, convincing) perspective on error messages. Wrenn and Krishnamurthi suggest that error reports serve as a means of providing the relevant information needed to correct the error. In producing the error message, given all the information available (or synthesisable), we need to classify each piece of information as relevant/helpful or not. Consider a user viewing the error message, and the information seen (and unseen) that helped them solve the error. Through this lens, we can evaluate the earlier classification, calculating precision and recall for example.

To actually apply this model, we need to consider the nature of an error. Fundamentally, a runtime error occurs when some values violate certain constraints, where the values and constraints both stem from an underlying program. In the paper this is represented like so:

[Figure: the paper’s error model (juliaerr-paper-error-model.svg)]

We identify relevant terms in the code where the error occurs (elements of the AST), constraints, values, and relationships between all of these, to obtain a list of available information to be classified as relevant or irrelevant. This is what allows us to calculate classifier metrics such as precision and recall.

\[
  \operatorname{Precision} = \frac{\lvert \text{relevant} \cap \text{selected} \rvert}{\lvert \text{selected} \rvert}
  \qquad
  \operatorname{Recall} = \frac{\lvert \text{relevant} \cap \text{selected} \rvert}{\lvert \text{relevant} \rvert}
\]
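
As a small worked example of these two metrics, the sketch below treats each piece of available information as a symbol. The labels, and the choice of which ones count as relevant, are made up for illustration and are not taken from the paper; `error_precision` and `error_recall` are hypothetical helper names.

```julia
# Hypothetical labels for the information a reader needs to fix the error.
relevant = Set([:offending_name, :source_line, :similar_names])
# Hypothetical labels for what an error message actually chose to show.
selected = Set([:offending_name, :source_line, :full_stacktrace, :internal_frame])

# Fraction of the shown information that was relevant.
error_precision(relevant, selected) =
    length(intersect(relevant, selected)) / length(selected)

# Fraction of the relevant information that was shown.
error_recall(relevant, selected) =
    length(intersect(relevant, selected)) / length(relevant)

println("precision = ", error_precision(relevant, selected))
println("recall    = ", error_recall(relevant, selected))
```

Here two of the four selected items are relevant, giving a precision of 0.5, and two of the three relevant items were selected, giving a recall of 2/3. Dropping the two irrelevant items would raise precision without touching recall; adding the list of similar names would raise recall.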

See the paper for this methodology being applied to various examples.

To see it in practice, visit https://code.pyret.org/editor.

4. The grass is greener on the Elm/Rust side

4.1. Elm

In the middle of 2015, Elm started dramatically improving its compiler errors. Let’s look at two examples.

4.1.1. Example error one

#
-- NAMING ERROR --

Cannot find variable `List.nap`.

6|    div [] (List.nap viewUser users)

`List` does not expose `nap`. Maybe you want one of the following?

    List.map
    List.any
    List.map2
    List.map3

This tells us:

  • A name-type issue has occurred
  • The issue relates to the variable List.nap
  • The problem is that nap is not known
  • A list of names we might have intended instead
  • The line (but not column) on which the error occurs

This can be contrasted with a similar error in Julia:

#
ERROR: UndefVarError: `nap` not defined
Stacktrace:
 [1] getproperty(x::Module, f::Symbol)
   @ Base ./Base.jl:31
 [2] top-level scope
   @ REPL[1]:1

Here, we are told:

  • There is an issue with the variable nap
  • A reference to the line this error comes from

Julia could provide all of the extra information that Elm has here, but currently it doesn’t.
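
To give a feel for how little is needed for the “maybe you want one of the following?” part, here is a minimal sketch that ranks the exported names of a module by edit distance to the name that was not found. `edit_distance`, `suggest_names`, the `max_distance` cutoff and the `Shapes` module are all hypothetical names chosen for this example; this is not how Base currently handles `UndefVarError`.

```julia
# Levenshtein edit distance between two strings, computed row by row.
function edit_distance(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    prev = collect(0:length(t))        # distances from the empty prefix of `a`
    for i in eachindex(s)
        curr = similar(prev)
        curr[1] = i                    # delete the first `i` characters of `a`
        for j in eachindex(t)
            cost = s[i] == t[j] ? 0 : 1
            curr[j + 1] = min(prev[j + 1] + 1,   # deletion
                              curr[j] + 1,       # insertion
                              prev[j] + cost)    # substitution or match
        end
        prev = curr
    end
    return prev[end]
end

# Exported names of `mod` within `max_distance` edits of `name`, closest first.
function suggest_names(mod::Module, name::Symbol; max_distance::Int = 2)
    target = String(name)
    scored = [(edit_distance(target, String(n)), n) for n in names(mod)]
    filter!(p -> first(p) <= max_distance, scored)
    sort!(scored; by = first)
    return last.(scored)
end

# Hypothetical module, only here so there is something to search.
module Shapes
export area, perimeter
area(r) = pi * r^2
perimeter(r) = 2 * pi * r
end

println(suggest_names(Shapes, :aera))
```

A real implementation would also want to consider unexported names, case differences and a length-dependent cutoff, but the information needed is already available at the point where the error is raised.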

4.1.2. Example error two

#
-- TYPE MISMATCH --

The 3rd element of this list is an unexpected type of value.

15|   [ alice, bob, "/users/chuck/pic" ]

All elements should be the same type of value so that we can iterate over the
list without running into unexpected values.

As I infer the type of values flowing through your program, I see a conflict
between these two types:

    Html

    String

4.2. Rust

#
5 |     let scores = inputs().iter().map(|(a, b)| {
  |                  ^^^^^^^^ creates a temporary which is freed while still in use
6 |         a + b
7 |     });
  |       - temporary value is freed at the end of this statement
8 |     println!("{}", scores.sum::<i32>());
  |                    ------ borrow later used here
  help: consider using a `let` binding to create a longer lived value
  |
5 ~     let binding = inputs();
6 ~     let scores = binding.iter().map(|(a, b)| {
  |

For more information about this error, try `rustc --explain E0716`.

A key extra element of Rust’s error message compared to Elm’s that I’d like to draw attention to is the use of an error code, E0716. Error codes provide a great way to elaborate on the kind of issue occurring (useful to users inexperienced with the particular kind of error) without flooding more familiar users with superfluous information (improving the precision score of the error message under Wrenn and Krishnamurthi’s model).

5. A plan for Julia

5.1. Thoroughly embracing custom error types

I’m far from the first to want to improve Julia’s error messages. The LazyString type was introduced in late 2019. The Make Julia’s Error Codes Even Better than Elm’s Discourse thread was started in late 2022.

The issue with error message generation that LazyString solves can also be solved by introducing new error types with custom showerror methods. This also provides the advantage of allowing for improved programmatic inspection/handling of errors. Base currently has ~650 instances of error("..."), which can largely be converted to use custom error types.
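
As a minimal sketch of what such a conversion could look like, the example below replaces a bare `ArgumentError: Invalid Status` style message with a dedicated type. `InvalidStatusError`, its fields and `set_status` are hypothetical names made up for this example; only `Base.showerror` is existing API.

```julia
# Hypothetical error type carrying the data needed to explain the problem.
struct InvalidStatusError <: Exception
    status::Symbol
    allowed::Vector{Symbol}
end

# The message is only built when the error is actually displayed.
function Base.showerror(io::IO, err::InvalidStatusError)
    print(io, "InvalidStatusError: `", err.status, "` is not a valid status; ")
    print(io, "expected one of ", join(err.allowed, ", "))
end

# Hypothetical function that throws the structured error.
function set_status(status::Symbol; allowed = [:todo, :doing, :done])
    status in allowed || throw(InvalidStatusError(status, allowed))
    return status
end

try
    set_status(:donee)
catch err
    err isa InvalidStatusError || rethrow()
    showerror(stdout, err)
    println()
    # The offending value can be inspected without parsing the message.
    println("offending value: ", err.status)
end
```

Because the fields are stored rather than interpolated into a string, no message is constructed unless the error is shown, and calling code can branch on `err.status` directly.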

For simple cases, a macro to handle the boilerplate aspects of this would probably be worth the convenience.

5.2. Making source location info available to showerror

Being able to show a marked-up version of the code that led to the error is rather important here. Ideally we want a showerror(::IO, ::Exception, ::Position, ::StackTrace)-like method (where Position gives both the file and AST position) for this; however, this relies on the JuliaSyntax lowering rewrite providing more granular position information.
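
The rendering half of this can be sketched independently of where the position comes from. The example below assumes a hypothetical `SourceSpan` type (a stand-in for the Position argument above, carrying a file, a line and a column range) and a hypothetical `show_marked` function that underlines the offending columns in the style of the Rust output shown earlier. It counts columns in characters and ignores tabs and wide characters.

```julia
# Hypothetical stand-in for the position information discussed above.
struct SourceSpan
    file::String
    line::Int
    columns::UnitRange{Int}
end

# Print the source line with a caret underline beneath the given columns.
function show_marked(io::IO, text::AbstractString, span::SourceSpan, note::AbstractString)
    gutter = string(span.line, " | ")
    println(io, span.file, ":", span.line, ":", first(span.columns))
    println(io, gutter, text)
    # Pad past the gutter and up to the first marked column.
    print(io, " "^(length(gutter) + first(span.columns) - 1),
          "^"^length(span.columns), " ", note)
end

span = SourceSpan("demo.jl", 6, 5:12)
show_marked(stdout, "div(List.nap(viewUser, users))", span, "`nap` is not defined in `List`")
println()
```

With StyledStrings the carets could be replaced or complemented by colouring the span itself; the plain-text version is used here to keep the sketch dependency-free.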

Re-implementing these errors is half of the improvement; the other half comes from introducing a Rust-inspired split between short and long messages. We can introduce an interface for getting the error code of an exception, errorcode(::Exception) -> Union{Int, Nothing}. When an integer error code is returned, we check the error elaboration list of the Module the Exception type belongs to.
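
A minimal sketch of that lookup follows. Only the `errorcode` signature comes from the text above; the `ELABORATIONS` registry, `register_elaboration!`, `elaboration` and `NegativeRadiusError` are hypothetical names, and a global dictionary keyed by module is just one possible way of storing the per-module list.

```julia
# Hypothetical error type used to demonstrate the interface.
struct NegativeRadiusError <: Exception
    radius::Float64
end

# Proposed interface: `nothing` means the exception has no error code.
errorcode(::Exception) = nothing
errorcode(::NegativeRadiusError) = 1

# Hypothetical registry: module => (error code => long-form explanation).
const ELABORATIONS = Dict{Module, Dict{Int, String}}()

function register_elaboration!(mod::Module, code::Int, text::AbstractString)
    table = get!(() -> Dict{Int, String}(), ELABORATIONS, mod)
    table[code] = text
    return nothing
end

# Look up the long message in the module that owns the exception type.
function elaboration(err::Exception)
    code = errorcode(err)
    code === nothing && return nothing
    table = get(ELABORATIONS, parentmodule(typeof(err)), nothing)
    table === nothing && return nothing
    return get(table, code, nothing)
end

register_elaboration!(@__MODULE__, 1,
    "A radius must be non-negative. Check the sign of the value passed in.")

println(elaboration(NegativeRadiusError(-1.0)))
println(elaboration(ErrorException("plain")))   # no code, so `nothing`
```

Scoping codes to the owning module means packages can number their errors independently, at the cost of needing the module name alongside the code when referring to an error outside the REPL.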

5.3. Reversed, interactive stacktraces

When an error does occur, and we’re in the REPL, we currently just print out a stacktrace and the error message. Hunting down the chain of execution requires opening each file referenced in the stacktrace one at a time, which is tedious to say the least.

Instead, we can make it possible to interactively move through the stacktrace and show the referenced line of the current stackframe in the REPL itself. We can also take this opportunity to take some of the “advanced” capabilities (such as opening a stacktrace entry selected by index in a text editor) and make them discoverable rather than something that has to be learned.
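
The non-interactive core of this can be sketched with what Base already provides. The example below prints a caught stacktrace in reverse order and shows the source line for each frame whose file can be read. `source_line` and `show_reversed` are hypothetical names, and reading “reversed” as outermost call first is an assumption of this sketch; the interactive navigation itself is not shown.

```julia
# Return the text of line `line` in `file`, or `nothing` if it cannot be read.
function source_line(file::AbstractString, line::Integer)
    isfile(file) || return nothing
    for (n, text) in enumerate(eachline(file))
        n == line && return text
    end
    return nothing
end

source_line(frame::Base.StackTraces.StackFrame) =
    source_line(String(frame.file), frame.line)

# Print frames outermost-first, each followed by its source line when available.
function show_reversed(io::IO, trace::Vector{Base.StackTraces.StackFrame})
    for (i, frame) in enumerate(reverse(trace))
        println(io, "[", i, "] ", frame.func, " at ", frame.file, ":", frame.line)
        text = source_line(frame)
        text === nothing || println(io, "      ", strip(text))
    end
end

inner() = error("something went wrong")
outer() = inner()

try
    outer()
catch
    show_reversed(stdout, stacktrace(catch_backtrace()))
end
```

Frames that come from the REPL or from files that are not on disk simply have no source line printed, which is also the case an interactive version would need to handle gracefully.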

5.4. Bonus: Interactive corrections

If we become good at identifying “did you mean?”-type mistakes, and have the corresponding source code information, we could make it possible, when in the REPL, to apply the correction to the source code with a keystroke or two.

This could be implemented via a method defined per exception type, e.g.

julia
#
sourcesuggestion(::Exception) = nothing
sourcesuggestion(err::MySpecificErr) = ...
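
A runnable version of this idea, under some assumptions, might look like the sketch below. Only `sourcesuggestion` comes from the text above; the `SourceEdit` return type, the `MisspelledNameError` exception and `apply_edit` are hypothetical, and columns are assumed to index ASCII text.

```julia
# Hypothetical description of a single-line fix.
struct SourceEdit
    line::Int
    columns::UnitRange{Int}
    replacement::String
end

# Hypothetical error type that knows where the misspelled name sits.
struct MisspelledNameError <: Exception
    line::Int
    columns::UnitRange{Int}
    suggestion::String
end

# Default: no correction is available for this exception.
sourcesuggestion(::Exception) = nothing
sourcesuggestion(err::MisspelledNameError) =
    SourceEdit(err.line, err.columns, err.suggestion)

# Apply an edit to a copy of the source lines (ASCII columns assumed).
function apply_edit(lines::Vector{String}, edit::SourceEdit)
    fixed = copy(lines)
    text = fixed[edit.line]
    fixed[edit.line] = string(text[1:first(edit.columns) - 1],
                              edit.replacement,
                              text[last(edit.columns) + 1:end])
    return fixed
end

lines = ["users = load()", "div(List.nap(viewUser, users))"]
err = MisspelledNameError(2, 10:12, "map")
fix = sourcesuggestion(err)
fix === nothing || println(apply_edit(lines, fix)[2])
```

Returning a description of the edit, rather than performing it, leaves the REPL free to preview the change and ask for confirmation before touching any file.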

6. Demonstration / Proof of Concept

Date: 2024-02-16

Author: Timothy

Created: 2024-04-25 Thu 09:48