Error Handling Links
This is a collection of interesting literature on error handling that I gathered a while ago while researching my own blog post on the subject.
“A Philosophy of Software Design” by John Ousterhout has a significant section on error handling that I found worth a read. It takes a slightly different angle and gives some good examples of how to design programs so that errors cannot happen, or only happen in the right places.
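As a rough illustration of designing an operation so that an error cannot happen, here is a minimal Python sketch. The function names `remove_strict` and `ensure_absent` are made up for this example and are not taken from the book; the idea is only that a removal can be specified as "make sure the key is gone", so that a missing key stops being an error.

```python
def remove_strict(settings: dict, key: str) -> None:
    """Delete the key; raises KeyError if it is missing.

    Every caller has to decide what to do about the missing-key case.
    """
    del settings[key]


def ensure_absent(settings: dict, key: str) -> None:
    """Guarantee that the key is not in settings afterwards.

    A missing key already satisfies the guarantee, so there is no
    error case left for callers to handle.
    """
    settings.pop(key, None)


if __name__ == "__main__":
    settings = {"debug": True}
    ensure_absent(settings, "debug")
    ensure_absent(settings, "debug")  # second call is a harmless no-op
    print(settings)
```

Whether this is the right trade-off depends on the caller: if a missing key indicates a real bug, the strict variant is still the better fit.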
The Smalltalk-80 Blue Book had the weirdest error handling mechanism: you’d call the error: method on your own object, and the interactive system pops up a dialog. Some Smalltalk errors are recoverable in unusual ways through the interactivity of the system - you can define methods on the go if your program calls a method that doesn’t exist… :)
Common Lisp has a “Conditions” system which is similar to exceptions in that a signaled condition propagates up the stack, but it inserts a middle layer between signaling (“raising”) and handling (“catching”). The middle layer can define recovery strategies (called restarts), which the handling layer can then choose from.
I found the Common Lisp stuff in the “Practical Common Lisp” book. I cannot claim that I managed to wrap my head around that one. The chapter is online at https://gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html . Apparently someone recently wrote a full book on the Common Lisp condition system, but I haven’t read it: https://amazon.com/Common-Lisp-Condition-System-Mechanisms/dp/148426133X
Apart from that, the books I looked through were not very helpful. I find Ousterhout’s way of explaining this useful: he talks about ways to design programs so that the complications of error handling are reduced.
For other notable “error handling philosophies”, I only have some links:
The Zen of Python (“errors should never pass silently”): https://python.org/dev/peps/pep-0020/
The Midori error model (Midori was an experimental OS developed at Microsoft): http://joeduffyblog.com/2016/02/07/the-error-model/
My own research culminated in this article: https://blog.gnoack.org/post/error_handling/. In hindsight, I took a far too academic and prescriptive approach there, so it doesn’t get read much. The article might be biased towards the “deployed and monitored” kind of software.
Another thing I heard people discuss is the rule to “act on an error in only one place”, but I failed to find a canonical source for it. (The rule is most commonly violated when the same error gets logged at multiple layers of the stack, leading to log spam.)
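One way to read that rule, sketched in Python under my own assumptions: lower layers either pass the error along or wrap it with context, and only the outermost layer acts on it by logging. The names `ConfigError`, `read_config`, `start_service` and `main` are hypothetical and only serve the example.

```python
import logging

logger = logging.getLogger("app")


class ConfigError(Exception):
    """Hypothetical application-level error."""


def read_config(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        # Add context and propagate. No logging here, otherwise the
        # same failure would show up once per layer.
        raise ConfigError(f"cannot read config {path!r}") from e


def start_service(path):
    # Middle layer: does not catch, does not log, just lets it through.
    config = read_config(path)
    return len(config)


def main(path):
    try:
        start_service(path)
        return 0
    except ConfigError:
        # The single place that acts on the error. logger.exception
        # records the traceback, including the chained OSError.
        logger.exception("startup failed")
        return 1
```

Because the original `OSError` is chained with `from e`, the one log entry still contains the full cause, so nothing is lost by staying quiet in the lower layers.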