Error Handling Links

September 10, 2021

This is a collection of interesting literature on error handling that I gathered a while ago while researching my own blog post on the subject.

“A Philosophy of Software Design” by John Ousterhout has a significant section on error handling that I found worth a read. It takes a slightly different angle and gives some good examples of how to design programs so that errors cannot happen, or only happen in the right places.
The Smalltalk-80 Blue Book had the weirdest error handling mechanism: you’d call the error: method on your own object, and the interactive system would pop up a dialog. Some Smalltalk errors are recoverable in unusual ways thanks to the interactivity of the system: you can define methods on the go if your program calls a method that doesn’t exist… :)
Common Lisp has a condition system, which is similar to exceptions in that conditions propagate up the stack, but it inserts a middle layer between signaling (“raising”) and handling (“catching”). That middle layer can define recovery strategies (“restarts”), from which the handling layer can then choose.

I found the Common Lisp material in the book “Practical Common Lisp”, and I cannot claim that I managed to wrap my head around it. The chapter is available online (https://gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html). Apparently someone recently wrote a full book on the Common Lisp condition system, but I haven’t read it (https://amazon.com/Common-Lisp-Condition-System-Mechanisms/dp/148426133X).

Apart from that, the books I looked through were not very helpful. I find Ousterhout’s way of explaining this useful: he talks about ways to design programs so that the complications of error handling are reduced.
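As a small illustration of what designing an error away can look like, consider removing a key from a dictionary. This is a sketch in the spirit of that approach, not an example taken from the book, and the function names are hypothetical.

```python
def remove_strict(settings, key):
    """Delete the key; raises KeyError if it is missing.

    Every caller now has to decide what to do about a missing key.
    """
    del settings[key]


def remove(settings, key):
    """Make sure the key is absent afterwards.

    With this definition a missing key is not an error at all,
    so callers have nothing to handle.
    """
    settings.pop(key, None)
```

The second version changes the contract from “delete this key” to “ensure this key is gone”, which removes the error case instead of handling it.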

For other notable “error handling philosophies”, I only have some links:

The Zen of Python (“Errors should never pass silently”) https://python.org/dev/peps/pep-0020/
The Midori error model http://joeduffyblog.com/2016/02/07/the-error-model/ (Midori was an experimental OS developed at Microsoft)
My own research culminated in this article: https://blog.gnoack.org/post/error_handling/. In hindsight, I took a far too academic and prescriptive approach there, so it doesn’t get read much. The article might also be biased towards the “deployed and monitored” kind of software.

Another thing I have heard people discuss is the rule to “act on an error in only one place”, but I failed to find a canonical source for it. (The most common violation of this rule is logging the same error at multiple layers of the stack, which leads to log spam.)
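One possible reading of that rule, sketched in Python: lower layers only raise (adding context where it helps), and a single top-level place decides what to do with the error, here by logging it once. The layer and function names are invented for the example.

```python
import logging

log = logging.getLogger("app")


class ConfigError(Exception):
    pass


def read_port(raw):
    # Lowest layer: adds context and re-raises, but does not log.
    try:
        return int(raw)
    except ValueError as e:
        raise ConfigError(f"invalid port {raw!r}") from e


def load_config(raw):
    # Middle layer: does not log either; the error just propagates.
    return {"port": read_port(raw)}


def main(raw):
    # Top layer: the single place that acts on the error.
    try:
        return load_config(raw)
    except ConfigError:
        # log.exception records the message plus the chained traceback.
        log.exception("could not load config")
        return None
```

Because `raise ... from e` keeps the original ValueError attached as the cause, the one log entry at the top still carries the full story, so there is little reason for the lower layers to log it as well.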
