The Setup-Cleanup problem
This is a comparison of different programming idioms for performing cleanup work.
Common use cases include:
- Locking and unlocking synchronization locks.
- Opening and closing files.
- Canceling asynchronous operations that aren’t needed any more.
- Allocation and deallocation.
A naive solution: Cluttered cleanups
int frobnicate() {
  set_up();
  int err = foo();
  if (err) {
    clean_up();  // <-- 1.
    return err;
  }
  if (!bar()) {
    clean_up();  // <-- 2.
    return ERROR_ACCESS;
  }
  baz();
  clean_up();  // <-- 3.
  return SUCCESS;
}
There are now three places where clean_up() is being called (code duplication), and they are all far away from the call to set_up().
When the code changes a bit, it’s easy to miss one and introduce
cleanup issues. This is not a well-maintainable approach.
Linux kernel style (C)
The Linux kernel uses gotos to coordinate a function’s cleanup:
int frobnicate()
{
        int rc = 0;
        set_up();
        rc = foo();
        if (rc)
                goto out;
        if (!bar()) {
                rc = EACCES;
                goto out;
        }
        baz();
out:
        clean_up();  // <--
        return rc;
}
All function exits happening after set_up() need to jump to the label where the clean_up() is done.
Note that this pattern can be nested: With multiple independent setup and cleanup steps, the function ends with a sequence of multiple cleanup operations and multiple exit labels (example).
A longer explanation is at Eli Bendersky’s website.
Resource Acquisition is Initialization (C++)
In C++, setup and cleanup are frequently bound to object lifetimes, which are deterministic there. This is an example using Abseil’s absl::MutexLock class:
void MyClass::Frobnicate() {
  absl::MutexLock l(&mutex_);
  Foo();  // under lock
  Bar();  // under lock
  Baz();  // under lock
}
The lifetime of the l variable ends with the function scope. The constructor of absl::MutexLock locks the mutex, and its destructor unlocks it.
This pattern can also be nested. C++ guarantees that local variables are destroyed in the reverse order of their construction.
The RAII pattern (Wikipedia) is also used in other languages like Rust (Drop trait).
Similar functionality is also available in C through the __attribute__((__cleanup__(f))) language extension, which originated in GCC (Clang supports it as well). This extension lets you attach a cleanup function to a variable’s scope, but the cleanup is specified at the place where the variable is declared, not in the type.
As readers pointed out, a newer variant of C#’s using statement implements a similar non-nested cleanup bound to variable lifetime as well. See the documentation and the comment by Mogens Heller Grabe below for examples.
“Defer” statement (Go)
Go has a language feature for cleanup:
func (m *MyClass) Frobnicate() {
    m.mu_.Lock()
    defer m.mu_.Unlock()
    Foo() // under lock
    Bar() // under lock
}
The defer statement defers a function call until the current function exits.
The Zig language has syntactically similar defer and errdefer statements. Unlike Go, Zig executes the cleanup when leaving the current scope rather than when leaving the function.
“With” blocks and friends (Python, C#, Java)
Python implements a syntax which makes it easy to acquire resources during the execution of a nested block of commands.
def readfullfile(filename):
    with open(filename) as f:
        print("opened", filename)
        return f.read()
The object returned by open() implements the context manager protocol, which means that it can be used in a with statement. When the with-scope exits, the cleanup functions will automatically get invoked (e.g. to close a file).
Cleanups get invoked in all cases when execution leaves the scope, no matter whether it happens through a regular exit, an early return or an exception.
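As a minimal sketch of the context manager protocol, the hypothetical Resource class below (an illustration, not part of the standard library) records when its __enter__ and __exit__ methods run. The three functions leave the with-scope through a regular exit, an early return, and an exception respectively, and the cleanup is recorded in each case.

```python
class Resource:
    """Hypothetical resource that logs its own setup and cleanup."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("set up")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append("clean up")
        return False  # do not swallow exceptions


def regular_exit(log):
    with Resource(log):
        log.append("work")


def early_return(log):
    with Resource(log):
        return "early"  # cleanup still runs before the caller resumes


def raising(log):
    with Resource(log):
        raise ValueError("boom")  # cleanup runs, then the error propagates
```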
Update: Multiple readers have pointed out that both C# and Java now support equivalent scoped statements as well:
- In C# with the using statement (see documentation and Mogens Heller Grabe’s comment with examples below).
- In Java with the try-with-resources syntax (examples).
In all of these cases, the used object must implement a special interface with a cleanup method, such as IDisposable in C# and AutoCloseable in Java.
Macros (Lisp)
Lisp macros are used similarly to Python’s with statements, but offer more ways to shoot yourself in the foot (example from the book Practical Common Lisp by Peter Seibel):
(with-open-file (stream "/some/file/name.txt" :direction :output)
  (format stream "Some text."))  ; while file is opened
Here, with-open-file is a macro that takes care of opening and closing the file (spec). The user of the macro provides a sequence of commands to be executed with the opened file stream, e.g. (format stream "Some text.").
These macros are often named to start with the word with, but there are exceptions, e.g. save-excursion in Emacs Lisp.
From the caller’s perspective, this behaves similarly to Python’s with statements, but such macros are usually expanded into a use of unwind-protect or a similar primitive depending on the Lisp dialect (comparable to try…finally).
Try-Finally blocks (Java, C#, Python, …)
try blocks are primarily used for catching exceptions, but in many languages they offer a finally clause as well, which is guaranteed to execute whenever the try scope exits, including on early returns and exceptions.
setUp();
try {
  doWork();  // between setup and cleanup
} finally {
  cleanUp();
}
The construct tends to lead to deeper nesting when multiple cleanups need to be done.
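To make the nesting concrete, here is a small Python sketch with two setup/cleanup pairs. The function name and the logged strings are placeholders chosen for illustration; the point is only that each additional cleanup adds one more level of try/finally, and that an early return still passes through both finally clauses.

```python
def nested_cleanup(log):
    """Two setup/cleanup pairs require two nested try/finally blocks."""
    log.append("set up A")
    try:
        log.append("set up B")
        try:
            log.append("work")
            return "done"  # early return still triggers both cleanups
        finally:
            log.append("clean up B")
    finally:
        log.append("clean up A")
```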
Some languages have finally-like constructs which are independent of exceptions, like Smalltalk with Block»ensure:, unwind-protect in Common Lisp (on C2 wiki) and dynamic-wind in Scheme. (Scheme has the additional complication that it needs to deal with continuations.)
Todo lists (languages with closures)
In languages with closures or higher order functions, one way to defer work for later is to save functions to be executed later in a list:
def frobnicate():
    todos = []
    setUp()
    todos.append(cleanUp)
    doWork()  # between setup and cleanup
    for f in todos:
        f()
One variant of this is addCleanup in Python’s unittest library for deferring work to be done after the test execution. Note that similarly to defer in Go, this lets us place the cleanup right next to the setup code, so that we can easily keep them in sync:
class FooTest(unittest.TestCase):
    # ...
    def test_service(self):
        srv = StartService()
        self.addCleanup(srv.Stop)
        self.assertEqual(42, srv.Invoke())  # while running
Update: Since version 1.14, Go supports the same with T.Cleanup() to simplify test cleanup.
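The hand-written todo list from the beginning of this section does not run its cleanups when the work in between raises an exception. As a hedged sketch of one way to address this in Python, contextlib.ExitStack from the standard library can play the role of the list: callbacks registered with callback() run in reverse order of registration when the with-block exits, including when it exits through an exception. The fail parameter and the logged strings are made up for the illustration.

```python
from contextlib import ExitStack


def frobnicate(log, fail=False):
    with ExitStack() as todos:
        log.append("set up A")
        todos.callback(log.append, "clean up A")  # registered next to setup
        log.append("set up B")
        todos.callback(log.append, "clean up B")
        if fail:
            raise RuntimeError("work failed")
        log.append("work")
    # On leaving the with-block, the callbacks have run in reverse order.
```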
Passing higher-order functions (Ruby, Lisp?)
In Ruby, it’s common to pass functions (blocks) down to other functions, as a way of structuring control flow and also for cleanup purposes. This invocation of File.open executes the passed function (in between do…end) with the opened file and closes the file afterwards.
File.open("output.txt", "a") do |f|
  f.puts("Hello, world!")  # while file is open
end
This general technique is possible in all languages where passing down higher-order functions is cheap, such as Ruby and Smalltalk. Smalltalk takes it to another level by expressing all control flow as method invocations, including loops and ifTrue:.
I’m not sure to what extent the Smalltalks use higher order functions for cleanup purposes as Ruby does. More imperative Lisps like CL expose mostly macros to users, but these will in turn use closure-passing unwinding primitives under the hood.
There is a technical twist here in that languages need special guarantees for these so-called downward funargs to be cheap.
A downward funarg is a higher-order function which is only passed down the stack but does not outlive the stack frame in which it was created. The captured variables referenced by a downward funarg can happily stay on the stack. This is in contrast to upward funargs which outlive their own stack frame and which require invisible heap allocations or similar tricks in the language.
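The Ruby idiom shown earlier in this section can be approximated in other languages with first-class functions. The following Python sketch uses a hypothetical helper with_open_file (not a standard library function) which opens a file, passes it down to the given function, and closes it afterwards. The passed function is only used while the helper runs, so it is a downward funarg in the sense described above.

```python
import os
import tempfile


def with_open_file(path, mode, body):
    """Open path, pass the file to body, and always close it afterwards."""
    f = open(path, mode)
    try:
        return body(f)
    finally:
        f.close()


def append_greeting(f):
    f.write("Hello, world!\n")  # while file is open
    return f  # returned only so that the caller can inspect the file object


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.txt")
    handle = with_open_file(path, "a", append_greeting)
    with open(path) as written:
        content = written.read()
```

After the call, handle.closed is true although append_greeting never closed the file itself.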
setUp() and tearDown() (JUnit, xUnit, …)
In the xUnit family of unit testing frameworks, tests are part of classes which get executed by a test runner. Each test class may optionally override setUp() and tearDown() hooks which get executed around every test execution, independent of whether that test succeeded or not.
In languages which provide a sane way of cleanup, I think that overriding the tearDown() hook should not be necessary and should probably be discouraged. I suspect the same is true for code in setUp(), but that is more related to ambient shared state which becomes tricky to modify when multiple tests rely on a subset of it in implicit ways.
Discussion
These approaches are quite different from each other. In practice, the programming language is often already fixed, and then only a limited number of options exist.
In general, I found that it helps to have setup and cleanup close to each other, so that it’s easy to keep them in sync. Go’s defer really shines there.
It’s even nicer to have them represented by the same construct as with Python’s with or C++’s RAII idiom. With these, the usage becomes safer, but you trade that for more effort in implementing classes that are with- and RAII-enabled. (In Python, this is somewhat mitigated with the contextlib module.)
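As a hedged illustration of how contextlib reduces that effort, the sketch below uses the contextlib.contextmanager decorator to turn a generator function into something usable in a with statement, without writing a class. The function name frobnication and the logged strings are placeholders for real setup and cleanup work.

```python
from contextlib import contextmanager


@contextmanager
def frobnication(log):
    log.append("set up")  # runs when the with-block is entered
    try:
        yield log  # the body of the with-block runs here
    finally:
        log.append("clean up")  # runs on any exit from the with-block


log = []
with frobnication(log):
    log.append("work")
```

The try/finally inside the generator is what makes the cleanup run when the with-block is left through an exception.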
Downward funargs are safe, easy to use and simple to implement, but have a little overhead. That’s fair in most scripting cases, but wouldn’t work for the Linux kernel.
In terms of reasoning about performance, the Linux kernel has the approach where it’s most obvious what this translates to at the machine level, but the same amount of control is usually not needed in user space programs.
At a higher level, I believe it pays off to care about symmetric setup and cleanup steps as part of the same function wherever possible. It makes it easier to track where setups and cleanups are done across the codebase, and refactoring to such a design reduces state that has to be tracked across multiple functions.
Update (2020-01-13)
Thanks for all the positive feedback, particularly to everyone who pointed out related language features:
- C#’s using: Thanks to Mogens Heller Grabe for the longer explanation in the comments below, as well as to the users hateful and niklasjansson on HN who pointed out documentation.
- Java try-with-resources: cpt1138 and quantified on HN
- Rust’s Drop trait: jupp0r on HN
- Zig’s defer and errdefer: apta on HN
For Haskell, lgas mentioned the Bracket pattern, and dwohnitmok mentioned linear types. Superficially this looks similar to dynamic-wind and friends in Scheme and Lisp (see above), but my Haskell-fu is too weak to judge it. I suspect it would be hard to compare programming idioms between Haskell’s lazy purely functional evaluation and imperative programs.