Landlock Path Walk Inversion Experiment

A (failed) performance tuning experiment

Landlock supports multiple nested sandboxes, so when an operation with filesystem path is attempted, it has to check it against up to 16 nested policies.

The way it currently does that is by constructing a matrix of the requested access rights per layer and checking off these matrix entries during the path walk. This is efficient, but it also increases code complexity and I have long had a nagging doubt about whether it is worth the tradeoff.

After all, the mental model for using Landlock is that each layer gets checked independently from the next. A more “natural” way to implement this would be to:

  1. loop over the layers first, and
  2. then do the path walk inside (multiple times).

To be clear, my confidence that this would be acceptable performance-wise was always low, but then again, performance can sometimes be counterintuitive, and so it seemed like it might be worthwhile exploring.

The refactoring would be a bigger change, but luckily we have a very comprehensive suite of kernel selftests in Landlock – so at least for the sake of doing that experiment, it was feasible to vibe code the refactoring. (The code is absolutely not meant to be submitted like that, but it is good enough to get performance numbers of the general idea.)

The experiment failed – it worsened performance noticeably. But it’s still interesting to keep a record of it.

I also found it amazing that I can play with the performance aspects of such a large refactoring without implementing it myself. :) LLMs are underused for such experiments.

Hypothesis

The (highly speculative) hypothesis of this experiment is:

By refactoring the code to loop over the layers first, and doing the path walk inside (multiple times),

  1. the code becomes simpler to reason about,
  2. and keeps reasonable performance.

Implementation

I gave the refactoring task to an LLM agent. This code is not production ready and entirely unreviewed, but it does pass the very comprehensive Landlock test suite, which remained unmodified, so I have good confidence that the relevant aspects are retained.

Measurements

Effects on Landlock code complexity

The measurement was done using:

wc -l security/landlock/*.[ch]

The lines of code in Landlock went down from 7245 (2006 in fs.c) to 6815 (1726 in fs.c) lines.

6815 after refactoring 7245 before refactoring 1726 after refactoring 2006 before refactoring Lines of Code in security/landlock/*.[ch] Lines of Code in security/landlock/fs.c

The usual caveats apply:

But it’s at least an indication that it does indeed simplify the code.

The full refactoring and benchmarking code are kept on a Git branch at https://github.com/gnoack/linux/tree/landlock-invert.

Effects on Performance

This is measured using an improved version of fs_bench.

fs_bench benchmarks how long it takes to check Landlock policies during open(), and it does so in an intentionally very-bad-case scenario, namely one where the opened file is in a deeply nested filesystem location from the root, and where the operation is in the end not permitted, so that Landlock has to walk the path all the way up to the root.

This is chosen to amplify the effect and to make it more measurable. In more practical scenarios, the number of nested Landlock domains is likely to be low (e.g., 1), and the number of nested directories is also low (e.g. <10).

The resulting performance graphs show that independent of the number of nested directories, adding more nested Landlock sandboxes does not scale well after the refactoring attempt.

Gnuplot Produced by GNUPLOT 6.0 patchlevel 4 0 10 20 30 40 50 60 70 1 2 4 8 before before after after System clocks Landlock layers fs_bench, depth = 10
Gnuplot Produced by GNUPLOT 6.0 patchlevel 4 0 100 200 300 400 500 600 700 1 2 4 8 before before after after System clocks Landlock layers fs_bench, depth = 100
Gnuplot Produced by GNUPLOT 6.0 patchlevel 4 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 1 2 4 8 before before after after System clocks Landlock layers fs_bench, depth = 1000
Gnuplot Produced by GNUPLOT 6.0 patchlevel 4 0 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 1 2 4 8 before before after after System clocks Landlock layers fs_bench, depth = 10000

Noteworthy results:

The refactoring attempt does not scale well when adding more Landlock sandboxes.

Even with a small and more realistic directory nesting depth of only 10, the performance degradation of doing the path walk twice when adding a second Landlock layer is clearly visible.

In the case of only one layer, the performance improved slightly, presumably because the code is just simpler now, but this only becomes properly measurable at a high number of nested directories. In the more realistic scenario with only 10 directories, the effect is not measurable with the clock resolution.

Directory nesting depth before refactoring (clocks) after refactoring (clocks)
10 6 6
100 32 29 (90%)
1000 469 384 (82%)
10000 6087 4676 (77%)

Conclusion

❌ The refactoring was a bad idea and it reduces performance.

The current path walk implementation with the layer matrix performs better, even in scenarios with small numbers.

Comments