64K: *That* ought to be enough for everyone.

Never Trust Optional Safety Features

Many years ago I was a huge C++ fanboy. I committed apostasy a long, long time ago. C++11 was either still fresh or still a draft at the time. And one of the big things that changed my view is RAII.

In C++ (and, more recently, Rust) circles, RAII is touted as the solution to a long list of problems that have plagued the world of computing for a long time. Various examples are cited to show how neat it is, like mutex guards. Every time the subject of C++ in embedded systems pops up, RAII is cited as the secret sauce that makes choosing C++ a no-brainer. You'll never forget to close a stream, unlock a mutex, drop a connection or decrease a reference counter ever again.
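
For reference, the mutex guard example usually looks something like the sketch below. std::mutex and std::lock_guard come from the standard library; the log buffer and the append_line function are names I made up for illustration.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical shared state, made up for this example.
static std::mutex log_mutex;
static std::vector<std::string> log_lines;

// Appends a line to the shared log; rejects empty lines.
bool append_line(const std::string& line) {
    // The guard locks the mutex here and unlocks it in its destructor.
    std::lock_guard<std::mutex> guard(log_mutex);
    if (line.empty()) {
        return false;  // early return: the guard still unlocks the mutex
    }
    log_lines.push_back(line);
    return true;       // normal return: same destructor, same unlock
}
```

There is no unlock() call to forget on either return path, which is exactly the pitch. Note that nothing stops someone from calling log_mutex.lock() directly somewhere else.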

It looks neat and easy and yet for some reason I have never seen an (embedded) C++ codebase that used RAII and did not suffer from at least one of these problems. I haven't seen enough Rust codebases over a long enough period of time to draw any conclusions, but I bet it's the same with Rust.

Why? Because RAII is optional, and optional safety features don't work for anything except small examples in tutorials.

"Extending" these tiny snippets to full codebases misfires, every single time. I've seen it countless times and it's like clockwork.

Every once in a while someone with far too charitable a view of human programming capability will come up with the neat idea of RAII-ifying everything. You got a TCP socket wrapper class? Great -- now its destructor is gonna tear down connections and everything and you'll never have to hunt for missing socket.close() statements ever again. And so on and so forth for everything else. Then they will diligently purge all that pesky cleanup boilerplate and now the code will be perfect and guaranteed to have no bugs. The runtime does it for you.
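
A minimal sketch of what such a wrapper tends to look like is below. The fake_connect and fake_close functions are hypothetical stand-ins for a real socket API, and the Connection class is my own illustration, not code from any particular project.

```cpp
#include <utility>

// Hypothetical stand-ins for a C-style socket API, made up for this sketch.
static int open_connections = 0;
static int next_handle = 3;
int fake_connect() { ++open_connections; return next_handle++; }
void fake_close(int /*handle*/) { --open_connections; }

class Connection {
public:
    Connection() : handle_(fake_connect()) {}
    ~Connection() { reset(); }  // tears the connection down automatically

    // Copying would close the same handle twice, so forbid it.
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;

    // Moving transfers ownership and leaves the source empty.
    Connection(Connection&& other) noexcept : handle_(other.handle_) {
        other.handle_ = -1;
    }
    Connection& operator=(Connection&& other) noexcept {
        if (this != &other) {
            reset();
            handle_ = other.handle_;
            other.handle_ = -1;
        }
        return *this;
    }

private:
    void reset() {
        if (handle_ >= 0) {
            fake_close(handle_);
            handle_ = -1;
        }
    }
    int handle_;
};

// Opens a connection, moves it around, and reports what is still open.
int connections_open_after_scope() {
    {
        Connection a;
        Connection b = std::move(a);  // a no longer owns anything
    }                                 // b's destructor closes the handle
    return open_connections;
}
```

Even this toy version needs deleted copies and hand-written moves to avoid closing a handle twice, so "purging the boilerplate" really means moving it into the wrapper.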

That is, of course, until a new library comes in as a dependency, and it doesn't use RAII anywhere. Sometimes -- that is, if the deadline is far enough away and you have the source code to said library -- you can write a RAII wrapper around it, but this is pretty finicky. Writing code that half-automatically manages local state across 60KLoC you've never read in the first place is a minefield.
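
The usual shape of such a wrapper is a std::unique_ptr with a custom deleter, roughly as sketched below. The vendor_ctx type and the vendor_create and vendor_destroy functions are hypothetical, made up to stand in for whatever the library actually exposes.

```cpp
#include <memory>

// Hypothetical C-style vendor API, made up for this sketch.
struct vendor_ctx { int id; };
static int live_contexts = 0;
vendor_ctx* vendor_create(int id) { ++live_contexts; return new vendor_ctx{id}; }
void vendor_destroy(vendor_ctx* ctx) { --live_contexts; delete ctx; }

// Deleter that tells std::unique_ptr how to release a vendor_ctx.
struct VendorDeleter {
    void operator()(vendor_ctx* ctx) const { vendor_destroy(ctx); }
};
using VendorHandle = std::unique_ptr<vendor_ctx, VendorDeleter>;

VendorHandle make_vendor_handle(int id) {
    return VendorHandle(vendor_create(id));
}

// Creates a context, uses it, and reports how many contexts are still alive.
int live_contexts_after_use() {
    {
        VendorHandle h = make_vendor_handle(42);
        // h.get() is what you would pass to vendor functions that only
        // borrow the context.
        (void)h->id;
    }  // VendorDeleter runs here and calls vendor_destroy
    return live_contexts;
}
```

The wrapper is only correct as long as nothing inside the library frees or keeps the context on its own, and that is the part you can't know without reading the code.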

Or until someone hurriedly pastes some innocent-looking boilerplate off of Stack Overflow. Or introduces a bit of local state and forgets to add the associated RAII boilerplate, because by now everyone's forgotten about this kind of boilerplate.

At that point, bugs inevitably start slipping in, because you will eventually forget that "you don't need to clean up things manually" only applies to some of the code.

Three years later, everything uses RAII, except for "a few" (about 2,000 lines spread across ten files) sections that depend on vendor code. And some legacy code. And you still have to manually manage some things, like reference counts, because some of the vendor code isn't internally RAII-aware and sometimes it steps on the wrapper code's toes.

You now have the worst of both worlds. Since RAII isn't uniform, missing cleanup boilerplate can either be a bug or just clean code (because the RAII boilerplate takes care of the boring stuff). But some bits of code rely on both RAII and manual cleanup boilerplate, and you end up having to carefully check both.
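
Here's a small sketch of how the mixed case goes wrong. The vendor_buf type, its ref and unref functions, and the BufRef wrapper are all hypothetical, and the reference count is a plain integer so that the mistake shows up as a wrong number instead of a crash.

```cpp
// Hypothetical reference-counted vendor object, made up for this sketch.
struct vendor_buf { int refs; };
void vendor_ref(vendor_buf* b)   { ++b->refs; }
void vendor_unref(vendor_buf* b) { --b->refs; }

// RAII wrapper: takes a reference on construction, drops it on destruction.
class BufRef {
public:
    explicit BufRef(vendor_buf* b) : buf_(b) { vendor_ref(buf_); }
    ~BufRef() { vendor_unref(buf_); }
    BufRef(const BufRef&) = delete;
    BufRef& operator=(const BufRef&) = delete;

private:
    vendor_buf* buf_;
};

// Looks like diligent cleanup, but the reference is dropped twice:
// once by hand and once by the wrapper's destructor.
int refs_after_mixed_cleanup() {
    vendor_buf buf{1};       // the owner holds one reference
    {
        BufRef ref(&buf);    // refs == 2
        vendor_unref(&buf);  // manual cleanup: refs == 1
    }                        // ~BufRef runs: refs == 0
    return buf.refs;         // the owner's reference is gone
}
```

In a library that frees the buffer when the count hits zero, the owner would now be holding a dangling pointer. Whether that manual vendor_unref is a bug or a necessity depends entirely on which half of the codebase you're standing in.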

Some people draw the lucky lots and never run into these problems. But make no mistake about it, it's a matter of sheer luck. You're always one weird feature request, one obtuse management decision, or one plain unlucky choice of dependency away from being gifted with a 300-class mammoth whose author went RAII, schmAII like fifteen years ago. And you will not have time to fix it, because the people paying for your software don't care how you keep your locks properly locked and unlocked as long as you give them the software on time.

And it's not always a consequence of poor decision making, either. You frequently run into trouble when dealing with a system's glue layer. An embarrassing amount of the code that pushes packets around the Interwebs was written in C, in the '80s and '90s, and the only reason why it's still in use is that it's good enough 30 years later, and coming up with something that's also good enough will take another 30 years and nobody has time for that. You can't just drop the first ten Google results for "C++ RAII tutorial" on top of that and hope that the kernel absorbs it by osmosis.

(If you got here because you googled for "C++ RAII tutorial", let me tell you about the other important lesson of the day: SEO is one of the Internet's worst cancers.)

This is why Rust is so uncannily successful at dodging concurrency bugs, and why you come to trust it so much in this regard after a while. You can't avoid the borrow checker. You bend your code backwards until it fits whether you like it or not. No cowboy gets to disable it because they just know it's wrong in this particular case (of the code that they wrote the night before the big deadline, coincidentally). You never have to wonder if a dependency you've just introduced uses the borrow checker or not. No Rust codebase ever devolves into a half-borrow-checked, half-unchecked bowl of borrowed pasta.

(This isn't meant to be an endorsement of Rust. The borrow checker, like any other language feature, is a trade-off, and sometimes it's the wrong one.)

I emerged with a bunch of rules of thumb from my C++ fanboy days, and I think this one is the most valuable: as far as language features go, no feature is a safety feature if it can be avoided or turned off.