Friday, February 16, 2024

On the Insecurity of Software Bloat

Bert Hubert (via Hacker News):

The really short version: the way we build/ship software these days is mostly ridiculous, leading to 350MB packages that draw graphs, and simple products importing 1600 dependencies of unknown provenance. Software security is dire, which is a function both of the quality of the code and the sheer amount of it. Many of us know the current situation is untenable. Many programmers (and their management) sadly haven’t ever experienced anything else. And for the rest of us, we rarely get the time to do a better job.

In this post I briefly go over the terrible state of software security, and then spend some time on why it is so bad. I also mention some regulatory/legislative things going on that we might use to make software quality a priority again. Finally, I talk about an actual useful piece of software I wrote as a reality check of the idea that one can still make minimal and simple yet modern software.

I hope that this post provides some mental and moral support for suffering programmers and technologists who want to improve things. It is not just you, we are not merely suffering from nostalgia: software really is very weird today.

Niklaus Wirth:

Reducing complexity and size must be the goal in every step—in system specification, design, and in detailed programming. A programmer's competence should be judged by the ability to find simple solutions, certainly not by productivity measured in “number of lines ejected per day.” Prolific programmers contribute to certain disaster.

[…]

With Project Oberon we have demonstrated that flexible and powerful systems can be built with substantially fewer resources in less time than usual. The plague of software explosion is not a “law of nature.” It is avoidable, and it is the software engineer’s task to curtail it.

See also: Bert Hubert (via Bruce Schneier).

Previously:

Update (2024-02-20): See also: Hacker News.

17 Comments

I totally agree with this... but it requires excellence, and time. Management discourages this because of the bus factor and speed to market. Copilot and ChatGPT will only make things worse.

It's effectively impossible to convince a dev team that they should rather write something simple from scratch when something complex is available for free.

@Plume: that must be a younger generation thing...

Beginning to feel Emacs could be the most successful lean, alternative operating system.

@Wu Ming... that's funny. We used to call Emacs "Eight Megabytes And Constantly Swapping", back when 4 MB of RAM was a lot.

@Old Unix Geek you may have used it longer than I have. Perhaps its bloat grew more slowly than RAM availability did. I find it lean, but I've also just scratched the surface of its countless functions.

@Wu Ming: By today's standards, it probably is. I'm just finding it amusingly ironic.

Sure, more dependencies correlate with more code and more complexity, but I feel this is a one-sided take.

>It's effectively impossible to convince a dev team that they should rather write something simple from scratch when something complex is available for free.

Depends. There are also NIH dev teams that reinvent something that already exists. They almost invariably have less time or context to think about accessibility, usability, localization, security.

I also feel a distinct “everything used to be better back in my day” vibe, which is generally a distorted memory.

Yes, of course it would be better if basic apps took up less memory, and if you could run more Electron apps. (And I’m not going to defend Electron.) But that’s ignoring both the economic realities that such apps would take longer to build and you might be unwilling to foot the bill, and also that code used to be less reliable. Your optimized low-level implementation may be more resource-efficient, but also has more bugs lurking. How many people reviewed it? How many people, in contrast, review OSS dependencies? Unless they’re obscure, probably more.

@Sören: Linus's Law is not as strong as you suggest: https://en.wikipedia.org/wiki/Linus%27s_law

And despite what you said, people did write good, largely bug-free software back in the day. (It became obvious pretty quickly who didn't.) You didn't get to ship "updates" over the internet, since your software was burned into physical media. ROMs required a new template to be built (and the old ROMs to be thrown away). EPROMs required erasing with UV light and then reprogramming in a special device. Same with floppies and then CDs.

Frankly, even though people today claim they only write buggy software because they are pressed for time, I don't believe them. Proper engineering requires less churn (of people and goals), smaller and more competent teams, less of a "move fast and break things" attitude, and more thinking with an "I wonder about all the ways this will break" attitude. If you have the latter, you're not going to be adding code from God knows where into your project, because you can't reason about something you don't understand, and most people add such dependencies to save time and don't spend the time to understand how they work (and such a dependency is almost always a rat's nest of further dependencies).

I'm currently typing this on a 2.7 GHz machine with 16 GB of RAM, and it pauses for 2 seconds every half minute. In the old days, on a 2 MHz machine with 20 KB of RAM and no GPU, it would never pause. Of course it wasn't running a "web browser", but frankly I find it ridiculous.

So you might not like to hear that software was less buggy back in the day, but it was, because the incentives were different: "get shit right" and "make it work despite the difficult RAM/CPU constraints" were more important than "run many experiments", "get with the latest fashion", or "ship it yesterday". And the engineers who got it right were rewarded and developed that skill, whereas now different skills are rewarded, and people develop different competencies.

"Sure, more dependencies correlate with more code and more complexity, but I feel this is a one-sided take"

It's just my experience. Go find any new software project that doesn't use either React or Angular in the frontend, and a huge stack of bullshit in the backend.

I think in some way, it's understandable. In the past, a programmer could learn one language (assembly) and understand and control pretty much 100% of what a computer did. Or, a bit later, they could learn one relatively high-level language and some OS APIs and still understand and control pretty much 100% of what a computer did.

Now everything is built upon layers and layers of leaky abstractions.

It's almost always the safest choice to say "these kinds of projects use Next.js, so we will use Next.js." So you hire a bunch of developers who have never used anything other than a high-level React-based framework, which they learned in some borderline-scam bootcamp, because those devs are relatively cheap and abundant. They'll cobble something together.

What's the other option? Pay a bunch of actual engineers 200k a year? No thanks, that looks like a terrible choice, even if the end result is better and cheaper.

>Go find any new software project that doesn't use either React or Angular in the frontend, and a huge stack of bullshit in the backend.

No, that's absolutely true. And many a "look at the size of that node_modules dir" joke has been made. But, leaving aside the particulars of npm… is this a problem? Was it a problem when people built apps with VB or Access in the 1990s?

>In the past, a programmer could learn one language (assembly) and understand and control pretty much 100% of what a computer did.

OK, sure, but when was the last time major programs were written in assembly? That was already weird / notable when WriteNow did it, in the 1980s. Developers use higher abstraction layers than that, and I think it's facile to imply it's because employers aren't willing to pay for smarter ones.

>What's the other option? Pay a bunch of actual engineers 200k a year? No thanks, that looks like a terrible choice, even if the end result is better and cheaper.

Well, it sure is faster. Including at crashing, or at causing memory bugs. :-)

I do wish fewer GUI apps shipped with their own Chromium-derived runtime. But I also don't want the days of "let's write new software projects in C; what could possibly go wrong" back.

@Sören:

Nothing written in assembly since the 1980s??? WTF are you talking about?

Pretty much everything was written in assembly (Z80, 6502, 68000) on home computers during the 1980s. It was an oddity for stuff to be written in C/C++ on smaller machines. Sure, C/C++ was available on Sun workstations, but they weren't common. NeXT boxen date to late 1989, which frankly is the 90s, and they were anything but cheap.

Most BIOSes were still written in assembler at least until 2005. Good luck booting without it. As to writing solid code in C, what do you think the kernel running your Mac is written in? JavaScript? "Swift"?

I think what's so ironic about software bloat is that the basic functionality doesn't improve.

E.g.: I want to copy files from a to b.

Option 1: use a for loop and bring up an alert if a problem arises. Makes sense for small computers with no RAM, but is a bad UI because the user has to nanny the computer.

Option 2: figure out all the problems that could arise. Warn the user immediately if any do and ask for a resolution. Then proceed with the operation. Only bring up an alert at the end, if something untoward happened due to someone else modifying the filesystem during the operation. This way the user can concentrate on something else.

All standard OSes do Option 1, because it's better to spend the bits on emojis or something.
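A minimal sketch of what Option 2 might look like for a flat directory copy (the paths, the prompt, and the overwrite policy here are hypothetical; a real implementation would also handle subdirectories, permissions, and symlinks):

```python
# Sketch of "Option 2": check everything we can up front, ask once,
# then copy and report any late failures in a single summary.
import os
import shutil

def copy_flat_dir(src: str, dst: str) -> None:
    files = [f for f in os.listdir(src) if os.path.isfile(os.path.join(src, f))]

    # Pre-flight: find problems before touching anything.
    clashes = [f for f in files if os.path.exists(os.path.join(dst, f))]
    needed = sum(os.path.getsize(os.path.join(src, f)) for f in files)
    free = shutil.disk_usage(dst).free
    if needed > free:
        raise RuntimeError(f"Not enough space: need {needed} bytes, have {free}")
    if clashes:
        # Ask the user once, up front, instead of interrupting mid-copy.
        answer = input(f"{len(clashes)} files already exist in {dst}. Overwrite? [y/N] ")
        if answer.lower() != "y":
            return

    # Do the work; collect late failures (e.g. files changed underneath us)
    # instead of stopping at each one.
    failures = []
    for f in files:
        try:
            shutil.copy2(os.path.join(src, f), os.path.join(dst, f))
        except OSError as e:
            failures.append((f, e))

    # One report at the end, so the user never has to nanny the copy.
    if failures:
        print(f"{len(failures)} files could not be copied:")
        for name, err in failures:
            print(f"  {name}: {err}")
```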

@Old Unix Geek Doesn’t Finder pretty much do Option 2? It pre-checks for name clashes, insufficient space, etc.

@Michael Tsai: it does? That's good. I had problems using it to copy things some time ago and reverted to my old Unix habit of using cp in a terminal. Maybe it does. Should try it again...

@Old Unix Geek I am beginning to see Emacs grabbing more memory than expected. Perhaps your early experience is still relevant today. With only a few buffers open, real memory is already at 100 MB. Not much, yet surprising for a text-based UI. That's on macOS, though, so the -nox terminal build may be leaner.

1. Incentives matter, absolutely. They may not be everything, but they matter. Unrestrained bloat is what happens when there's nobody there to stop it, and capitalists endorse it for purely economic reasons.

2. Abstractions can be powerful. Come on, please don't be tempted to throw the baby out with the bathwater. VB/Delphi, JVM/.NET, scripting languages from the shell on up, modern bounds-checked runtimes, even BASIC all have had/have their place. C is an oddity really only because early computers were too small to run the compiler unaided; otherwise that would count, too. It's a shame that every one of these has been abused, but that doesn't in and of itself make them bad. Really it all comes back to point 1.

3. Dependency bloat shames us all. NIH is bad, sure, but that's no excuse for the absurd dependency graphs of certain apps, especially server software, where the admin is called on to understand every piece in the puzzle just to manage it properly. You know things are bad because we have Docker and app stores, which effectively defend us against the dependency hell that inevitably results from using fast-changing, monstrous stacks of dependent code, much of which provides far, far more functionality than a project actually needs. Again, if the dependencies were relatively high-quality and had reasonable, shared, user-comprehensible surface areas, with some amount of contractual obligation to keep compatibility, maybe the worst of this could be overlooked as merely poor choice, but no: the same incentives that drive apps also drive libraries and runtimes. So we get to ship many duplicate copies of highly encapsulated, vast mountains of crap, much of which we do not understand and which carries a tremendous risk of breakage, insecurity, or worse. And worse still, those who do care about efficiency are penalised by poor support for their compiled language of choice or for avoiding containment mechanisms. Again, see point 1.

4. OTOH, I'd be lying if I said accessibility hadn't benefited from the economics of scale, because it has. Unarguable datapoint: if someone makes a web UI toolkit's widgets accessible, then a shitload of Electron apps become more accessible overnight. Accessibility should matter to all.

5. Write it in C. Sure, why not? You don't need the speed in much of a user-facing app, or even in the vast majority of the logic in a server process, but why not? The abstractions I talked about earlier are powerful, but they may not be needed if you have competent C coders familiar with a given UI toolkit. No, I'm not playing God, not at all; I'm just saying that it's a reasonable choice to make, if you know what you're doing and understand the risks. To be sure, I think modern languages operate at better strata for lots of software, although perhaps unsurprisingly I think more highly of Zig and Go than of Rust or JIT-compiled VMs (though Tcl has been my kink for a long time for quick jobs). Once again see point 1, and note that even though many cross-platform toolkits have come and gone, none of them was as uniquely terrible as Electron.
