Friday, January 21, 2022

User-Friendly Diagnostics Should Be a Core Part of Any System

Howard Oakley:

If my Mac is capable of recognising the faces of friends in their photos, then it’s surely able to provide me with a little assistance in diagnosing the cause of an error, and suggesting what I can usefully do next. If Disk Utility can’t unmount a container, can it please explain why that could be, and at least link to an article like this one?

Software engineers are hopeless optimists when they design and code only for success. There’s much more to handling errors than displaying a couple of phrases of in-house jargon and fobbing the user off with a magic number. It’s high time that designing error-handling to help the user became a central tenet of macOS.

Nick Heer:

My only quibble with Oakley’s conclusion here is that it should not be limited to MacOS; I expect better diagnostics across all of Apple’s operating systems. Otherwise, this is spot on.

It is bananas that the best error messages users will encounter are those with an inscrutable code — “the best” because it is at least something which can begin a web search for answers.

3 Comments

He's right, we need better messages. And in many cases, software could do better. However, I don't believe this problem is soluble in the general case. Writing bug-free software would probably be easier, and the market doesn't exactly encourage that.

Each piece of code assumes certain things are true about the computer's state. If these things are true, it can perform its function. If the state is not as expected, well-written code will detect that and emit an error rather than proceed.

But that piece of code won't know why the computer's state is not what it expected, so it can't tell you. And if it says what it expected (e.g. "Error: binary search expected sortedArray to be sorted, in stack trace: ..." or "Error: malloc failed in ..."), no one will understand what that means unless they have its source code and understand it.
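To make that concrete, here is a minimal Python sketch of a function that checks its own assumption. The function name and message wording are my own illustration, adapted from the hypothetical "sortedArray" error above. The check can report what it expected, but it has no way of knowing which earlier piece of code handed it unsorted data.

```python
from bisect import bisect_left


def binary_search(sorted_array, target):
    """Return the index of target in sorted_array, or -1 if it is absent."""
    # Precondition check. It is O(n), which defeats the point of a binary
    # search in production code, but it keeps the sketch short.
    if any(a > b for a, b in zip(sorted_array, sorted_array[1:])):
        # All this code can say is what it expected, not why the caller
        # (or something further upstream) failed to provide it.
        raise ValueError("binary search expected sorted_array to be sorted")
    i = bisect_left(sorted_array, target)
    if i < len(sorted_array) and sorted_array[i] == target:
        return i
    return -1
```

Calling `binary_search([3, 1, 2], 2)` raises the `ValueError`, which is accurate and still tells an end user nothing about what to do next.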

It could be that the piece of code itself is incomplete or has a bug, or that a previous piece of code that created that state has a bug, or that the hardware did something unexpected.

Then there's the problem that most users have learned not to read error messages (either because they are useless, as on the Mac, or because they are incomprehensible, as on Windows).

For example, consider a piece of code that has to solve a pair of simultaneous equations: ax + by + c = 0 and dx + ey + f = 0. Every test the software company threw at it passed. However, that piece of code has a hidden fault: if b = 0, calculating the solution causes a division by zero. Because dividing anything non-zero by zero results in infinity, the hardware throws a fault when it sees a division by zero. It doesn't know about the function, or what it is trying to do. All it knows is that it was asked to do something impossible. Neither does the OS. In the old days, Windows would say "Division by zero at address C00...", which was all it knew, and people would just grit their teeth and ignore the message.
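A rough Python sketch of that situation, assuming the faulty code solves by substitution (the comment does not say how it actually worked, and both function names are made up). In Python the runtime raises `ZeroDivisionError` rather than the hardware faulting, but the effect is the same: the error reports what happened and gives no hint about what the function was trying to do.

```python
def solve_naive(a, b, c, d, e, f):
    """Solve a*x + b*y + c = 0 and d*x + e*y + f = 0 by substitution.

    Hidden fault: it isolates y from the first equation, so it divides by b.
    """
    x = (e * c / b - f) / (d - e * a / b)
    y = -(a * x + c) / b
    return x, y


def solve_checked(a, b, c, d, e, f):
    """Same system via Cramer's rule, with one domain-level error message."""
    det = a * e - b * d
    if det == 0:
        # The only case with no unique solution: parallel or identical lines.
        raise ValueError("the two equations do not have exactly one solution")
    x = (b * f - c * e) / det
    y = (c * d - a * f) / det
    return x, y
```

With x + y - 5 = 0 and x - y - 1 = 0, both versions find x = 3, y = 2. With x - 2 = 0 and x + y - 5 = 0 (so b = 0), `solve_naive` fails with a bare division-by-zero error even though the system has a perfectly good solution, while `solve_checked` returns x = 2, y = 3. The second version only helps because someone anticipated the failure, which is the whole difficulty.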

The fundamental problem is that computers are blind watchmakers, and do not understand what they are doing. That hasn't gone away just because computers are faster. Recognizing people's faces is different: it is a tractable problem that is now possible because computers are faster.

@Old Unix Geek Errors are unavoidable because of potential bad state and interacting with other components that are fallible. It’s true that the error message may not be actionable by the user, but they’re still useful if they give you something to Google or report to the developer or support person.
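As a hedged illustration of "something to Google or report", here is a small Python sketch that wraps a low-level `OSError` in a message carrying a searchable code, a plain-language summary, and a suggested next step. The class name, the helper, and the hint text are all hypothetical; only the `errno` module and its `errorcode` table are real standard-library pieces.

```python
import errno


class UserFacingError(Exception):
    """Hypothetical wrapper: a searchable code plus a hint for the user."""

    def __init__(self, code, summary, next_step):
        super().__init__(f"{summary} (code {code}). {next_step}")
        self.code = code
        self.summary = summary
        self.next_step = next_step


# Illustrative hints only; a real application would need many more entries.
HINTS = {
    errno.EBUSY: ("The volume is in use by another program",
                  "Quit any apps that are using it and try again."),
    errno.EACCES: ("You do not have permission to do this",
                   "Check the item's permissions or ask an administrator."),
}


def describe_os_error(exc):
    """Translate an OSError into a UserFacingError, keeping its errno name."""
    summary, next_step = HINTS.get(
        exc.errno,
        ("An unexpected error occurred",
         "Search for this code or include it in a bug report."),
    )
    code = errno.errorcode.get(exc.errno, "UNKNOWN")
    return UserFacingError(code, summary, next_step)
```

Even when the table has no entry for a failure, the fallback still surfaces the errno name, so the user has a term to search for or pass on to support.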

I'm not against reporting them. In fact my code is full of assert statements. I'm of the "abort early rather than cause a mess" school.

But I also know most users I have tried to help with errors don't even read the error message, or remember them, however often I tell them to. I guess only 10% do... and I can't really blame them when the text reads just like "Component #9Fk pre-aborted due to a fluxorator overcompensating for drascatic density in dimension 72-omega" to the uninitiated.
