Monday, March 11, 2019

The Sad State of Logging Bugs for Apple

Corbin Dunn (tweet, Hacker News):

This is where things get screwy depending on the component your bug lands in, since bug management is group dependent. Many groups will have only one or two QA people to do the initial screening of those large drop areas for bugs. QA engineers are sometimes instructed to screen bugs with a priority and “fix period” before passing them off to the engineer responsible for the code. This is terrible because many engineers will not look at bugs with a low priority. It is much better for the engineer who “owns the code” to look at a bug and determine the priority. The QA engineers will frequently get a huge backlog of bugs to screen, and it can take weeks, or even months, for some bugs to get screened. Sometimes this leads to a mass screening of bugs, marking them all with a low priority. Bug originators have to notice this, and complain about it for the priority to get increased. Worse yet, some groups mass close bugs older than a year or so, and ask the originator to re-open the bug if the issue still exists. A lot of people don’t pay attention to bugs that need verification, and they simply become lost.

[…]

Engineers also dislike screening bugs because sometimes they have to add them to their queue for the current release. This increases their required workload for that release, which is something people don’t like doing. So, instead, many bugs stay unscreened.

[…]

Sometimes QA screens bugs with a low priority and holds onto them. They never get moved to the appropriate code engineers, and effectively become lost in the system. Sadly, I had seen this happen way too often.

[…]

When a bug is sent back as fixed, the internal developer who originated the bug is supposed to verify the problem is resolved. They can send it back if the problem isn’t resolved. However, internal developers don’t really have an incentive to verify bugs. Management doesn’t keep track of bugs that need verification or really require developers to verify them. Most engineers do verify bugs; they like to make sure problems are resolved. But external developers are left in a more sad state. The bug becomes closed for them, and is dead.

[…]

Internal engineers need to take more responsibility in promptly screening bugs. Management needs to allow engineers to have more time to do this, which is at the expense of working on features or fixing already screened bugs. Engineers should always be expected to have a very low unscreened bug count.

This matches what we felt like must be going on when filing bugs, as well as the way the smaller bugs seem to hang around forever, with new ones added each year. Even Mojave, which was supposed to be a refinement release, seems to have, on balance, increased the number of bugs. As a user, it sucks that things don’t work as well as they used to. As a developer, I spend too much time working around OS bugs and breakage—in other words, preventing my apps from getting worse rather than actually making them better. I assume other developers are in the same boat, and this may be one reason there seems to be less excitement around apps these days. Everyone is wasting a lot of energy treading water.

It’s as if the OS is rotting away before our eyes. The good news is that this should be fixable. Apple has tons of smart engineers who care. But the process does not seem to be set up to produce quality. Management talks a good game but clearly has other priorities. There are undoubtedly many policies that could be changed to improve the organizational incentives, and a core problem seems to be that Apple remains understaffed for its ambitions. The headcount can’t and shouldn’t be massively increased in a short period of time, but there is something Apple could do today to help stem the tide: go off the annual schedule.

Peter Ammon:

The single easiest and most effective thing Apple could do to improve its SWE org is to invest in Radar.

Radar’s importance within Apple cannot be overstated. It subsumes what would be multiple tools in other orgs. As an Apple SWE you spend a massive amount of time in it. And yet Apple treats Radar as a cost center, developed by an outsourced offshore team. It’s slow to search, supports only plain text, is hard to script, and is missing obvious features, e.g. automatic duplicate finding.

Hire five good SWEs, give them a mandate to serve the needs of the org, and you will massively increase the effectiveness of every other engineer.

ThemalSpan:

Anecdotally, I didn’t find the situation internally to be much better. Many bugs internally go unanswered because there is not enough time allocated to fixing core systems and designing better replacements. The truth is, I know personally of several teams that aren’t able to get through the queue of internally filed and scheduled bugs.

To me, it feels like Apple hasn’t resourced core pieces of infrastructure and engineering teams in line with upper management’s plans for growth. While many teams are relatively sequestered, once you start talking to folks elsewhere in the company it becomes clear that many teams are struggling to stay above water. More still, everyone shrugs about it because it’s not clear exactly what is wrong. The best description I’ve heard is in many cases engineers are willing to offer hacks as a solution to meet management’s demands, and management is either willing to accept those hacks or doesn’t know better.

satisfice:

We originally designed Radar so that bugs would be verified as closed by the person with most interest in seeing this happen: the tester assigned to that part of that project. Then management swooped in with an edict that bugs must be verified as closed by whomever originally reported them. This is a stupid idea, because it creates the perverse incentive that no one should report a problem if they are outside the team (because then you are committing to verify the fix, which just means more work for you that has nothing to do with any of your main responsibilities).

When I pointed out that the system would now discourage people on different teams from helping each other, the sponsoring director said “that’s what pink slips are for.” Direct quote. Soon after that I resigned from the design team.

Without reasonably skilled and principled leadership, you just don’t get quality software. And “quality is everyone’s job” is just an empty and childish slogan. Excellence is not transmitted through slogans and wishful thinking. You have to assign responsibility, provide resources and time (which means lowering velocity of new development), and follow-up.

The fundamental reason why it doesn’t happen is the technology market is not efficient. Quality is, in fact, not as important as career testers wish it were. You can get away with doing terrible work and not lose your job. The fact that Apple pays no significant penalties for having buggy products insulates it from our slings and arrows.

Corbin Dunn:

Some obvious issues, like “this button should do X but it does Y,” can be verified by almost anyone. But some issues need the attention of the original author to really verify the bug. Maybe what needs to be done is someone in QA needs to attempt to verify the bug, or “pre-verify”, and then it goes back to the originator for final verification, who can also verify it, or simply close it if they feel like QA did a good job.

drfindley:

What’s even sadder is we used to be better at this when I started at Apple in 2008. Bugs often got screened and triaged and sometimes fixed within a week. I blame the yearly release schedule, where shipping features became a higher priority than overall quality.

Corbin Dunn:

I feel the same way; people took more time on bugs back in those days. I also think the yearly schedule is to blame.

makecheck:

When a process is annoying and you do nothing, people eventually do give up and leave. When it reaches that point, they’re not coming back even if you finally wake up and fix what bugged them.

Apple software quality is in serious danger precisely because of this type of community and infrastructure rot. They are not encouraging developers to help them, and a not-surprising number of serious issues have shown up in released products in recent years.

jakobegger:

By now, a significant fraction of bugs are bugs in Apple’s frameworks. We try to report them to Apple, but they are ignored, or simply closed because they are related to deprecated APIs.

Of course, customers don’t complain that Apple frameworks are buggy -- they complain that our app crashes! So Apple has no incentive to fix it.

Peter Kamb:

The entire value of WWDC is going to the Labs, giving an Apple engineer your Radar number, and having them read you or paraphrase the internal-only notes attached to the ticket. Half the time the question/bug will be clearly resolved internally or a workaround posted. But no updates are added to the public ticket, and it will remain open and unchanged for years.

Gus Mueller:

Corbin is a former AppKit engineer, and this is a must read for developers. It’ll make you angry, and it’s stuff you already figured was happening.

Tanner Bennett:

This confirms what we already knew. Almost no one at Apple takes bug reporting seriously. Reports will stack up indefinitely and eventually macOS will be a shell of its former self.

Corbin Dunn:

It is not just macOS, but iOS too.

Paul Haddad:

Interesting read. My view on bugs: work around them and move on. Even if it gets assigned it’s not getting fixed for at least a year.

Jeff Johnson:

IMO Radar screening issues are merely a symptom. The root problem is that Apple produces a completely unmanageable volume of bugs. Even if they screened all Radars quickly, then what? Bugs still get written much much faster than they get fixed. That’s unsustainable.

I suspect that Radar screening is allowed to be lax precisely because everybody knows that a huge volume of bugs will never get fixed anyway. It’s like bailing out the Titanic.

There’s also a tolerance for shipping bad bugs. If heads rolled at the company for shipping bad bugs, then Radars would get screened.

Adam Savage:

Just getting my music downloaded to my phone is a recurring nightmare I relive every time I upgrade. Having my music ON my device should be a simple choice, & you’ve made it Byzantine. How is it that I have to visit a support forum to learn how to download SONGS to my PHONE?!

The language of permissions is still fascinatingly and infuriatingly opaque, to the degree that when using iTunes, I’m regularly convinced it has an agenda antithetical to mine. Searching Support forums is also nightmarish as helpful buttons from one version disappear in others.

Previously:

Update (2019-03-20): Michael Nachbaur:

It’s easy to blame Apple for poor bug handling practices, but I feel it’s a two-way street. It’s just as much our responsibility as theirs to ensure important bugs get fixed; we should do everything in our power to make their jobs easier in solving bugs. And if we can’t, then at the very least we can treat Apple’s engineers with respect.

Update (2019-09-09): Tom Harrington:

Apple FB 7074633 relates to how Apple sign in sends data to third party web sites.

They asked for a sysdiagnose.

Our server doesn’t run macOS. Or iOS. And the problem is with Apple’s server anyway.

This is why people give up on filing feedback reports.

10 Comments

[…] Maybe things will turn around later this year, with the Mac Pro and rumored new pro notebooks, but right now we’re in quite a dark period for the Mac—both hardware and software. […]

Is this the same Corbin Dunn that said people shouldn’t be doing code reviews?

I quit filing bugs after a very serious and commonly reproducible bug that I discovered with Siri was ignored -- even AFTER I emailed the appropriate department head at Apple (which took LOTS of googling to find, and they responded to my email so I know they are aware of the bug), AND I was ALSO in contact with a "customer relations" type of person somewhere at the VP-level of the org about this specific bug, AND the bug was assigned to a Siri software engineer to fix it, AND after iOS 10, 11, and 12 came and went and it still was NOT FIXED despite me following up once a year for 3+ YEARS... I gave up.

That told me all I needed to know, if I can find a serious bug and make sure that the appropriate people are aware of it (after getting nowhere in Bug Reporter) and it still doesn't get fixed through many major iOS releases... then I know they really, really don't give a shit that a core functionality of Siri doesn't work correctly, despite all of their hype about how they are continually improving it.

And I'm pretty sure the bug could be fixed in less than a day if someone actually cared.

"When a process is annoying and you do nothing, people eventually do give up and leave. When it reaches that point, they’re not coming back even if you finally wake up and fix what bugged them.

Apple software quality is in serious danger precisely because of this type of community and infrastructure rot. They are not encouraging developers to help them, and a not-surprising number of serious issues have shown up in released products in recent years."

That strikes painfully close to home, in a way. Lately not a day goes by when I don't find myself frustrated at having to _wait on my computer for trivial things_, not for a lack of computing resources. By "trivial", I truly mean trivial: stuff like dragging files between folders, and having to wait for the animation to finish, so that the file doesn't land in the wrong directory en route. Or waiting on Safari to finish its... whatever it's doing (preloading/prefetching/all-the-pre's being disabled) before pressing enter, in order to not have it merely refresh the current page.

It's Bad. And annoying. And improvements seem beyond hope.

"IMO Radar screening issues are merely a symptom. The root problem is that Apple produces a completely unmanageable volume of bugs. Even if they screened all Radars quickly, then what? Bugs still get written much much faster than they get fixed. That’s unsustainable.

I suspect that Radar screening is allowed to be lax precisely because everybody knows that a huge volume of bugs will never get fixed anyway. It’s like bailing out the Titanic."

This commentary sounds immature to me. Every software company is going to produce a ton of bugs — that's just how software works. And I seriously doubt engineers are failing to fix bugs out of a primary motivation of "there's just gonna be more bugs anyway". This is not a mindset engineers who work at a company have — I've worked on several teams and there's always dozens or hundreds of bugs in the backlog waiting to get fixed, but you always chip away at them, even if new ones are introduced just as fast.

It seems more likely the process is broken, like every other person quoted here mentions. But Jeff Johnson generally seems unaware of how software companies actually work, so I'm not surprised.

The last bullet-point in Michael's list, "Apple's Software Quality Decline", was from 2014. That post links to a tweet from Daniel Jalkut:

"The biggest/richest company in the world, already staffed with many of the smartest and most creative people, shouldn’t get so many passes."

How many more passes are we willing to give Apple, nearly four-and-a-half years on? I ask this question as much for myself as for anyone reading this... I'm so used to how the Mac works, it's hard for me to switch even as I see year after year of deterioration.

Dan writes: “Every software company is going to produce a ton of bugs — that's just how software works.”

This is precisely why most software developers should not be called “engineers”.

@Ben Kennedy

So much this. Most of them are software writers, far from engineers. That is one reason I value a Computer Engineering degree from EE over Computer Science. Although that was back in my day; I'm not sure about now.

20-year Windows developer here. I can't understand why people are so willing to be the testing department for Apple, a fantastically rich company. It may have been warranted when Apple was a scrappy little company; now it's a juggernaut but you still treat it like it needs nurturing.

When I experienced bugs in Windows itself, I just cursed the darkness like we all do. When I found bugs in development tools or run time libraries, I worked around them. I knew Microsoft didn't care (enough) to listen-- but that's good! I did not have to spend half a day documenting a bug in exquisite detail just to have it be ignored.

From my reading here about macOS, one thing Microsoft does better is that libraries are published at various times, and might go into a Windows Update, not just into the next yearly release. I've not seen bugs where a UI element can't be made to work at a developer level; they generally just work (probably because the developer can override or configure many behaviours to mitigate weird bugs). Microsoft has to deal with many hardware variations and driver situations, something Apple does not have to worry about.

Getting back to Apple bug fixing, I suspect that everyone on the development side would be happy to fix bugs. But the culture of today has an Agile mentality: does this story point display in the UI? Then it is done. The hard work of software development is not creating the software, it is completing it! The old saw is that the last 10% of the time is used to do the other 90% of the work. Nobody does the last 90% anymore; there is no money in it.
