Thursday, November 21, 2019

Full Steam Ahead, But With Feature Flags

Mark Gurman (tweet, Hacker News):

Apple Inc. is overhauling how it tests software after a swarm of bugs marred the latest iPhone and iPad operating systems, according to people familiar with the shift.

Software chief Craig Federighi and lieutenants including Stacey Lysik announced the changes at a recent internal “kickoff” meeting with the company’s software developers. The new approach calls for Apple’s development teams to ensure that test versions, known as “daily builds,” of future software updates disable unfinished or buggy features by default. Testers will then have the option to selectively enable those features, via a new internal process and settings menu dubbed Flags, allowing them to isolate the impact of each individual addition on the system.
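As a rough illustration of the default-off idea described above, here is a minimal Python sketch. It is not Apple's internal Flags system, whose details aren't public; the `FeatureFlags` class, the `render_share_sheet` function, and the flag names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Registry of named features; everything is off unless a tester opts in."""

    known: set[str]
    enabled: set[str] = field(default_factory=set)

    def enable(self, name: str) -> None:
        # Reject typos so a tester cannot "enable" a flag that does not exist.
        if name not in self.known:
            raise KeyError(f"unknown feature flag: {name}")
        self.enabled.add(name)

    def is_enabled(self, name: str) -> bool:
        # Unfinished features are disabled by default.
        return name in self.enabled


def render_share_sheet(flags: FeatureFlags) -> str:
    # Hypothetical feature gate: the new code path runs only when flagged on.
    if flags.is_enabled("new_share_sheet"):
        return "new share sheet"
    return "stable share sheet"


flags = FeatureFlags(known={"new_share_sheet", "dark_mode_v2"})
default_result = render_share_sheet(flags)    # flag off: stable path
flags.enable("new_share_sheet")
opted_in_result = render_share_sheet(flags)   # flag on: new path
```

Turning on one flag at a time is what would let a tester attribute a regression to a single addition.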

When the company’s iOS 13 was released alongside the iPhone 11 in September, iPhone owners and app developers were confronted with a litany of software glitches. […] This amounted to one of the most troubled and unpolished operating system updates in Apple’s history.

[…]

Test software got so crammed with changes at different stages of development that the devices often became difficult to use. Because of this, some “testers would go days without a livable build, so they wouldn’t really have a handle on what’s working and not working,” the person said.

[…]

Still, iOS 14 is expected to rival iOS 13 in the breadth of its new capabilities, the people familiar with Apple’s plans said.

It sounds like they are still in denial. Feature flags may be a useful tool to help with testing, but much more drastic changes are needed. They don’t seem to have much interest in reducing the scope of major releases, so I would like to see them drop the annual release schedule. And, above all, make an internal commitment to quality.

The testing shift will apply to all of Apple’s operating systems, including iPadOS, watchOS, macOS and tvOS. The latest Mac computer operating system, macOS Catalina, has also manifested bugs such as incompatibility with many apps and missing messages in Mail.

The missing Mail messages bug remains unfixed in macOS 10.15.2 betas. This is the buggiest Mail release I can recall. I’m still busy working around Catalina bugs throughout the system.

Apple privately considered iOS 13.1 the “actual public release” with a quality level matching iOS 12. The company expected only die-hard Apple fans to load iOS 13.0 onto their phones.

And yet customers were automatically prompted to update to 13.0, and even 13.2 introduced major problems.

Peter Steinberger:

Feature flags in test releases are Apple’s answer to the software quality issue? What about automated testing? And opening up hiring outside of Cupertino, to deal with the amount of radars and missing documentation.

ssɐquʞunɹp:

I can tell you from experience that these “feature flags” carry a lot of tech debt that these managers don’t seem to understand. This may be the canary in the coal mine.

Michael Dupuis:

How about slowing things DOWN? It’s very much a feeling that they are just throwing things over the fence as fast as they can, and it shows in the horrible quality we’ve been seeing...

Kyle Howells:

Adding feature flags to betas isn’t the answer. It’ll just add more work.

Keeping the same process but adding extra steps doesn’t generally work.

They need to slow down, only release software when it’s ready, and prioritise quality, documentation and fixing bugs.

Jeff Johnson:

Annual OS releases are also destroying third-party software quality. We can’t keep up with the constant churn, and the tools are never stable. We waste so much time every year just dealing with Apple’s shit.

Jeff Johnson:

Apple’s software quality problems can’t be solved in iOS 14. They’ve accumulated at least 5 years of technical debt, if not more, from annual releases.

They’re deep down in a hole. Desperately in need of years without a major update.

Update (2019-11-26): Mark Gurman:

iOS 13 has had 8 updates in its first two months, the most in that same period since Craig Federighi took over development with iOS 7. See chart.

Scott Anguish:

It’s a myth that Apple doesn’t have remote writers. They have an entire department in Seattle.

Thomas Clement:

And this only works if Apple can detect before shipping that a feature is broken enough that it needs to be turned off.

Dr. Drang:

An old saying from the making of physical products seems apropos: you can’t inspect quality into a product.

Norbert M. Doerner:

They need a massive OS release moratorium, and look at what they have done, and why that failed. Then start fixing the bugs, and change the crazy yearly release cycle, it is utter madness #Apple #StartFixingTheBugs

Jeff Johnson:

Months since previous Mac .0 release:

10.1.0 6
10.2.0 11
10.3.0 14
10.4.0 18
10.5.0 18
10.6.0 22
10.7.0 23
10.8.0 12
10.9.0 15
10.10.0 12
10.11.0 11
10.12.0 12
10.13.0 12
10.14.0 12
10.15.0 12

A sensible progression... until 10.8

(Note that Steve died after 10.7)
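To put numbers on that break, here is a small Python sketch that averages the intervals listed above before and after 10.8. The figures are copied from the list; splitting at 10.8 follows the comment, and the variable and function names are only illustrative.

```python
from statistics import mean

# Months since the previous .0 release, copied from the list above.
months_since_previous = {
    "10.1": 6, "10.2": 11, "10.3": 14, "10.4": 18, "10.5": 18,
    "10.6": 22, "10.7": 23, "10.8": 12, "10.9": 15, "10.10": 12,
    "10.11": 11, "10.12": 12, "10.13": 12, "10.14": 12, "10.15": 12,
}


def minor(version: str) -> int:
    """Return the second version component, e.g. 7 for '10.7'."""
    return int(version.split(".")[1])


through_10_7 = [m for v, m in months_since_previous.items() if minor(v) <= 7]
from_10_8 = [m for v, m in months_since_previous.items() if minor(v) >= 8]

avg_through_10_7 = mean(through_10_7)  # 112 / 7 = 16 months
avg_from_10_8 = mean(from_10_8)        # 98 / 8 = 12.25 months
```

On these figures the average gap drops from 16 months (10.1 through 10.7) to 12.25 months (10.8 onward).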

Griffin Caprio:

Anyone who’s built even a moderately complex app knows you can’t just pepper in if/else statements and iOS is more than moderately complex.

Patrick McCarron:

The amount of technical debt those flags carry is no joke. Not always a clean removal either.
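One way to see where that debt comes from: every independent on/off flag doubles the number of configurations the code can run in. The Python sketch below enumerates them for three hypothetical flag names. It assumes the flags are independent booleans, which real flags may not be.

```python
from itertools import product


def flag_configurations(flag_names):
    """Yield every on/off combination of the given flags as a dict."""
    for values in product([False, True], repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


# Hypothetical flag names, for illustration only.
flags = ["new_share_sheet", "dark_mode_v2", "redesigned_mail"]
configs = list(flag_configurations(flags))
config_count = len(configs)  # 2 ** 3 == 8 distinct configurations
```

With 20 such flags the count is 2 ** 20, or 1,048,576, so nobody tests every combination, and each flag left in the code keeps its extra branch alive until it is removed.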

Update (2019-12-20): See also: MacRumors, Macworld.

21 Comments

Hiding the brokenness of new submissions is going to make quality exponentially WORSE. These incompetent engineers on dysfunctional teams only fix half the bugs they create BECAUSE other teams are screaming at them. All this new scheme does is ensure that everyone can lie about their progress & timelines. They still have to turn everything on by default at some point, and that's when all the dire integration issues will arise. The longer they wait, the deeper and more difficult the regressions will be to reverse engineer and resolve. Apple is completely screwed.

I've been wondering for a while what has made both the Mac, and the Mac ecosystem worse. Progress seems to be slower, and bug counts seem to be higher.

The yearly release cycle seems like an easy thing to blame. Not only is Apple releasing faster than they can keep up with, they are releasing faster than devs can keep up with. That slows progress across the whole ecosystem and makes it less productive. Developers don't have time for new features, since they spend so much time just trying to keep up to date.

The other place I wonder about blaming is Swift. The most obvious way Swift can slow things down is rewriting code. How much time has Apple spent refactoring Obj-C APIs to work with Swift and/or replacing Obj-C APIs with Swift APIs? How much time have developers spent refactoring Obj-C code to be more Swift-compatible and/or replacing Obj-C code with Swift code? Have developers spent years rewriting apps to get the same app they had to begin with? I'm sure not all apps have done this, but I'm also sure that at least a few apps have.

The other Swift question I have is whether Swift actually makes devs more productive. Do apps actually crash less? This could all be circumstantial, and the popularity of the web/iOS could surely be the real resource sink, but as a customer/user of the Mac ecosystem there is no way I would say that things are better now than they were before Swift. This could totally be a red herring and is pure speculation, but I read a lot about how amazing Obj-C was and how superior it was to C++, and now we appear to have a language developed and inspired by C++ way more than it is inspired by Obj-C.

It also amazes me that one of the biggest companies in the world cannot afford to have at least 1 dev assigned to every app they ship by default in their software. I'm astonished how many times a new release comes out yearly, with 0 work done on built-in apps. I'm not saying they need to add a million features, but Apple can go years without touching apps that have bugs.

This is such a weird situation!

I have an expensive computer in my pocket that gets buggy updates pushed to it. Every other year, I get an update that changes the UI and contains more bugs. Some third party programs disappear at each point upgrade. The OEM programs sometimes get bugs that never get fixed. I can't write my own software for it without paying an annual fee.

So, how do I use it? If I only use it for critical applications, what if they get a data loss bug? Or a key UI element is removed? Then, if I only use it for trivial applications, why do I even have something this expensive?

I can't decide how to handle this. Rely more on SaaS web app stuff? Say the heck with security updates and stick with the last known good iOS 12 version? I'm not happy with having to check forum posts for bug reports. I don't think that should be a necessary part of my user experience. Could I get a flip phone, and commit to using a laptop for everything else?

I find myself using fewer apps, services, and devices made by anyone over time. That's a bizarre trend for a tech guy.

> Adding feature flags to betas isn’t the answer. It’ll just add more work.
> Keeping the same process but adding extra steps doesn’t generally work.
> They need to slow down…

It’s embarrassing that the only people who apparently don’t get this are in charge of Apple’s development process. The whole idea of hiding features from testers is lunacy and self-deceit.

@trecento
Yep, this largely mirrors my experiences and frustrations. After 15 years, I've stopped buying 3rd-party Mac apps because the OS feels like a dead end now.

Sometimes I think of all the money Apple have spent over the years — on Beats, dividends, TV shows — all while their core competencies have been in decline.

One-year release cycles are great. If you allow teams to sit on things for too long, you'll find that they'll either release products that no longer solve the right problem, or worse, release products so late that they can no longer be integrated into a system that has changed in the meantime.

However, one-year release cycles are heavily dependent on good management, specifically three things:

1. The ability to scope work: to know how long something will take, and therefore how many things can be released in a year, and therefore what to tell sales and marketing to prepare to sell.
2. The ability to prioritise and push back, based on scope, to only publish the features that can be delivered at an acceptable quality in a given time-frame.
3. The ability to ensure quality in a rapidly changing environment, which requires intensive automated testing: unit-tests, component-tests, end-to-end tests, and _lastly_ human testing.

Craig does not appear to have built an engineering team with these skills. To be clear, Apple still iterates much better than many other vendors, notably Microsoft. But recently they've not scoped work properly, over-committed, and then sacrificed quality to meet (pre-approved) marketing promises.

The one-year release-cycle isn't the issue. Management of scope, prioritisation and quality-assurance are.

"Desperately in need of years without a major update."

That's… absurd. No. The next major release only needs to be more stable than what's out there right now for it to be an improvement.

Better is a worthwhile goal along the road to much better.

"The next major release only needs to be more stable than what's out there right now"

That's not how it works. That's not how any of this works. So many people misunderstand the concept and say "We need another Snow Leopard release." The reality, though, is that Mac OS X 10.6.0 was very buggy, much buggier than 10.5.8. Major releases are always buggier than their immediate predecessors. The stable Snow Leopard you remember was at the end of 2 full years of minor bug fix releases. Every major .0 version is buggier than the minor version that preceded it, because major updates bring major changes. The only way to achieve stability is by fixing bugs at a faster rate than you add them. Minor updates do this, major updates don't. Adding features adds bugs.

It's certainly possible for iOS 14.0 to be more stable than iOS 13.0. But 14.0 will without a doubt be much buggier than 13.N, for whatever N is the latest version in August 2020.
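The argument above can be made concrete with a toy model in Python. The bug counts per release are invented purely for illustration (nothing here is measured data), and the function and constant names are hypothetical. The model only shows that the backlog shrinks when releases fix more than they add.

```python
def open_bugs_after(releases, start=0):
    """Track open bugs across a sequence of (added, fixed) releases."""
    history = []
    open_bugs = start
    for added, fixed in releases:
        # The backlog can never go below zero.
        open_bugs = max(0, open_bugs + added - fixed)
        history.append(open_bugs)
    return history


# Hypothetical numbers, chosen only to illustrate the argument.
MAJOR = (100, 20)  # a .0 release adds many bugs and fixes few
MINOR = (5, 25)    # a point release adds few bugs and fixes more

# Two short cycles: a major release followed by three point releases, twice.
short_cycles = open_bugs_after([MAJOR, MINOR, MINOR, MINOR] * 2)
# One long cycle: a single major release followed by seven point releases.
long_cycle = open_bugs_after([MAJOR] + [MINOR] * 7)
```

With these made-up rates, the short cycles end with 40 open bugs and carry 20 of them across each major release, while the long cycle pays the backlog down to 0 before the next major release would arrive.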

You’re assuming development practises don’t change. I’m not.

We don’t need a perfect release in 14.0. We need something more stable than iOS 13 point whatever. That should be achievable, and if not this cycle then maybe next. And even if it’s not achievable soon, it should be the goal.

FWIW iOS 12 was less buggy than iOS 11 and also massively faster. This may be the rare exception but it proves Apple can release a major new version smoothly.

I think the sad reality is that in general, we don't really know how to make software at the scale of Mac OS X, or iOS. For a company like Apple, brute-forcing the testing and just having people pound at these devices until most bugs are found is probably not the worst option, and using feature toggles to make sure testers can actually pound at the devices in a meaningful way might be an improvement over not using them. One alternative is what Microsoft is doing, which is just deploying buggy versions of Windows to a subset of its users, and figuring out what's broken that way, thus occasionally royally screwing people.

Also, about the yearly releases: if your team is synchronized to yearly releases, and you have to push out something every year, you basically end up with a few months of actual focused development time, because every release requires a lot of cleanup work before the release, and a lot of cleanup work after the release. Yearly releases create a ton of overhead, which isn't good for velocity, and isn't good for quality.

Yearly releases can work if individual teams are not required to ship at that cadence, but you still have enough teams working on features that every release receives meaningful new additions. It's not clear to me if that is what Apple is doing, or if every team is shipping at a yearly cadence.

Sören Nils Kuklau

> FWIW iOS 12 was less buggy than iOS 11 and also massively faster.

12 was faster in part because of optimizations they wrote (at numerous levels — kernel, runtime, … — heck, 13 contains an optimization to FairPlay decryption, making apps launch faster).

Those are once-in-a-lifetime opportunities. There are plenty more optimizations to be had (especially when it comes to Swift, where they do continuously improve on various APIs), but there isn’t magic to it.

Maybe they prioritized optimizations more during the 12 cycle; that’s possible. But those were going to ship sooner or later no matter what, regardless of how buggy or not buggy the release is.

> I think the sad reality is that in general, we don’t really know how to make software at the scale of Mac OS X, or iOS.

Right. At the end of the day, software development as a craft is about half a century old, not several millennia like bridge building.

> One alternative is what Microsoft is doing, which is just deploying buggy versions of Windows to a subset of its users, and figuring out what’s broken that way, thus occasionally royally screwing people.

Well, with Windows 10, they set a biannual schedule like Ubuntu, only they can’t really keep up. 1511 shipped in November 2015, but 1607 in August, 1703 in April, and they’ve since slipped even further, with 1809, 1903 and 1909 all shipping two months later than their names were once intended to suggest.

That’s presumably for quality reasons. So it is possible to delay if the quality isn’t quite there. Apple chooses not to do that. They could’ve put 12.4 on the iPhones 11 and then shipped 13.0 in late October or early November. Would there have been annoyed customers and stupid media stories about how Apple is doomed and virtually bankrupt? Sure, but they got a negative reaction either way.

@Bryan Feeney: “The one-year release-cycle isn't the issue. Management of scope, prioritisation and quality-assurance are.”

That, and your whole comment, is a very good point.

@Jeff Johnson:

I think you're missing the point about why people talk about Snow Leopard. It's not only the number of bugs, it's also about the intention from Apple. Snow Leopard was actively marketed as bugfixing, optimizing and no features. To quote Wikipedia:

Unlike previous versions of Mac OS X, the goals of Snow Leopard were improved performance, greater efficiency and the reduction of its overall memory footprint. Apple famously marketed Snow Leopard as having "zero new features".

This is what people want when they talk about Snow Leopard. A release that Apple says is all about stability, efficiency, performance. Nothing else. (Of course, the initial release of even such an effort will have bugs that have to be fixed, but without major new features and changes they will hopefully be less severe.)

So what people want is... marketing?? The "0 New Features" keynote slide was largely tongue-in-cheek, and according to the quoted Wikipedia page, it was largely a falsehood. A lot changed in Snow Leopard. And there were a number of really severe bugs in the early 10.6 versions. The road to hell is paved with good intentions.

Personally, I'm not interested in words and empty promises, I'm only interested in the results.

The Snow Leopard experiment cannot be repeated in 1 year now, because Snow Leopard was the result of a preceding historical context. Every major update generates technical debt, and that technical debt needs to be paid down.

The 10.5.0 - 10.5.8 cycle was 21 months.
The 10.4.0 - 10.4.11 cycle was 18 months.
The 10.3.0 - 10.3.9 cycle was 18 months.

Thus, 10.6.0 was preceded by versions where the technical debt had already been paid down substantially. But iOS has had an annual cycle since the beginning, and macOS has had an annual cycle since 2012. Thus, the current versions have been preceded by many years of technical debt being generated without ever having the time to pay back the debt. Longstanding bugs never get fixed, and they pile up more and more over the years. That's why you can't solve this problem in just 1 year.

Sören Nils Kuklau

> The “0 New Features” keynote slide was largely tongue-in-cheek, and according to the quoted Wikipedia page, it was largely a falsehood. A lot changed in Snow Leopard.

Under the hood, yes, but as far as user-facing features went, there wasn’t that much.

> Of course, the initial release of even such an effort will have bugs that have to be fixed

People romanticize 10.6 Snow Leopard a lot, I find. Until 10.6.2, it had an issue where the wrong user account would be wiped when logging out of a guest account. It had bugfixes involving DVD playback and Ethernet connections. An issue with the trackpad becoming unresponsive. And so forth. There’s a reason it went all the way up to 10.6.8 — not just that it existed for a long time, but also that 10.6.0 really wasn’t as solid as people make it out to be.

Yearly updates should be kept for stylistic changes.
The team that would be responsible for how it looks and feels can make those changes in a year, it's enough time, especially if they can avoid huge changes and only introduce them once in 5-7 years.
All other features can and should be released when they are ready and well tested. They can still announce at WWDC what they will be working on for the next year, but it does not have to be tied to a major OS release.

It seems that in general the whole development process suffers from an old-fashioned waterfall approach.
It's hard to believe that Apple's developers use AUT, with full code coverage, and UI verification.

“So what people want is... marketing??”

Call it what you want, but all other things being equal I'd rather hear Apple say "we're actually trying to fix what's broken" than "look at all these shiny ugly features that no one wants". Even if it only gets us half the way, it would still be better.

>Apple privately considered iOS 13.1 the “actual public release”

And yet if you want the security updates you're forced to upgrade and live with all the new bugs. At least on macOS we still get security and bug fix releases even once the next major release is out. Wish iOS had that!

@bob
Thank you.

Tangent, my major peeve with modern tech reporting is hammering Android on updates (even if often rightfully so) while simultaneously giving Apple a pass for iOS. You cannot get security updates unless you update your entire OS. Warts and all coming along for the ride.

At least Android rightfully decoupled security patches from OS updates which means OS updates are much more important to me. I have Android 5 systems still receiving security updates (from Amazon, given it is Fire OS), but also Android 7 and 8 devices too. Not to mention some features even arrive without OS updates, given changes to Google Play services shipping for older versions of the OS (on Play enabled devices obviously).
