Archive for July 8, 2019

Monday, July 8, 2019

Cloudflare Outage Caused by Regular Expression

John Graham-Cumming:

Unfortunately, one of these rules contained a regular expression that caused CPU to spike to 100% on our machines worldwide. This 100% CPU spike caused the 502 errors that our customers saw. At its worst traffic dropped by 82%.

We were seeing an unprecedented CPU exhaustion event, which was novel for us as we had not experienced global CPU exhaustion before.

Update (2019-07-15): John Graham-Cumming (Hacker News):

Although the regular expression itself is of interest to many people (and is discussed more below), the real story of how the Cloudflare service went down for 27 minutes is much more complex than “a regular expression went bad”. We’ve taken the time to write out the series of events that led to the outage and kept us from responding quickly. And, if you want to know more about regular expression backtracking and what to do about it, then you’ll find it in an appendix at the end of this post.
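To make the backtracking point concrete, here is a minimal sketch in Python. It is not Cloudflare's rule or their regex engine: `count_steps` is a hypothetical toy matcher that supports only literals, `.`, and `*`, and the pattern `.*.*=.*` is chosen purely for illustration. It counts how many match attempts a naive backtracking matcher makes.

```python
def count_steps(pattern: str, text: str) -> tuple:
    """Match `pattern` against all of `text` with naive backtracking.

    Toy matcher (an assumption for illustration): supports literal
    characters, '.', and '*' applied to the preceding single token.
    Returns (matched, number_of_match_attempts).
    """
    steps = 0

    def match(p: int, t: int) -> bool:
        nonlocal steps
        steps += 1
        if p == len(pattern):
            return t == len(text)
        starred = p + 1 < len(pattern) and pattern[p + 1] == "*"
        first_ok = t < len(text) and pattern[p] in (".", text[t])
        if starred:
            # Greedy: consume one more character if possible, and on
            # failure backtrack by giving up on this starred token.
            return (first_ok and match(p, t + 1)) or match(p + 2, t)
        return first_ok and match(p + 1, t + 1)

    return match(0, 0), steps


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        matched, steps = count_steps(".*.*=.*", "x" * n)
        print(n, matched, steps)
```

For an input that contains no `=`, the two leading `.*` tokens try every way of splitting the text between them before giving up, so the step count in this toy matcher grows roughly with the square of the input length. Production engines differ in detail, but this is the general shape of the problem that backtracking matchers can run into.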

Malformed iMessage Could Cause iPhone Boot Loop

Project Zero (via Hacker News):

The method -[IMBalloonPluginDataSource individualPreviewSummary] in IMCore can throw an NSException due to a malformed message containing a property with key IMExtensionPayloadLocalizedDescriptionTextKey with a value that is not a NSString. This method calls [IMBalloonPluginDataSource _summaryText] which returns the property assuming it is a string, but this is not checked. The calling method then calls -[IMBalloonPluginDataSource _replaceHandleWithContactNameInString:] which calls im_handleIdentifiers on the NSString which is really an NSNumber, which throws an exception as the selector does not exist in that class.

On a Mac, this causes soagent to crash and respawn, but on an iPhone, this code is in Springboard. Receiving this message will cause Springboard to crash and respawn repeatedly, causing the UI not to be displayed and the phone to stop responding to input. This condition survives a hard reset, and causes the phone to be unusable as soon as it is unlocked. The only way I could find to fix the phone is to reboot into recovery mode and do a restore. This causes the data on the device to be lost though.

The bug is fixed in macOS 10.13.4 and iOS 12.3, but what about customers on previous OS versions? Now that the bug is known, they could be targeted. And it doesn’t seem like Apple could intercept the bad messages at the server level without decrypting private messages.

NSSecureCoding can’t really protect against this kind of mistake. Maybe Swift could have, depending on how the code was written.
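The general failure mode can be sketched in Python, which, like Objective-C, only discovers at run time that a value does not respond to a method. This is an analogy and not Apple's code: the function names and the use of `strip()` are assumptions, and only the payload key comes from the report.

```python
from typing import Optional

SUMMARY_KEY = "IMExtensionPayloadLocalizedDescriptionTextKey"


def summary_unchecked(payload: dict) -> str:
    # Mirrors the bug: assumes the value is a string and calls a
    # string-only method on it. A number here raises AttributeError.
    return payload[SUMMARY_KEY].strip()


def summary_checked(payload: dict) -> Optional[str]:
    # Validates the type of untrusted input before using it, and
    # treats anything that is not a string as "no summary".
    value = payload.get(SUMMARY_KEY)
    return value.strip() if isinstance(value, str) else None


if __name__ == "__main__":
    malformed = {SUMMARY_KEY: 42}
    print(summary_checked(malformed))
    try:
        summary_unchecked(malformed)
    except AttributeError as error:
        print("unchecked version failed:", error)
```

In Swift, a conditional cast such as `as? String` plays the role of the `isinstance` check, but only if the code is written to use it, which is why the language alone would not have guaranteed a fix.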

I recently ran into a similar bug with AVPlayer, where using the scroll wheel calls an internal method with the wrong data type where a number was expected, causing an exception and alert window. I’m sure this sort of thing happens all the time throughout iOS/macOS and apps, but rarely are the potential consequences so dire.


Post-Approval App Review

NSErrorWtf (via Michael Love):

He said that app review tends to take around 10-15 minutes. App review will go “in review” 4-5 hours before the first reviewer actually looks at it. Then someone will launch it and all the diagnostic logs start trickling in. They’ll play with it for a bit. Launch/relaunch it a bit. Lots of force-quits.

The INTERESTING thing that he said started a few weeks ago was that they would notice updates would get approved/released on one day. Then consistently ~48 hours after release they’d see the apple review account log in again and poke around.

He suspected this was apple trying to catch app devs performing “review fraud”, where the app’s behavior changes with a server flag at a later date to try and bypass app store guidelines and such.

MacUpdater 1.4.15

CoreCode (tweet, via Leo):

While our users tell us that MacUpdater is the best app they have found in years, Apple rejected it from inclusion into the Mac App Store, because it is not ‘useful enough’. Meanwhile, Apple continues to distribute dozens of apps that are malware, or are from known Malware vendors!

I had not heard of this app, perhaps because it’s not listed on MacUpdate, either. (Besides the similar name, it competes with their Desktop product.)

It seems genuinely useful, though:

Nothing could be easier than finding out which of your apps are out-of-date with MacUpdater. Just launch it and let it scan your apps. You’ll see a list of all your apps, and apps with updates are listed in red. There are filter-options to display just outdated apps or ignore apps from being updated. The MacUpdater database has information about the latest versions of more than 30.000 apps (see FAQ).

I think there used to be another app that did this, e.g. by polling Sparkle feeds, but I haven’t heard about it in a long time.
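For what it's worth, here is a rough sketch of how the feed-polling approach might work. It is not how MacUpdater works (it uses its own database). The sample feed, version numbers, and function names are hypothetical, and the XML is inlined so that nothing is fetched over the network. It assumes the feed puts `sparkle:shortVersionString` on each `enclosure` element and that versions are dot-separated integers, neither of which holds for every real appcast.

```python
import xml.etree.ElementTree as ET

SPARKLE_NS = "http://www.andymatuschak.org/xml-namespaces/sparkle"

# Hypothetical appcast contents, inlined instead of downloaded.
SAMPLE_APPCAST = """<rss version="2.0"
     xmlns:sparkle="http://www.andymatuschak.org/xml-namespaces/sparkle">
  <channel>
    <item>
      <title>Version 2.0.9</title>
      <enclosure url="https://example.com/App-2.0.9.zip"
                 sparkle:shortVersionString="2.0.9"/>
    </item>
    <item>
      <title>Version 2.0.10</title>
      <enclosure url="https://example.com/App-2.0.10.zip"
                 sparkle:shortVersionString="2.0.10"/>
    </item>
  </channel>
</rss>"""


def version_key(version: str) -> tuple:
    # Compare numerically so that 2.0.10 sorts after 2.0.9.
    return tuple(int(part) for part in version.split("."))


def latest_version(appcast_xml: str) -> str:
    root = ET.fromstring(appcast_xml)
    attr = "{%s}shortVersionString" % SPARKLE_NS
    versions = [e.get(attr) for e in root.iter("enclosure") if e.get(attr)]
    return max(versions, key=version_key)


def is_outdated(installed: str, appcast_xml: str) -> bool:
    return version_key(installed) < version_key(latest_version(appcast_xml))


if __name__ == "__main__":
    print(latest_version(SAMPLE_APPCAST))
    print(is_outdated("2.0.9", SAMPLE_APPCAST))
```

The numeric comparison matters: comparing the raw strings would rank 2.0.9 above 2.0.10.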

See also: Mark Sealey.

No Engagement Algorithms

Brent Simmons:

These kinds of algorithms optimize for engagement, and the quickest path to engagement is via the drugs outrage and anger — which require, and generate, bigger and bigger hits.

This is what Twitter and Facebook are about — but it’s not right for NetNewsWire. The app puts you in control. You choose the sites and blogs you want to read, and the app reliably shows you their articles sorted by time. That’s it.

Joshua Emmons:

Brent is making a subtle point here:

  1. Algorithms weigh signal.
  2. In the domain of engagement, outrage and anger mask all other signals.
  3. These signals are fatiguing. As Outrage: 5 is normalized, Outrage: 10 is now required to move the needle.

1. and 2. mean it’s not the algorithm’s fault. There’s no way to write an engagement algorithm that doesn’t select for outrage and anger. But 3. means anything that incorporates such an algorithm actually makes us worse people.
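The difference between the two approaches is small in code and large in effect. A minimal sketch, assuming a hypothetical `Article` type with a made-up `reactions` count standing in for an engagement signal, and assuming newest-first ordering for the time-sorted view:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Article:
    title: str
    published: datetime
    reactions: int  # hypothetical engagement signal


def by_time(articles):
    # The reader picks the feeds, and the app only orders by date.
    return sorted(articles, key=lambda a: a.published, reverse=True)


def by_engagement(articles):
    # Whatever provokes the most reaction rises to the top.
    return sorted(articles, key=lambda a: a.reactions, reverse=True)


if __name__ == "__main__":
    articles = [
        Article("Release notes", datetime(2019, 7, 8, 9, 0), reactions=3),
        Article("Angry hot take", datetime(2019, 7, 6, 9, 0), reactions=900),
        Article("Tutorial", datetime(2019, 7, 7, 9, 0), reactions=40),
    ]
    print([a.title for a in by_time(articles)])
    print([a.title for a in by_engagement(articles)])
```

With the time-based sort the oldest item ends up last no matter how loud it is, while the engagement sort puts it first.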

Brent Simmons:

Maybe, though, I could do better. I kind of think not, because I think the problem is a bug in human nature. But let’s say I believed I could do better.

Should I?