Tuesday, June 27, 2017

Bug in Skylake and Kaby Lake Hyper-threading

Henrique de Moraes Holschuh:

This advisory is about a processor/microcode defect recently identified on Intel Skylake and Intel Kaby Lake processors with hyper-threading enabled. This defect can, when triggered, cause unpredictable system behavior: it could cause spurious errors, such as application and system misbehavior, data corruption, and data loss.

It was brought to the attention of the Debian project that this defect is known to directly affect some Debian stable users (refer to the end of this advisory for details), thus this advisory.

Please note that the defect can potentially affect any operating system (it is not restricted to Debian, and it is not restricted to Linux-based systems). It can be either avoided (by disabling hyper-threading), or fixed (by updating the processor microcode).

Due to the difficult detection of potentially affected software, and the unpredictable nature of the defect, all users of the affected Intel processors are strongly urged to take action as recommended by this advisory.

Via Tom Harrington:

Check your Mac CPU with “sysctl machdep.cpu” and compare to this. [Skylake list, Kaby Lake list]
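As a rough illustration of that check, the following Python sketch turns the `key: value` lines printed by `sysctl machdep.cpu` into a dictionary so the family, model, and stepping can be compared against the lists by hand. The helper name `parse_sysctl` and the sample text are assumptions made for this example; the sample values are copied from the i7-7700K output shown later in this post.

```python
def parse_sysctl(text):
    """Parse 'key: value' lines (as printed by `sysctl machdep.cpu`) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; values may themselves contain other punctuation.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Sample input (hypothetical capture; values taken from the output quoted in this post).
sample = """machdep.cpu.brand_string: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
machdep.cpu.family: 6
machdep.cpu.model: 158
machdep.cpu.stepping: 9"""

cpu = parse_sysctl(sample)
print(cpu["machdep.cpu.brand_string"])
print("family", cpu["machdep.cpu.family"],
      "model", cpu["machdep.cpu.model"],
      "stepping", cpu["machdep.cpu.stepping"])
```

On a Mac, the text could instead come from running the command itself, for example with `subprocess.run(["sysctl", "machdep.cpu"], capture_output=True, text=True).stdout`.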

Developers who are concerned can use Instruments to disable hyperthreading until reboot. See Instruments prefs.

Unfortunately, the “Hardware Multi-Threading” setting in Instruments does not persist after the Mac reboots or sleeps, so you have to keep re-applying it. The good news is that Apple should be able to offer a software update that applies Intel’s microcode patch.

Update (2017-07-05): Xavier Leroy (via Hacker News):

Late April 2016, shortly after OCaml 4.03.0 was released, a Serious Industrial OCaml User (SIOU) contacted me privately with bad news: one of their applications, written in OCaml and compiled with OCaml 4.03.0, was crashing randomly. Not at every run, but once in a while it would segfault, at different places within the code. Moreover, the crashes were only observed on their most recent computers, those running Intel Skylake processors.

[…]

SIOU didn’t take my suggestions well, arguing (correctly) that they were running other CPU- and memory-intensive tests on their Skylake machines and only the ones written in OCaml would crash. Clearly, they thought their hardware was perfect and the bug was in my software. Great. I still managed to cajole them into running a memory test, which came back clean, but my suggestion about turning HT off was ignored. (Too bad, because this would have saved us much time.)

Update (2017-12-28): I contacted Apple in early August and was told that the bug would be addressed in High Sierra. However, as of macOS 10.13.2 there is still no microcode update for the affected processors. Mine still shows:

machdep.cpu.brand_string: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
machdep.cpu.family: 6
machdep.cpu.model: 158
machdep.cpu.extmodel: 9
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 9
machdep.cpu.signature: 591593
machdep.cpu.microcode_version: 88
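The fields above are related: `machdep.cpu.signature` is the decimal form of the CPUID signature, and the family, model, and stepping can be recovered from it. Here is a minimal Python sketch of that decoding, assuming the standard Intel CPUID bit layout (stepping in bits 0-3, model in bits 4-7, family in bits 8-11, extended model in bits 16-19, extended family in bits 20-27). The function name `decode_signature` is made up for this example.

```python
def decode_signature(signature):
    """Split a CPUID signature into its fields (assumes the standard Intel layout)."""
    stepping = signature & 0xF
    model = (signature >> 4) & 0xF
    family = (signature >> 8) & 0xF
    ext_model = (signature >> 16) & 0xF
    ext_family = (signature >> 20) & 0xFF
    # For family 6 and 15, the displayed model combines the extended and base model.
    display_model = (ext_model << 4) | model if family in (6, 15) else model
    # The extended family is only added when the base family is 15.
    display_family = family + ext_family if family == 15 else family
    return {
        "family": display_family,
        "model": display_model,
        "stepping": stepping,
        "extmodel": ext_model,
        "extfamily": ext_family,
    }


fields = decode_signature(591593)  # 591593 == 0x906E9
print(hex(591593), fields)
print("microcode revision:", hex(88))  # sysctl prints the revision in decimal
```

For the signature above this gives family 6, model 158, and stepping 9, matching the separate sysctl fields. The microcode version 88 is 0x58 in hexadecimal, which is the form usually used when microcode revisions are listed.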

I spent a couple of weeks in December going back and forth with Apple’s support people, trying to get an answer. None of the advisors had even heard of the bug. Eventually, a senior advisor escalated the issue to engineering. The engineer confirmed that Apple is aware of the bug but does not yet have a fix. They will not say when or even if Apple will ship a fix. Apple’s policy is not to comment on unreleased products.

Looking at Intel’s spec sheet, it appears that KBL095/KBW095 were not fixed as of August, so perhaps Apple is waiting on Intel for the Kaby Lake fix. (Intel has had an update for Skylake CPUs since at least June, but it’s not clear to me whether Apple has shipped it.)

For now, Apple recommends keeping hyper-threading turned off to prevent data corruption. This can be done by unchecking the “Preferences ‣ CPU ‣ Hardware Multi-Threading” setting in Instruments. Unfortunately, this setting is not preserved if you sleep or restart your Mac. Apple confirmed that there is no way to make the setting stick. (Years ago, the nvram command could be used.)

My recommendation is to set the Mac so that the computer never goes to sleep and to add Instruments as a login item so that you remember to disable hyper-threading when your Mac starts up.

It’s not clear to me whether the Xeon W processors in the iMac Pro are affected.

See also: ArsTechnica, Hacker News, MacRumors, OCaml Mantis, Reddit, StackExchange.


[…] Between the faster processor and SSD, everything feels faster. Building SpamSieve takes 1m15s (while the iMac was doing some Spotlight stuff and FileVault encryption in the background) vs. 1m54s on the MacBook Pro. Lightroom is also much faster at importing photos and building previews. Unfortunately for a desktop computer, the iMac seems just as prone to turning on its fans as the MacBook Pro. It’s fast, but just opening up Xcode makes it sound like it’s really working hard. And there is the aforementioned Kaby Lake hyper-threading bug. […]

[…] I asked Apple about the Skylake/Kaby Lake hyper-threading bug, I was told that it would likely be fixed in macOS 10.13, but I see no evidence that High Sierra […]

[…] notebooks haven’t been updated recently. The iMac was updated in June 2017 and still has a defective processor. The MacBook Pro was updated this July, but the keyboard remains a question mark; we don’t […]
