Skylake
As has been the case for many years now, reducing power consumption remains Intel’s top priority for Skylake. Not only does reduced power consumption enable the company’s processors to be used more widely—client Skylake processors will span everything from 4.5W tablet and ultralight systems up to 95W desktop devices, a 20-fold difference in power envelope—it also enables greater performance. Reduce the power used by one part of the chip and the extra thermal headroom (and current draw) can be spent on other parts of the chip; this is the underlying principle of Turbo Boost.
[…]
With this new design, the eDRAM is always coherent, since it is privy to all writes made to main memory, regardless of which core makes them. This also means that it can cache any data, even if it’s stored in memory that is marked as “uncacheable” by the operating system. The design also enables both PCIe devices and the display engine to read from and write to the cache.
[…]
Skylake has some “more of the same” aspects to its power conservation—more individual parts of the processor can have their frequency adjusted or powered down to allow finer tuning of power consumption—though these have been extended. For example, most code either never uses the AVX2 instruction set, or uses it extensively; it’s rare for applications to only use AVX2 every now and then. When faced with workloads that never use AVX2, those instruction units are powered down.
[…]
In Skylake, the power management is more cooperative. The operating system still has some control (for example, it can force a low frequency to extend battery life or, more commonly, set a range of acceptable frequencies), but the processor itself handles the rest. Rather than just choosing between P0 turbo states, the processor can pick between the full range of P states, from the minimum frequency all the way up to P0. […] This means that the processor is quicker to react to new work, boosting the frequency as needed, and also much quicker to cut the frequency when idle.
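The division of labor described above can be sketched roughly as follows. This is only an illustration: the names `OsHints` and `pick_frequency`, the example frequencies, and the linear scaling by load are all assumptions made for the sketch, not Intel's actual interface or policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OsHints:
    """Frequency bounds supplied by the operating system (hypothetical type)."""
    min_mhz: int
    max_mhz: int


def pick_frequency(utilization: float, hints: OsHints) -> int:
    """Pick a frequency inside the OS-supplied range, scaled by current load.

    Linear scaling is an illustrative assumption; real hardware uses its own
    internal heuristics.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if hints.min_mhz > hints.max_mhz:
        raise ValueError("min_mhz must not exceed max_mhz")
    span = hints.max_mhz - hints.min_mhz
    return hints.min_mhz + round(span * utilization)


# The OS allows a range; the processor chooses within it on its own.
normal = OsHints(min_mhz=400, max_mhz=2700)
idle = pick_frequency(0.0, normal)
half = pick_frequency(0.5, normal)
busy = pick_frequency(1.0, normal)

# The OS forces a low frequency to extend battery life by collapsing the range.
battery_saver = OsHints(min_mhz=400, max_mhz=400)
forced_low = pick_frequency(1.0, battery_saver)
```

Here the operating system only expresses a range, and the per-moment decision stays inside `pick_frequency`, which mirrors the idea that the processor can react to load without waiting for the OS.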
Last month, a leaked Intel slide deck revealed that “Y” series Skylake processors appropriate for the 12-inch Retina MacBook will have up to 17% faster CPU performance, up to 41% faster Intel HD graphics and up to 1.4 hours longer battery life compared to current-generation Core M architecture.
Update (2015-09-02): Ian Cutress (comments):
All of the Core M processors are launching today, as are the i3/i5/i7 models and two new Xeon mobile processors. From a power perspective this means Intel is releasing everything from the 4.5W ultra-mobile Core M through the large 65W desktop models, along with the previously released 91W desktop SKUs.
The most interesting thing to me is that Intel apparently stopped publishing transistor counts starting with the 14nm node.
This is significant because as structure sizes become smaller, the restrictions on possible layouts (so-called DRCs, design rule constraints) become ever stricter. For example, you can't just place wires wherever you want; you have to take into account their relationship to other wires. With stricter rules, the end result may be that the effective scaling achieved is worse than what the structure size suggests, because the rules force a lower density of transistors.
Intel announced a number of new 45-watt “H-Series” processors, but none with the higher-end Iris Pro graphics Apple uses in the 15" Retina MacBook Pro. Skylake H-Series chips with Iris Pro graphics are not expected to launch until early 2016, and Intel has yet to release detailed specs on these chips.