ECC RAM
James Hamilton in 2009 (via Accidental Tech Podcast):
An excellent paper was just released that puts hard data behind this point and shows conclusively that ECC is absolutely needed. In DRAM Errors in the Wild: A Large Scale Field Study, Bianca Schroeder, Eduardo Pinheiro, and Wolf-Dietrich Weber show conclusively that you really do need ECC memory in server applications. Wolf was also an author of the excellent Power Provisioning in a Warehouse-Sized Computer that I mentioned in my blog post Slides From Conference on Innovative Data Systems Research where the authors described a technique to over-sub subscribe data center power.
I continue to believe that client systems should also be running ECC and strongly suspect that a great many kernel and device driver failures are actually the result of memory fault. We don’t have the data to prove it conclusively from a client population but I’ve long suspected that the single most effective way for Windows to reduce their blue screen rate would be to require ECC as a required feature for Windows Hardware Certification.