How Both TCP and Ethernet Checksums Fail
Evan Jones (via Hacker News):
At Twitter, a team had a unusual failure where corrupt data ended up in memcache. The root cause appears to have been a switch that was corrupting packets. Most packets were being dropped and the throughput was much lower than normal, but some were still making it through. The hypothesis is that occasionally the corrupt packets had valid TCP and Ethernet checksums. One "lucky" packet stored corrupt data in memcache. Even after the switch was replaced, the errors continued until the cache was cleared.
I was very excited to hear about this error, because it is a real-world example of something I wrote about seven years ago: The TCP checksum is weak. However, the Ethernet CRC is strong, so how could a corrupt packet pass both checks? The answer is that the Ethernet CRC is recalculated by switches. If the switch corrupts the packet and it has the same TCP checksum, the hardware blindly recalculates a new, valid Ethernet CRC when it goes out.
1 Comment RSS · Twitter
There is an RFC that compares different checksumming algorithms which was written when iSCSI was being finalized (~2002):
In this memo, we attempt to give some estimates for the probability
of undetected errors to facilitate the selection of an error
detection code for the Internet Protocol Small Computer System
Interface (iSCSI).We will also attempt to compare Cyclic Redundancy Checks (CRCs) with
other checksum forms (e.g., Fletcher, Adler, weighted checksums), as
permitted by available data.
https://tools.ietf.org/html/rfc3385
Another interesting RFC is how the checksum was changed after SCTP was supposedly “finalized”, but before it became wide spread (relatively speaking, as it’s not really used a lot):
Stream Control Transmission Protocol (SCTP) currently uses an Adler-
32 checksum. For small packets Adler-32 provides weak detection of
errors. This document changes that checksum and updates SCTP to use
a 32 bit CRC checksum.
https://tools.ietf.org/html/rfc3309
They went from Adler-32 to CRC-32c. iSCSI can negotiate the algorithm, but CRC-32c is a MUST in the base RFC.
Back in 2004 Koopman and Chakravarty did a survey for use in embedded networks:
Moreover, even if a good published polynomial is available, there is generally no published guidance on what range of data word lengths it is good for nor, for that matter, quantitative data to help dis- tinguish that good polynomial from any competing pub- lished or standardized “bad” polynomials.
This paper presents the first exhaustive survey of all CRC polynomials from 3 bits to 15 bits, and discusses 16-bit polynomials as well. A methodology for selecting generically “good” CRC polynomials is based on achiev- ing maximum Hamming Distance for the longest possible data word sizes and other performance considerations.
http://users.ece.cmu.edu/~koopman/roses/dsn04/koopman04_crc_poly_embedded.pdf