Why GNU grep Is Fast
Moreover, GNU grep AVOIDS BREAKING THE INPUT INTO LINES. Looking for newlines would slow grep down by a factor of several times, because to find the newlines it would have to look at every byte!
So instead of using line-oriented input, GNU grep reads raw data into a large buffer, searches the buffer using Boyer-Moore, and only when it finds a match does it go and look for the bounding newlines. (Certain command line options like -n disable this optimization.)
1 Comment RSS · Twitter
Nice post. I knew the basics, but this post actually made me look up and read wikipedia on Boyer-Moore, very interesting.