Sunday, August 3, 2003

Tokenizing in Cocoa

I don’t have numbers, but I was somewhat surprised to discover that, for my use, it’s faster to convert a string to a UTF-8 buffer and feed it to PCRE than it is to use NSScanner and NSCharacterSet. If you do end up using NSScanner, be sure to use -scanUpToCharactersFromSet:intoString: rather than -scanCharactersFromSet:intoString:. The latter spends a lot of time inverting the character set.
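As an illustration of the NSScanner route, here is a minimal sketch of a tokenizer that relies only on -scanUpToCharactersFromSet:intoString:. The function name TokensFromString and the choice to treat delimiters as the scanner's characters to be skipped are assumptions made for this example, not something described above. It makes no performance claim of its own, so measure it against your own input.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): splits `string` into
// tokens separated by any run of characters in `delimiters`.
static NSArray *TokensFromString(NSString *string, NSCharacterSet *delimiters)
{
    NSMutableArray *tokens = [NSMutableArray array];
    NSScanner *scanner = [NSScanner scannerWithString:string];

    // Let the scanner skip delimiter runs itself, so that
    // -scanCharactersFromSet:intoString: is never called.
    [scanner setCharactersToBeSkipped:delimiters];

    while (![scanner isAtEnd]) {
        NSString *token = nil;
        // Reads everything up to the next delimiter (or the end of the string).
        if ([scanner scanUpToCharactersFromSet:delimiters intoString:&token]) {
            [tokens addObject:token];
        } else {
            break; // Defensive: nothing was scanned, so stop rather than loop forever.
        }
    }
    return tokens;
}
```

Calling TokensFromString(@"foo, bar,,baz ", [NSCharacterSet characterSetWithCharactersInString:@", "]) should yield the three tokens foo, bar, and baz; empty tokens between adjacent delimiters are dropped because delimiter runs are skipped as a unit.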
