Monday, March 24, 2008

RegexKitLite

RegexKitLite, from the developer of RegexKit, uses the ICU engine instead of PCRE and has a highly simplified API. This makes it very small (since ICU is built into Mac OS X) and gives it better support for Unicode. It’s also potentially more efficient, since ICU can often use an NSString’s UTF-16 buffer rather than creating a separate UTF-8 buffer, as would be required for PCRE. ICU’s regular expression syntax is not as rich as PCRE’s, however.
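To make the buffer point concrete, here is a minimal sketch in Python rather than Objective-C. It does not use RegexKitLite, ICU, or PCRE. A Python `str` stands in for an NSString, and the helper names are hypothetical. It only shows why a UTF-8 matcher needs a transcoded copy of the string, and why its byte offsets have to be mapped back to the UTF-16 code-unit indexes that an NSRange uses.

```python
# Illustration only: a Python str stands in for an NSString.
text = "naïve café: 10€"

# NSString exposes its contents as UTF-16 code units, the same unit
# ICU's regex engine works in, so no conversion is needed there.
utf16 = text.encode("utf-16-le")

# A UTF-8 matcher (PCRE, in the post's comparison) needs a transcoded copy.
utf8 = text.encode("utf-8")


def utf16_units(s: str) -> int:
    """Return the number of UTF-16 code units in s."""
    return len(s.encode("utf-16-le")) // 2


def utf8_offset_to_utf16(buf: bytes, byte_offset: int) -> int:
    """Map a byte offset in a UTF-8 buffer to a UTF-16 code-unit index.

    Hypothetical helper: assumes byte_offset falls on a character boundary.
    """
    return utf16_units(buf[:byte_offset].decode("utf-8"))


# Pretend a UTF-8 matcher reported a match at this byte offset.
byte_offset = utf8.find("10€".encode("utf-8"))

# The caller still has to translate it into NSRange-style units.
unit_offset = utf8_offset_to_utf16(utf8, byte_offset)

print(len(utf16) // 2, len(utf8), byte_offset, unit_offset)
```

The two offsets differ as soon as the text contains non-ASCII characters, so the UTF-8 route costs both a copy and an offset translation on every match.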

1 Comment

Note that Apple has discouraged use of the included ICU, which is why it doesn't come with headers. I hope they do begin to support ICU at some point, since it has many useful utilities. Character encoding detection, for one, would be very useful when dealing with legacy data.
