Monday, April 30, 2012

UTF-8 Everywhere

Pavel Radzivilovsky, Yakov Galka, and Slava Novgorodov (via Hacker News):

UTF-16 is the worst of both worlds—variable length and too wide. It exists for historical reasons, adds a lot of confusion and will hopefully die out.
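To make the "variable length and too wide" point concrete, here is a minimal sketch that compares storage sizes for the same text in both encodings. The helper name `sizes` is made up for this illustration and is not from the quoted article.

```python
def sizes(text: str) -> dict:
    """Report how much space `text` needs in UTF-8 and UTF-16.

    `sizes` is a hypothetical helper written for this illustration.
    """
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return {
        "code_points": len(text),
        "utf8_bytes": len(utf8),
        "utf16_bytes": len(utf16),
        "utf16_units": len(utf16) // 2,  # each UTF-16 code unit is 2 bytes
    }


# ASCII text: UTF-16 spends two bytes on every character.
print(sizes("hello"))

# A character outside the Basic Multilingual Plane: UTF-16 needs a
# surrogate pair, so one code point becomes two code units.
print(sizes("\U0001F600"))
```

For ASCII, UTF-16 doubles the size, and for the emoji it still needs two code units, so code that assumes one unit per character breaks in either encoding.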

Portability, cross-platform interoperability and simplicity are more important than interoperability with existing platform APIs. So, the best approach is to use UTF-8 narrow strings everywhere and convert them back and forth on Windows before calling APIs that accept strings.
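The "convert at the boundary" idea can be sketched roughly as follows. The program keeps UTF-8 bytes internally and produces a NUL-terminated UTF-16LE buffer only when it is about to call a wide-character API. The names `to_wide` and `from_wide` are hypothetical; in a C or C++ program on Windows, the Win32 functions `MultiByteToWideChar` and `WideCharToMultiByte` would typically do this job.

```python
def to_wide(utf8_bytes: bytes) -> bytes:
    """Convert UTF-8 bytes to a NUL-terminated UTF-16LE buffer.

    Hypothetical helper: stands in for the conversion done just before
    calling an API that expects wide strings.
    """
    text = utf8_bytes.decode("utf-8")  # raises UnicodeDecodeError on bad input
    return text.encode("utf-16-le") + b"\x00\x00"


def from_wide(wide_bytes: bytes) -> bytes:
    """Convert a UTF-16LE buffer returned by such an API back to UTF-8."""
    text = wide_bytes.decode("utf-16-le")
    text = text.split("\x00", 1)[0]  # stop at the first NUL terminator
    return text.encode("utf-8")


# The rest of the program only ever sees the UTF-8 form.
path = "C:\\données\\файл.txt".encode("utf-8")
wide = to_wide(path)
assert from_wide(wide) == path
```

Keeping the conversion in two small functions means the wide representation never leaks past the call site.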

If you’re a Cocoa programmer, be sure you’re familiar with -[NSString rangeOfComposedCharacterSequenceAtIndex:].
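For readers outside Cocoa, here is a simplified sketch of what asking for a composed character sequence means. It is only an approximation: the function `composed_range` is hypothetical, it handles just a base character followed by combining marks, and Python indexes code points whereas NSString indexes UTF-16 code units. Full grapheme cluster rules (emoji sequences, Hangul jamo, and so on) are not covered.

```python
import unicodedata
from typing import Tuple


def composed_range(text: str, index: int) -> Tuple[int, int]:
    """Return (location, length) of the base-plus-combining-marks run
    that contains `index`.

    Hypothetical, simplified helper: it only treats characters with a
    nonzero canonical combining class as part of the preceding character.
    """
    start = index
    # Walk back from a combining mark to its base character.
    while start > 0 and unicodedata.combining(text[start]):
        start -= 1
    end = index + 1
    # Extend forward over any combining marks that follow.
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    return start, end - start


s = "e\u0301x"  # "e" + COMBINING ACUTE ACCENT, then "x"
print(composed_range(s, 0))
print(composed_range(s, 1))
print(composed_range(s, 2))
```

Slicing with the returned range rather than a raw index avoids cutting an accented letter in half.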
