Friday, September 1, 2017

Decoding NSASCIIStringEncoding Is Not Strict

Jeff Johnson:

The documentation for NSASCIIStringEncoding is clearly false: “Strict 7-bit ASCII encoding within 8-bit chars; ASCII values 0…127 only.” The NSString.h header file contains the same falsehood:

NSASCIIStringEncoding = 1, /* 0..127 only */

Curiously, though, the CFString.h header file has a more useful comment:

kCFStringEncodingASCII = 0x0600, /* 0..127 (in creating CFString, values
greater than 0x7F are treated as corresponding Unicode value) */
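To make the difference between the two comments concrete, here is a minimal sketch in plain C that models both behaviors. It is not Foundation's or Core Foundation's implementation; the function names `decode_ascii_strict` and `decode_ascii_lenient` are hypothetical. The strict version is what "0..127 only" suggests, and the lenient version is what the CFString.h comment describes, where a byte above 0x7F becomes the Unicode value with the same number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strict model: reject any byte above 0x7F.
   Returns 0 on success, -1 if the input is not pure 7-bit ASCII. */
int decode_ascii_strict(const unsigned char *bytes, size_t len, uint16_t *out)
{
    assert(bytes != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] > 0x7F)
            return -1;              /* what "0..127 only" would imply */
        out[i] = bytes[i];
    }
    return 0;
}

/* Hypothetical lenient model: every byte is widened to the UTF-16 code
   unit with the same numeric value, so 0xE9 becomes U+00E9.
   This never fails, which matches the behavior the CFString.h comment
   describes for string creation. */
int decode_ascii_lenient(const unsigned char *bytes, size_t len, uint16_t *out)
{
    assert(bytes != NULL && out != NULL);
    for (size_t i = 0; i < len; i++)
        out[i] = bytes[i];
    return 0;
}
```

Given the four bytes `'c', 'a', 'f', 0xE9`, the strict model fails while the lenient model yields a final code unit of 0x00E9.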

Another oddity with the documentation for NSString decoding: it only says that -initWithBytes:length:encoding: returns nil if the byte string is too long, but it says that -initWithData:encoding: returns nil if the data is not valid for the encoding. You would think these methods would be consistent.

However, I have found that NSASCIIStringEncoding does work as expected when encoding. That is, if the string is non-ASCII it will give you a nil data unless you request a lossy conversion.
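The encoding direction can be modeled the same way. The sketch below is plain C and only an illustration: `encode_ascii` is a hypothetical name, and substituting `'?'` in lossy mode is an assumption made for the example, since the replacement Foundation actually chooses may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of strict ASCII encoding with an optional lossy mode.
   Returns the number of bytes written, or -1 when a code unit is outside
   0..127 and lossy conversion was not requested (the "nil data" case). */
long encode_ascii(const uint16_t *units, size_t len, bool lossy,
                  unsigned char *out)
{
    assert(units != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (units[i] <= 0x7F) {
            out[i] = (unsigned char)units[i];
        } else if (lossy) {
            out[i] = '?';   /* assumed placeholder, not Foundation's rule */
        } else {
            return -1;      /* strict: refuse non-ASCII input */
        }
    }
    return (long)len;
}
```

With the code units for "café", the strict call returns -1, while the lossy call writes four bytes and succeeds.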

1 Comment

It appears from disassembling the Foundation framework that -initWithData:encoding: simply calls through to -initWithBytes:length:encoding: using the data's bytes and length. The return value documentation for -initWithBytes:length:encoding: doesn't make a lot of sense to me. I suspect those docs are messed up.
