{"id":12760,"date":"2015-11-06T11:15:23","date_gmt":"2015-11-06T16:15:23","guid":{"rendered":"http:\/\/mjtsai.com\/blog\/?p=12760"},"modified":"2015-11-06T11:15:23","modified_gmt":"2015-11-06T16:15:23","slug":"why-is-swifts-string-api-so-hard","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2015\/11\/06\/why-is-swifts-string-api-so-hard\/","title":{"rendered":"Why Is Swift&rsquo;s String API So Hard?"},"content":{"rendered":"<p><a href=\"https:\/\/www.mikeash.com\/pyblog\/friday-qa-2015-11-06-why-is-swifts-string-api-so-hard.html\">Mike Ash<\/a>:<\/p>\n<blockquote cite=\"https:\/\/www.mikeash.com\/pyblog\/friday-qa-2015-11-06-why-is-swifts-string-api-so-hard.html\"><p>Incidentally, I think that representing all these different concepts as a single string type is a mistake. Human-readable text, file paths, SQL statements, and others are all conceptually different, and this should be represented as different types at the language level. I think that having different conceptual kinds of strings be distinct types would eliminate a lot of bugs.<\/p><p>[&#8230;]<\/p><p>Swift&rsquo;s <code>String<\/code> type takes a different approach. It has no canonical representation, and instead provides views on various representations of the string. This lets you use whichever representation makes the most sense for the task at hand.<\/p><p>[&#8230;]<\/p><p>Going from an arbitrary sequence of UTF-16 code units back to a <code>String<\/code> is pretty obscure. <code>UTF16View<\/code> has no public initializers and few mutating functions. The solution is to use the global <code>transcode<\/code> function, which works with the <code>UnicodeCodecType<\/code> protocol. There are three implementations of this protocol: <code>UTF8<\/code>, <code>UTF16<\/code>, and <code>UTF32<\/code>. The <code>transcode<\/code> function can be used to convert between them. It&rsquo;s pretty gnarly, though. For the input, it takes a <code>GeneratorType<\/code> which produces the input, and for the output it takes a function which is called for each unit of output. This can be used to build up a string piece by piece by converting to <code>UTF32<\/code>, then converting each <code>UTF-32<\/code> code unit to a <code>UnicodeScalar<\/code> and appending it to a <code>String<\/code>[&#8230;]<\/p><p>[&#8230;]<\/p><p>The various views are all indexable collections, but they are very much <em>not<\/em> arrays. The index types are weird custom <code>struct<\/code>s. This means you can&rsquo;t index views by number [&#8230;] Instead, you have to start with either the collection&rsquo;s <code>startIndex<\/code> or <code>endIndex<\/code>, then use methods like <code>successor()<\/code> or <code>advancedBy()<\/code> to move around [&#8230;] Why not make it easier, and allow indexing with an integer? It&rsquo;s essentially Swift&rsquo;s way of reinforcing the fact that this is an expensive operation.<\/p><\/blockquote>","protected":false},"excerpt":{"rendered":"<p>Mike Ash: Incidentally, I think that representing all these different concepts as a single string type is a mistake. Human-readable text, file paths, SQL statements, and others are all conceptually different, and this should be represented as different types at the language level. I think that having different conceptual kinds of strings be distinct types [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"","apple_news_api_id":"","apple_news_api_modified_at":"","apple_news_api_revision":"","apple_news_api_share_url":"","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[46,71,901,258],"class_list":["post-12760","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-languagedesign","tag-programming","tag-swift-programming-language","tag-unicode"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/12760","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=12760"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/12760\/revisions"}],"predecessor-version":[{"id":12761,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/12760\/revisions\/12761"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=12760"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=12760"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=12760"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}