{"id":34655,"date":"2022-01-08T12:41:50","date_gmt":"2022-01-08T17:41:50","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=34655"},"modified":"2022-01-13T16:57:23","modified_gmt":"2022-01-13T21:57:23","slug":"why-utf8-in-mysql-is-not-utf-8","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2022\/01\/08\/why-utf8-in-mysql-is-not-utf-8\/","title":{"rendered":"Why &ldquo;utf8&rdquo; in MySQL Is Not UTF-8"},"content":{"rendered":"<p><a href=\"https:\/\/dev.to\/flokoe\/database-character-sets-and-collations-explained-why-utf8-is-not-utf-8-3h7b\">Florian K&ouml;hler<\/a> (via <a href=\"https:\/\/twitter.com\/KenHatesSoftwar\/status\/1478870833141465088\">Ken Harris<\/a>):<\/p>\n<blockquote cite=\"https:\/\/dev.to\/flokoe\/database-character-sets-and-collations-explained-why-utf8-is-not-utf-8-3h7b\"><p>For whatever reason, a few months later, in September 2002, a MySQL developer decided to push a one-byte commit <a href=\"https:\/\/github.com\/mysql\/mysql-server\/commit\/43a506c0ced0e6ea101d3ab8b4b423ce3fa327d0\">UTF8 now works with up to 3 byte sequences only<\/a> to the repository and change the allowed bytes from six to three.<\/p><p>Since then, the character set called <code>utf8<\/code> has been a crippled and proprietary variation as it neither conforms to the old nor the new definition (<a href=\"https:\/\/datatracker.ietf.org\/doc\/html\/rfc3629\">RFC 3629<\/a>) of UTF-8. The misleading name still causes issues today.<\/p><p>[&#8230;]<\/p><p>To remediate this mistake <a href=\"https:\/\/web.archive.org\/web\/20190201033750\/https:\/\/dev.mysql.com\/doc\/relnotes\/mysql\/5.5\/en\/news-5-5-3.html\">MySQL added the <code>utf8mb4<\/code> charset in version 5.5.3<\/a>. <code>utf8mb4<\/code> fully implements the current standard. 
Now <code>utf8<\/code> is an alias for <code>utf8mb3<\/code> and will be switched to <code>utf8mb4<\/code>.<\/p><\/blockquote>\n\n<p id=\"why-utf8-in-mysql-is-not-utf-8-update-2022-01-13\">Update (2022-01-13): See also: <a href=\"https:\/\/news.ycombinator.com\/item?id=29907551\">Hacker News<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Florian K&ouml;hler (via Ken Harris): For whatever reason, a few months later, in September 2002, a MySQL developer decided to push a one-byte commit UTF8 now works with up to 3 byte sequences only to the repository and change the allowed bytes from six to three. Since then, the character set called utf8 has been a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"2022-01-08T17:41:52Z","apple_news_api_id":"ed412530-3d33-4ada-9724-4e486d4d5abd","apple_news_api_modified_at":"2022-01-13T21:57:26Z","apple_news_api_revision":"AAAAAAAAAAAAAAAAAAAAAA==","apple_news_api_share_url":"https:\/\/apple.news\/A7UElMD0zStqXJE5IbU1avQ","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[143,1189,71,258],"class_list":["post-34655","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-database","tag-mysql","tag-programming","tag-unicode"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/34655","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp
\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=34655"}],"version-history":[{"count":2,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/34655\/revisions"}],"predecessor-version":[{"id":34701,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/34655\/revisions\/34701"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=34655"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=34655"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=34655"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}