Why “utf8” in MySQL Is Not UTF-8
Florian Köhler (via Ken Harris):
For whatever reason, a few months later, in September 2002, a MySQL developer decided to push a one-byte commit, "UTF8 now works with up to 3 byte sequences only", to the repository and change the allowed bytes from six to three.
Since then, the character set called utf8 has been a crippled and proprietary variation as it neither conforms to the old nor the new definition (RFC 3629) of UTF-8. The misleading name still causes issues today.[…]
To remediate this mistake MySQL added the utf8mb4 charset in version 5.5.3. utf8mb4 fully implements the current standard. Now utf8 is an alias for utf8mb3 and will be switched to utf8mb4.
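To make the three-byte limit concrete, here is a minimal Python sketch (not from the quoted article) that measures how many bytes each character needs in real UTF-8. The helper names max_utf8_bytes and fits_in_utf8mb3 are hypothetical, and the check only approximates what a utf8mb3 column can hold, on the assumption that the limit is simply three bytes per character.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the length in bytes of the longest single-character UTF-8 encoding in text."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_in_utf8mb3(text: str) -> bool:
    """Hypothetical check: True if no character needs more than three UTF-8 bytes."""
    return max_utf8_bytes(text) <= 3


# Sample strings: Latin with an umlaut, CJK, and an emoji outside the
# Basic Multilingual Plane (U+1F600).
samples = ["K\u00f6hler", "\u65e5\u672c\u8a9e", "\U0001F600"]

for sample in samples:
    print(repr(sample), max_utf8_bytes(sample), fits_in_utf8mb3(sample))
```

Characters in the Basic Multilingual Plane need at most three bytes, so they pass the check. Anything above U+FFFF, such as most emoji, needs four bytes and is the kind of text that calls for utf8mb4.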
Update (2022-01-13): See also: Hacker News.