{"id":40676,"date":"2023-09-18T15:20:34","date_gmt":"2023-09-18T19:20:34","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=40676"},"modified":"2023-09-18T15:21:10","modified_gmt":"2023-09-18T19:21:10","slug":"apples-new-transformer-powered-predictive-text-model","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2023\/09\/18\/apples-new-transformer-powered-predictive-text-model\/","title":{"rendered":"Apple&rsquo;s New Transformer-Powered Predictive Text Model"},"content":{"rendered":"<p><a href=\"https:\/\/jackcook.com\/2023\/09\/08\/predictive-text.html\">Jack Cook<\/a> (via <a href=\"https:\/\/news.ycombinator.com\/item?id=37541093\">Hacker News<\/a>):<\/p>\n<blockquote cite=\"https:\/\/jackcook.com\/2023\/09\/08\/predictive-text.html\"><p>The feature will occasionally suggest more than one word at a time, but this is generally limited to instances where the upcoming words are extremely obvious, similar to the autocomplete in Gmail.<\/p><p>[&#8230;]<\/p><p>I have to say that this vocabulary file strikes me as pretty unique, but it&rsquo;s definitely not out of the question for a language model deployed in this setting.\nI&rsquo;ve personally never seen emojis featured so prominently in a language model&rsquo;s tokenizer, but <a href=\"https:\/\/arxiv.org\/abs\/2007.15779\">existing<\/a> <a href=\"https:\/\/arxiv.org\/abs\/2303.17564\">research<\/a> has shown that domain-specific models and tokenizers can drastically improve downstream model performance.\nSo it makes sense that a model trained for use in things like text messages, in which emojis and contractions will be used a lot, would prioritize them.<\/p><p>[&#8230;]<\/p><p>GPT-2 has four main parts: token embeddings, positional encodings, a series of 12-48 decoder blocks, and an output layer.\nThe network described by <code>unilm_joint_cpu<\/code> appears to be the same, except with only 6 decoder blocks.\nMost of the layers within each decoder block have names like 
<code>gpt2_transformer_layer_3d<\/code>, which would also seem to suggest it&rsquo;s based on a GPT-2 architecture.<\/p><p>From my calculations based on sizes of each layer, Apple&rsquo;s predictive text model appears to have about 34 million parameters, and it has a hidden size of 512 units.\nThis makes it much smaller than even the smallest version of GPT-2.<\/p><\/blockquote>\n<p>The early reports about auto-correct in iOS 17 and macOS 14 seem to be positive. I&rsquo;m cautiously optimistic that it will fix the biggest problems I had with the old system: it would suggest misspelled words and even change correctly entered words into mistakes.<\/p>\n\n<p>Previously:<\/p>\n<ul>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2023\/09\/18\/ios-17\/\">iOS 17<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2023\/07\/13\/macos-14-sonoma-public-beta\/\">macOS 14 Sonoma Public Beta<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2023\/06\/01\/wwdc-2023-wish-lists\/\">WWDC 2023 Wish Lists<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2022\/04\/29\/autocorrect-explained\/\">Autocorrect Explained: Why Your iPhone Adds Annoying Typos While Fixing Others<\/a><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Jack Cook (via Hacker News): The feature will occasionally suggest more than one word at a time, but this is generally limited to instances where the upcoming words are extremely obvious, similar to the autocomplete in Gmail.[&#8230;]I have to say that this vocabulary file strikes me as pretty unique, but it&rsquo;s definitely not out of 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"2023-09-18T19:20:37Z","apple_news_api_id":"3a282c8d-2e3a-4eb2-b645-6e4ccf7832fd","apple_news_api_modified_at":"2023-09-18T19:21:12Z","apple_news_api_revision":"AAAAAAAAAAAAAAAAAAAAAA==","apple_news_api_share_url":"https:\/\/apple.news\/AOigsjS46TrK2RW5Mz3gy_Q","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[2],"tags":[1351,1623,257,31,2321,30,2223],"class_list":["post-40676","post","type-post","status-publish","format-standard","hentry","category-technology","tag-artificial-intelligence","tag-auto-correction","tag-emoji","tag-ios","tag-ios-17","tag-mac","tag-macos-13-ventura"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/40676","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=40676"}],"version-history":[{"count":2,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/40676\/revisions"}],"predecessor-version":[{"id":40680,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/40676\/revisions\/40680"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/med
ia?parent=40676"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=40676"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=40676"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}