{"id":35971,"date":"2022-05-24T15:04:32","date_gmt":"2022-05-24T19:04:32","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=35971"},"modified":"2022-05-24T15:04:32","modified_gmt":"2022-05-24T19:04:32","slug":"the-apple-gpu-and-the-impossible-bug","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2022\/05\/24\/the-apple-gpu-and-the-impossible-bug\/","title":{"rendered":"The Apple GPU and the Impossible Bug"},"content":{"rendered":"<p><a href=\"https:\/\/rosenzweig.io\/blog\/asahi-gpu-part-5.html\">Alyssa Rosenzweig<\/a> (<a href=\"https:\/\/news.ycombinator.com\/item?id=31367365\">Hacker News<\/a>):<\/p>\n<blockquote cite=\"https:\/\/rosenzweig.io\/blog\/asahi-gpu-part-5.html\"><p>The buffer we&rsquo;re chasing, the &ldquo;tiled vertex buffer&rdquo;, can overflow. To cope, the GPU stops accepting new geometry, renders the existing geometry, and restarts rendering.\n\nSince partial renders hurt performance, Metal application developers need to know about them to optimize their applications. <\/p><p>[&#8230;]<\/p><p>When a partial render is possible, there are two &ldquo;load&rdquo; programs. One writes the clear colour or loads the framebuffer, depending on the application setting. We understand this one. The other <em>always<\/em> loads the framebuffer.<\/p><p>&#8230;Always loads the framebuffer, as in, for loading back with a partial render even if there is a clear at the start of the frame?<\/p><p>[&#8230;]<\/p><p>Doing so, Metal fails in a similar way. That means we&rsquo;re at the root cause. Looking at our own driver code, we don&rsquo;t specify <em>any<\/em> program for this partial render load. Up until now, that&rsquo;s worked okay. If the parameter buffer is never overflowed, this program is unused. As soon as a partial render is required, however, failing to provide this program means the GPU dereferences a null pointer and faults. That explains our GPU faults at the beginning.<\/p><\/blockquote>\n\n<p>Previously:<\/p>\n<ul>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2021\/01\/11\/dissecting-the-apple-m1-gpu\/\">Dissecting the Apple M1 GPU<\/a><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Alyssa Rosenzweig (Hacker News): The buffer we&rsquo;re chasing, the &ldquo;tiled vertex buffer&rdquo;, can overflow. To cope, the GPU stops accepting new geometry, renders the existing geometry, and restarts rendering. Since partial renders hurt performance, Metal application developers need to know about them to optimize their applications. [&#8230;]When a partial render is possible, there are two [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"2022-05-24T19:04:35Z","apple_news_api_id":"40bd386f-b80e-426b-980a-e16d6d9c17a0","apple_news_api_modified_at":"2022-05-24T19:04:35Z","apple_news_api_revision":"AAAAAAAAAAD\/\/\/\/\/\/\/\/\/\/w==","apple_news_api_share_url":"https:\/\/apple.news\/AQL04b7gOQmuYCuFtbZwXoA","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[2014,131,30,2077,906,71],"class_list":["post-35971","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-apple-m1","tag-bug","tag-mac","tag-macos-12","tag-metal","tag-programming"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/35971","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=35971"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/35971\/revisions"}],"predecessor-version":[{"id":35972,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/35971\/revisions\/35972"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=35971"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=35971"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=35971"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}