Tuesday, May 24, 2022

The Apple GPU and the Impossible Bug

Alyssa Rosenzweig (Hacker News):

The buffer we’re chasing, the “tiled vertex buffer”, can overflow. To cope, the GPU stops accepting new geometry, renders the existing geometry, and restarts rendering. Since partial renders hurt performance, Metal application developers need to know about them to optimize their applications.
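As a rough illustration of the overflow behaviour described above, here is a toy Python model, not real driver or hardware code. The class `TilerSim`, its `capacity` units, and the per-draw sizes are all hypothetical; the only point is that an overflow forces the geometry queued so far to be rendered before more is accepted.

```python
from dataclasses import dataclass, field


@dataclass
class TilerSim:
    """Toy model of a geometry buffer with a fixed capacity (hypothetical units)."""
    capacity: int
    used: int = 0
    pending: list = field(default_factory=list)  # draws queued for the current pass
    passes: list = field(default_factory=list)   # one list of draws per render pass

    def submit(self, name: str, size: int) -> None:
        if size > self.capacity:
            raise ValueError("a single draw larger than the buffer can never fit")
        if self.used + size > self.capacity:
            # Overflow: render what has been queued so far (a partial render),
            # then keep accepting geometry into the emptied buffer.
            self.flush()
        self.pending.append(name)
        self.used += size

    def flush(self) -> None:
        """Render the queued geometry as one pass and empty the buffer."""
        if self.pending:
            self.passes.append(self.pending)
        self.pending = []
        self.used = 0

    def partial_renders(self) -> int:
        # Meaningful after the end-of-frame flush: every pass except the
        # last one was forced by an overflow.
        return max(len(self.passes) - 1, 0)


def render_frame(capacity: int, draws) -> TilerSim:
    sim = TilerSim(capacity)
    for name, size in draws:
        sim.submit(name, size)
    sim.flush()  # end of frame
    return sim


if __name__ == "__main__":
    sim = render_frame(10, [("a", 4), ("b", 4), ("c", 4), ("d", 4)])
    print(sim.passes, sim.partial_renders())
```

With a capacity of 10 and four draws of size 4, the frame splits into two passes, so one partial render occurs; with a large enough buffer the same draws fit in a single pass.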

[…]

When a partial render is possible, there are two “load” programs. One writes the clear colour or loads the framebuffer, depending on the application setting. We understand this one. The other always loads the framebuffer.

…Always loads the framebuffer, as in, for loading back with a partial render even if there is a clear at the start of the frame?
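One way to see why the second program must always load is a small sketch, again a toy model with hypothetical names (`LoadProgram`, `load_program_for_pass`, `run_frame`). It treats the framebuffer as a list of draws: only the first pass of a frame honours the application's clear-or-load setting, while every pass after a partial render reloads what was already drawn.

```python
from enum import Enum


class LoadProgram(Enum):
    CLEAR = "write the clear colour"
    LOAD = "load the framebuffer"


def load_program_for_pass(pass_index: int, app_wants_clear: bool) -> LoadProgram:
    """Pick the load program for a pass (toy rule inferred from the quoted text)."""
    if pass_index == 0:
        # Start of frame: follow the application setting.
        return LoadProgram.CLEAR if app_wants_clear else LoadProgram.LOAD
    # After a partial render: always load, or earlier passes would be lost.
    return LoadProgram.LOAD


def run_frame(passes, app_wants_clear: bool, previous_contents=()):
    """Return the framebuffer contents (as a list of draws) after all passes."""
    framebuffer = list(previous_contents)
    for index, draws in enumerate(passes):
        if load_program_for_pass(index, app_wants_clear) is LoadProgram.CLEAR:
            framebuffer = []
        framebuffer.extend(draws)
    return framebuffer


if __name__ == "__main__":
    print(run_frame([["a", "b"], ["c"]], app_wants_clear=True, previous_contents=["old"]))
```

If the second pass cleared instead of loading, the draws from the first pass would be wiped, which is why a clear at the start of the frame does not apply to the passes that follow a partial render.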

[…]

Doing so, Metal fails in a similar way. That means we’re at the root cause. Looking at our own driver code, we don’t specify any program for this partial render load. Up until now, that’s worked okay. If the parameter buffer is never overflowed, this program is unused. As soon as a partial render is required, however, failing to provide this program means the GPU dereferences a null pointer and faults. That explains our GPU faults at the beginning.
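The failure mode can be sketched the same way. This is a simulation under stated assumptions, not the actual driver: `RenderConfig`, `GpuFault`, `simulate`, and the program addresses are invented for illustration, and `None` stands in for the null pointer. A frame that fits in one pass never touches the partial render load program, so leaving it unset goes unnoticed until the first overflow.

```python
from dataclasses import dataclass
from typing import Optional


class GpuFault(Exception):
    """Stand-in for the GPU faulting on a null program pointer."""


@dataclass
class RenderConfig:
    load_program: int                           # hypothetical program address
    partial_load_program: Optional[int] = None  # None models the missing program


def simulate(config: RenderConfig, num_passes: int) -> list:
    """Return the load program used by each pass, or raise GpuFault."""
    used = []
    for index in range(num_passes):
        # The first pass uses the normal load program; passes that follow a
        # partial render use the partial render load program.
        program = config.load_program if index == 0 else config.partial_load_program
        if program is None:
            raise GpuFault(f"pass {index}: partial render load program is null")
        used.append(program)
    return used


if __name__ == "__main__":
    print(simulate(RenderConfig(0x1000), num_passes=1))  # no overflow: works
    try:
        simulate(RenderConfig(0x1000), num_passes=2)     # overflow: faults
    except GpuFault as fault:
        print("fault:", fault)
```

Supplying a value for `partial_load_program` makes the multi-pass case succeed, mirroring the fix implied by the quoted diagnosis.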


3 Comments

BTW, is it "normal" that I cannot view any of the .webp images in High Sierra? It fails with every app I found on my Mac for it (Preview, Acorn, CIFH).

Never mind - webp is apparently not popular with Apple, as explained here: https://apple.stackexchange.com/a/392466/17533
Chrome works, though.

It looks like there are issues, but this was important early context:

Since then, we’ve been reverse-engineering AGX and building open source graphics drivers. Last January, I rendered a triangle with my own code, but there has since been a heinous bug lurking:

So afaict the bug in question is, to date, in the driver, not the chip. (Though, again, the full article provides much better context and some possible chip shortcomings.)

That sound right?
