Monday, March 21, 2022

Monterey’s Visual Look Up

Howard Oakley:

If you’ve updated to 12.3, it’s very easy to test. Bring up the image of a painting in one of the supported apps, currently including Safari, Photos and Preview. In Safari, Control-click on the image to produce the contextual menu.

At the foot, you should see (sometimes not immediately) the command Look Up.


A window then pops up over the circle (the white button that appears on the image) and displays information about the painting, together with a menu of suggested links at the bottom. The amount of detail given in these windows varies considerably.

Howard Oakley:

VLU probably depends on sending Apple’s servers a ‘Neural Hash’ generated from an image, those servers matching it against their database of known images, and returning information about the image for display on your Mac. It’s the last step which appears most important in determining which languages and countries are supported, as someone running their Mac in Japanese isn’t going to find information in English too helpful. Apple therefore needs to localise that returned information, which will take time to extend to more languages. I also suspect the service may have a limited capacity which is being ramped up to cope with more users worldwide.
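As a rough illustration of the hash-and-match lookup described above, here is a sketch using a simple “average hash” with Hamming-distance matching. This is the general perceptual-hashing technique, not Apple’s NeuralHash (which is a learned neural-network embedding); the function names and the toy database are assumptions for the sketch.

```python
# Illustrative perceptual-hash matching. NOT Apple's NeuralHash: this is
# a classic "average hash" over an 8x8 grayscale grid, shown only to make
# the hash-then-match-on-server idea concrete.

def average_hash(pixels):
    """pixels: 8x8 grid of grayscale values (0-255). Returns a 64-bit
    integer with one bit per pixel: 1 if that pixel is above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def match(query_hash, database, max_distance=10):
    """Server side: find the closest known image; reject weak matches."""
    name, h = min(database.items(), key=lambda kv: hamming(query_hash, kv[1]))
    return name if hamming(query_hash, h) <= max_distance else None
```

Because similar images produce nearby hashes, a slightly brightened copy of a painting still matches the database entry, while an unrelated image falls outside the distance threshold and returns nothing.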

Howard Oakley (Hacker News):

In its application to CSAM detection, a special protocol is used in conjunction with other techniques to ensure that Apple learns the NeuralHashes only for those images suspected of being CSAM. That appears unnecessary for Visual Look Up, where there shouldn’t be any need for secrecy, other than standard privacy protection.

You can trace VLU at work in the log.


It thus looks like Visual Look Up, for paintings at least, does use part of Apple’s technology intended for CSAM detection. While VLU is a wonderful feature, it looks more like a fortuitous accident and a demonstration of what might come elsewhere, not a goal in itself.


Update (2022-03-23): Howard Oakley:

When looking up images of paintings, VLU works in two phases: in the first, the image is analysed, classified, and any objects within it are detected, in what’s termed a VisionKit Analyzer process. That is reported as complete by the appearance of one or more white dots on the image. The second phase is visual search, in which the NeuralHash or perceptual hash(es) obtained in analysis are then sent to Apple’s servers, and the best-matching results are returned for display as the information about that image or object within it.
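The two phases above can be sketched as a minimal pipeline. Everything here is an illustrative assumption — the function names, the pluggable detector and hasher, and the server lookup are not Apple APIs.

```python
# Hypothetical sketch of the two-phase VLU flow Oakley describes.
# All names are illustrative assumptions, not Apple's implementation.

def phase1_analyze(image, detect, perceptual_hash):
    """Phase 1 (local, 'VisionKit Analyzer'): classify the image and
    detect objects, producing one perceptual hash per detected object."""
    return [(label, perceptual_hash(region)) for label, region in detect(image)]

def phase2_search(object_hashes, server_lookup):
    """Phase 2 (visual search): send each hash to the server and keep
    whatever best-matching information it returns."""
    results = {}
    for label, h in object_hashes:
        info = server_lookup(h)   # None when the server has no match
        if info is not None:
            results[label] = info
    return results
```

The split matters for privacy and latency: phase 1 is entirely on-device (hence the white dots appearing before any network traffic), and only the compact hashes, not the image itself, are sent to the server in phase 2.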

Update (2022-04-14): Howard Oakley:

What isn’t readily available elsewhere is identifying the breed of dog, species of flower, or well-known landmarks. This article examines how those are performed in Visual Look Up (VLU), and how they contrast with Live Text.

Howard Oakley:

The hardware in my iPhone XR, and my M1 Macs, is vastly superior to that in my iMac Pro in one respect: since the A12 in 2018, Apple’s own chips have incorporated a Neural Engine. This article considers what difference that makes, and how it affects our privacy.

Howard Oakley:

VLU’s call chain is through mediaanalysisd, which uses Espresso to manage the ANE. Espresso is also used on Intel Macs to manage neural networks run on their CPU cores.

Maximum power drawn by the ANE was 49 mW, which is low even compared with the power drawn by the E cores.

Howard Oakley:

Visual Look Up is one of the features of macOS which uses Machine Learning (ML), and should just work. However, many users have reported that it doesn’t appear to be available on their Mac, or even stranger, that it works on one system but not others. This article explains what it requires, and how you should be able to use it.
