Friday, June 11, 2021

Live Text

Tim Hardwick:

In iOS 15, Apple is introducing a new feature called Live Text that can recognize text when it appears in your camera’s viewfinder or in a photo you’ve taken and let you perform several actions with it.

For example, Live Text allows you to capture a phone number from a storefront with the option to place a call, or look up a location name in Maps to get directions. It also incorporates optical character recognition, so you can search for a picture of a handwritten note in your photos and save it as text.

This looks really cool. I find it hard to believe that Intel Macs aren’t fast enough to support it, though. And why doesn’t it work in Preview?


Update (2021-06-18): Kawaljit Singh Bedi:

macOS Monterey OCR works on captchas also.

Update (2021-07-27): Sami Fathi:

The latest beta update of macOS Monterey, released to developers today, has brought Live Text functionality to Intel-based Mac computers, removing the requirement for users to use an M1 Apple silicon Mac to utilize the feature, according to Rene Ritchie.

7 Comments

Philippe

As is pointed out in a footnote on the macOS 12 preview page, that feature requires Neural Engine processing, which only comes with M1 Macs.

The same goes for iPhone, per a note on the iOS 15 preview page:
Available on iPhone with A12 Bionic and later (https://www.apple.com/ios/ios-15-preview/#footnote-2).

@Philippe That’s interesting but not a full explanation. Apple’s machine learning APIs (and even the OCR one) do work on iPhones and Macs without neural engines. Is Live Text built on a totally different system?
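The OCR API in question is presumably the Vision framework’s VNRecognizeTextRequest, which already runs on Intel Macs without a Neural Engine. A minimal sketch (the function name, image loading, and error handling here are illustrative, not Apple’s Live Text implementation):

```swift
import Vision
import AppKit

// Sketch: run the existing Vision text-recognition request on an image file.
// This API works on Intel Macs; Live Text may be built on something else.
func recognizeText(in url: URL) {
    guard let image = NSImage(contentsOf: url),
          let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil) else { return }

    let request = VNRecognizeTextRequest { request, _ in
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
        for observation in observations {
            // Print the best candidate string for each detected text region.
            if let candidate = observation.topCandidates(1).first {
                print(candidate.string)
            }
        }
    }
    request.recognitionLevel = .accurate  // .fast uses a lighter model

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```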

Kevin Schumacher

My understanding is that there are similar instruction sets in Intel chips that Apple could target, but--and this is where the speculation is--why would they expend the effort and money to do so when they're already moving away from Intel?

5 years ago, I could launch the Google Translate app on my iPhone, take a photo of a sign or book, then swipe over lines of text in that photo to get them translated. Not sure, though, whether it did the text recognition locally or remotely; the translation itself was probably remote.

Jean-Daniel

@Kevin Schumacher: That's my thought too. It is the same for the FaceTime background blur. It is probably based on the image processing pipeline of the M1, and they could definitely write another version that would work on Intel machines, but is it worth doing?

Google has had similar functionality in the Lens/Translate/Camera apps over the years, so it can clearly be done on lower-powered devices.
As per normal, Apple have done a _much_ better job of packaging/presenting the feature.

Google did the text recognition by sending the image to their data centers. They didn't do it on device.
