Why Is ARKit Better Than the Alternatives?
Matt Miesnieks (via iOS Dev Weekly):
To get 3D you need to have 2 views of a scene from different places, in order to do a stereoscopic calculation of your position. This is how our eyes see in 3D, and why some trackers rely on stereo cameras. It’s easy to calculate if you have 2 cameras as you know the distance between them, and the frames are captured at the same time. With one camera, you capture one frame, then move, then capture the second frame. Using IMU Dead Reckoning you can calculate the distance moved between the two frames and then do a stereo calculation as normal (in practice you might do the calculation from more than 2 frames to get even more accuracy). If the IMU is accurate enough this “movement” between the 2 frames is detected just by the tiny muscle motions you make trying to hold your hand still! So it looks like magic.
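In the simplest rectified case, the "stereo calculation" he mentions reduces to triangulation from disparity: depth = focal length × baseline ÷ disparity, where the baseline is the distance the phone moved between the two frames. A minimal sketch of that idea, not ARKit's actual code; the focal length, baseline, and pixel coordinates are made-up numbers for illustration:

```swift
import Foundation

/// Depth from a rectified stereo pair: two views of the same point,
/// separated by a known baseline (here, the distance the phone moved).
/// depth = focalLength * baseline / disparity
func depth(focalLengthPixels f: Double,
           baselineMeters b: Double,
           xLeftPixels: Double,
           xRightPixels: Double) -> Double? {
    let disparity = xLeftPixels - xRightPixels  // how far the feature shifted between frames
    guard disparity > 0 else { return nil }     // no parallax means no depth estimate
    return f * b / disparity
}

// Illustrative numbers only: a 5 mm hand tremor as the baseline and a
// feature that shifts 4 pixels between the two frames.
if let z = depth(focalLengthPixels: 1500, baselineMeters: 0.005,
                 xLeftPixels: 640, xRightPixels: 636) {
    print("Estimated depth: \(z) m")  // ≈ 1.875 m
}
```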
To get metric scale, the system also relies on accurate Dead Reckoning from the IMU. From the acceleration and time measurements the IMU gives, you can integrate to calculate velocity and integrate again to get distance traveled between IMU frames. The maths isn’t hard. What’s hard is removing errors from the IMU to get a near perfect acceleration measurement. A tiny error, which accumulates 1,000 times a second for the few seconds that it takes you to move the phone, can mean metric scale errors of 30% or more. The fact that Apple has got this down to single-digit % error is impressive.
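The double integration he describes, and why a tiny bias blows up, can be sketched in a few lines. This is a toy model, not ARKit's filter: the bias, motion, and sample rate are invented numbers, and a real VIO system fuses the camera observations to correct this drift.

```swift
import Foundation

/// Toy dead reckoning: integrate acceleration to velocity, then velocity to
/// position, at an IMU-like rate. A constant bias in the acceleration grows
/// quadratically in position: error ≈ 0.5 * bias * t².
func deadReckonedDistance(trueAcceleration: Double,
                          biasError: Double,
                          seconds: Double,
                          sampleRateHz: Double = 1000) -> (estimated: Double, truth: Double) {
    let dt = 1.0 / sampleRateHz
    var velocity = 0.0, position = 0.0
    var trueVelocity = 0.0, truePosition = 0.0
    for _ in 0..<Int(seconds * sampleRateHz) {
        // Measured acceleration = true acceleration + bias (simplified error model).
        velocity += (trueAcceleration + biasError) * dt
        position += velocity * dt
        trueVelocity += trueAcceleration * dt
        truePosition += trueVelocity * dt
    }
    return (position, truePosition)
}

// Illustrative numbers: 0.5 m/s² of real motion, a 0.05 m/s² bias, 2 seconds.
let (estimated, truth) = deadReckonedDistance(trueAcceleration: 0.5,
                                              biasError: 0.05, seconds: 2)
let scaleError = (estimated - truth) / truth * 100
print(String(format: "Truth: %.3f m, estimated: %.3f m, scale error: %.0f%%",
             truth, estimated, scaleError))
// A 10% bias in acceleration becomes roughly a 10% metric scale error here;
// uncorrected random-walk errors in a real IMU grow even faster.
```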
[…]
Google also could easily have shipped Tango’s VIO system in a mass-market Android phone over 12 months ago, but chose not to. If they had done this, ARKit would have looked like a catch-up instead of a breakthrough. I believe (without hard confirmation) that this was because they didn’t want to have to go through a unique sensor calibration process for each OEM, where each OEM’s version of Tango worked less well than others, and Google didn’t want to just favor the handful of huge OEMs (Samsung, Huawei, etc.) where the device volumes would make the work worthwhile. Instead they pretty much told the OEMs “this is the reference design for the hardware, take it or leave it”.
[…]
So ultimately the reason ARKit is better is because Apple could afford to do the work to tightly couple the VIO algorithms to the sensors and spend *a lot* of time calibrating them to eliminate errors / uncertainty in the pose calculations.