We have a sensor suite that we've used to integrate the systems, but you could do this for any system that combines a camera and a set of inertial sensors. That is to say, any smartphone or tablet in the world.
So, you've got the camera-inertial navigation assembly that's been calibrated, and you need to calibrate that to a display. The way we've done it historically is with a camera positioned behind the display, but what we're doing now is allowing the users to do it themselves. If you take the image from the camera and reproject it on the display, then your system is calibrated when it lines up with what you see in the real world.
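As a rough illustration of that user-driven calibration, here is a minimal Python sketch assuming a simple 2D offset-and-scale mapping from camera pixels to display pixels; the key names and step sizes are hypothetical, not the actual procedure:

```python
import numpy as np

def reproject(pixel_xy, offset, scale):
    """Map a camera pixel into display coordinates (simple 2D model)."""
    return scale * np.asarray(pixel_xy, dtype=float) + offset

def apply_nudge(offset, scale, key):
    """Adjust the calibration from user input (key names hypothetical)."""
    step = 1.0  # display pixels per key press
    if key == "left":
        offset[0] -= step
    elif key == "right":
        offset[0] += step
    elif key == "up":
        offset[1] -= step
    elif key == "down":
        offset[1] += step
    elif key == "zoom_in":
        scale *= 1.01
    elif key == "zoom_out":
        scale /= 1.01
    return offset, scale

# The user nudges the reprojected camera image until a landmark in it
# overlays the same landmark seen through the glass; the resulting
# offset and scale are stored as the camera-to-display calibration.
offset, scale = np.zeros(2), 1.0
for key in ["right", "right", "up", "zoom_in"]:
    offset, scale = apply_nudge(offset, scale, key)
print("camera pixel (320, 240) ->", reproject((320, 240), offset, scale))
```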
I know you're familiar with the Q-Warrior; that's our inertial sensor plus camera suite. When the sensors are fixed to the display, you'd only need to do that calibration once.
So, it seems one issue with these systems is the lag between people's real-world vision and what the software displays on top of that world. How have you tackled latency?
ROBERTS: One of the big things is that these displays have an inherent latency attached to them. Let's say you did have a sensor module attached to the display, and let's say you were wearing it and you turned your head. There is going to be a certain amount of time, on the order of 50 ms, between the sensors sensing that motion and the result being rendered on the display. The way we overcome that is basically to predict what the motion is going to be during that 50 ms. If we can forward-predict in time what the sensor signals would be, then we can render information on the display that will match the environment.
ALBERICO: Latency is something that is always there. You can't eliminate it, because the ground truth behind the glass, reality itself, is instantaneous. You need time to acquire measurements from the sensors, process them, and render them; 40-50 ms is typically what we see.
But as soon as the user starts moving, even a tiny movement, we go through and start predicting what will happen next based on that little motion. So, we send the renderer the position we predict 40-50 ms into the future, so that by the time it gets there, it's what it needs to be for the symbology to be properly aligned.
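A minimal Python sketch of that forward prediction, assuming a 45 ms latency budget (the interview cites 40-50 ms), a constant angular rate over the horizon, and w-first unit quaternions; the function names are illustrative, not Q-Warrior's actual pipeline:

```python
import numpy as np

def quat_mul(q, r):
    """Hamilton product of two (w, x, y, z) quaternions."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def quat_from_rotvec(rv):
    """Quaternion for a rotation vector (axis * angle, in radians)."""
    angle = np.linalg.norm(rv)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = rv / angle
    return np.concatenate(([np.cos(angle / 2)], axis * np.sin(angle / 2)))

def predict_pose(q_now, gyro_rad_s, horizon_s=0.045):
    """Forward-predict orientation assuming the angular rate holds
    constant over the sensor-to-photon latency; the renderer draws
    at this predicted pose instead of the measured one."""
    dq = quat_from_rotvec(np.asarray(gyro_rad_s, dtype=float) * horizon_s)
    return quat_mul(q_now, dq)

# Head turning ~60 deg/s about the vertical axis: render at the pose
# we expect 45 ms from now, so the symbology lands on the real world.
q_now = np.array([1.0, 0.0, 0.0, 0.0])
print(predict_pose(q_now, [0.0, 0.0, np.radians(60)]))
```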
ROBERTS: At the end of the day, all that means is that when I turn my head, the icon stays locked on the real world and I don't see that lag effect. That's one thing we've been able to do very well. A lot of the AR systems out there still have issues with it.
To do that prediction, do you put tons of measurements into a machine learning model and crank it out?
ALBERICO: We have actually started looking at what you can say about human motion that would allow you to make better predictions. That's under development, too. But right now what's working is an extrapolation of the motion. As soon as we detect a bit of motion, we extrapolate it into the future. If the user keeps up that motion for 40-50 ms, then the outcome is that the icon will be locked.
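A one-axis Python sketch of that extrapolation, with an assumed dead-band so sensor noise isn't treated as motion; the threshold and horizon values are illustrative:

```python
HORIZON_S = 0.045       # 40-50 ms render latency cited in the interview
MOTION_THRESH = 0.02    # rad/s dead-band; illustrative value

def yaw_for_renderer(yaw_rad, yaw_rate_rad_s):
    """Render at the measured yaw when static; once motion is detected,
    linearly extrapolate it over the latency horizon."""
    if abs(yaw_rate_rad_s) < MOTION_THRESH:
        return yaw_rad
    return yaw_rad + yaw_rate_rad_s * HORIZON_S

# If the head keeps turning at ~1 rad/s for the next 45 ms, the icon
# drawn at the extrapolated yaw lands where the world actually is.
print(yaw_for_renderer(0.10, 1.05))
```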
ROBERTS: We do it without just throwing more compute cycles at the problem. We're not adding computational power; we're using it more intelligently.