Neural Networks for Robust Eye Tracking in Real World and Virtual Environments
Contemporary algorithms for video-based eye tracking involve placing near-infrared cameras close to the eye. Computer vision techniques or trained neural networks then track the movement of features in the eye images, such as the pupil/iris boundary or the pupil center. These tracked features are fed into a model for the estimation of gaze direction. We have pioneered, and continue to develop, algorithms for the detection of features in eye tracker imagery; the most successful of these, RITnet2, recently won the OpenEDS challenge organized by Facebook. To overcome the time-costly and laborious process of hand-labelling a training set of eye imagery, the supervised networks are trained on synthetic imagery with pixel-level semantic ground truth, rendered from reconstructions of previously recorded gaze behavior. The rendered imagery approximates the camera positioning and properties used in contemporary eye trackers, and the human avatars used during rendering are representative of individual differences, including those spanning gender, age, and skin tone. These techniques promise a future in which video-based mobile eye trackers work reliably, despite environmental degradations to eye imagery and despite individual differences in appearance.
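To make the feature-extraction step concrete, the sketch below shows one common way a tracked feature (the pupil center) can be derived from a per-pixel semantic segmentation of the eye image: take the centroid of all pixels labelled as pupil. The class index and function name are illustrative assumptions, not part of the RITnet2 implementation.

```python
import numpy as np

def pupil_center(seg_mask, pupil_class=3):
    """Estimate the pupil center as the centroid of pupil-labelled pixels.

    seg_mask: 2-D integer array of per-pixel class labels.
    pupil_class: class index assumed (hypothetically) to mark pupil pixels.
    Returns (x, y) in pixel coordinates, or None if no pupil pixels exist.
    """
    ys, xs = np.nonzero(seg_mask == pupil_class)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

# Toy 5x5 segmentation mask with a 2x2 block of pupil pixels.
mask = np.zeros((5, 5), dtype=int)
mask[1:3, 2:4] = 3
print(pupil_center(mask))  # -> (2.5, 1.5)
```

In a full pipeline, this per-frame (x, y) estimate would then be passed to a gaze model that maps image-space features to a gaze direction.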
Reynold Bailey, Gabriel J. Diaz, Jeff Pelz