Imaging Science Ph.D. Defense: Rakshit Kothari


Ph.D. Dissertation Defense
Towards Robust Gaze Estimation and Classification in Naturalistic Conditions

Rakshit Kothari
Imaging Science Ph.D. Candidate
Chester F. Carlson Center for Imaging Science, RIT

Register Here for Zoom Link

Abstract:
Eye movements help us identify when and where we fixate. The location under fixation is a valuable source of information for decoding a person's intent or as an input modality for human-computer interaction. However, it is difficult to maintain fixation under motion unless our eyes compensate for body movement. Humans have evolved compensatory mechanisms via the vestibulo-ocular reflex pathway, which ensures stable fixation under motion. The interaction between the vestibular and ocular systems has primarily been studied in controlled environments, with comparatively few studies during natural tasks that involve coordinated head and eye movements under unrestrained body motion. To address these issues, we developed algorithms for gaze event classification and collected the Gaze-in-Wild (GW) dataset. However, reliable inference of human behavior depends heavily on the quality of gaze data extracted from eyetrackers. State-of-the-art gaze estimation algorithms can easily be affected by occluded eye features, askew eye camera orientation, and reflective artifacts from the environment. To impart robustness to reflective artifacts, my efforts helped develop RITNet, a neural network which segments eye images into semantic parts such as the pupil, iris, and sclera. Well-chosen data augmentation techniques and objective functions combat reflective artifacts and helped RITNet achieve first place in the OpenEDS'19 challenge. To induce robustness to occlusions, my efforts resulted in a novel eye image segmentation protocol, EllSeg. EllSeg demonstrates state-of-the-art pupil and iris detection despite the presence of reflective artifacts and occlusions. Neural networks are prone to overfitting and do not generalize well beyond the data they were trained on. To explore the generalization capacity of EllSeg, we acquired eye images from multiple datasets and developed EllSeg-Gen, a domain generalization framework for segmenting eye imagery.
We find that jointly training a network with multiple datasets improves generalization for eye images acquired outdoors. In contrast, selection of an ideal model from a pool of specialized, dataset-specific models is better suited for indoor eyetracking.
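For readers unfamiliar with the geometric side of pupil detection: eye-feature estimators like those described above ultimately recover a closed curve (an ellipse, or in the simplest case a circle) describing the pupil boundary. As a minimal illustrative sketch only, and not the dissertation's method (EllSeg uses a neural network to predict elliptical parameters directly), the classical alternative is a least-squares fit to candidate boundary points; the `fit_circle` helper below is a hypothetical name:

```python
import numpy as np

def fit_circle(x, y):
    """Algebraic least-squares circle fit.

    Minimizes the residual of x^2 + y^2 = 2*a*x + 2*b*y + c,
    which is linear in (a, b, c); the center is (a, b) and the
    radius is sqrt(c + a^2 + b^2).
    """
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    rhs = x**2 + y**2
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    radius = np.sqrt(c + a**2 + b**2)
    return (a, b), radius

# Synthetic "pupil boundary" points on a circle of radius 5 centered at (3, 4)
theta = np.linspace(0, 2 * np.pi, 50, endpoint=False)
center, radius = fit_circle(3 + 5 * np.cos(theta), 4 + 5 * np.sin(theta))
```

Fits like this degrade quickly under occlusion (e.g. eyelids hiding part of the boundary) and reflective artifacts, which is precisely the failure mode the learned segmentation approaches above are designed to withstand.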

Intended Audience:
Undergraduates, graduate students, experts, and anyone with an interest in the topic.
To request an interpreter, please visit https://myaccess.rit.edu


Contact
Beth Lockwood
Event Snapshot
When and Where
August 23, 2021
11:00 am - 12:00 pm
Room/Location: See Zoom Registration Link
Who
Open to the Public

Interpreter Requested?
No

Topics
imaging science
research