Imaging Science Thesis Defense: Toward Vision Intelligence-Based Liver Surgery

Imaging Science Ph.D. Defense
Toward Vision Intelligence-Based Liver Surgery

Zixin Yang
Imaging Science Ph.D. Candidate
Rochester Institute of Technology

Register for Zoom here

Abstract:

Despite advancements in surgical interventions, modern procedures still rely heavily on surgeons' expertise, demanding extensive training while achieving limited accuracy. Although image guidance systems have been developed, "GPS-like" surgical navigation systems have yet to become standard practice due to their high costs and accuracy limitations. This thesis aims to enhance surgical navigation and mitigate some of its current limitations by leveraging vision intelligence: integrating image processing, modeling, and computer vision to extract rich, underlying information from images. The key to achieving these improvements lies in enhancing surgical perception, i.e., understanding the spatial relationships among the endoscopic camera, surgical instruments, and surgical targets. This research contributes to four fundamental vision tasks for surgical navigation: depth perception, endoscope tracking, surgical instrument identification and segmentation, and registration of pre- and intraoperative data.
Traditional depth perception methods struggle with featureless tissue surfaces, while learning-based approaches often require ground-truth depth, which is difficult to obtain. Learning-based methods can also lack robustness in cross-domain applications, and endoscopic images with calibrated camera parameters are rare. To address these challenges, we introduce an unsupervised optical flow-based depth estimation method for stereo endoscopes that eliminates the need for ground-truth depth and camera calibration during training. Furthermore, we propose a disparity framework that incorporates physical constraints and learning-based priors to improve the accuracy of depth estimation in cross-domain settings.
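To make the idea concrete, the sketch below illustrates the core principle behind unsupervised stereo depth training: a predicted disparity map is judged by how well it warps the right image into the left view, so no ground-truth depth is required. This is a minimal PyTorch illustration of photometric consistency under assumed tensor shapes, not the thesis's actual architecture; the function names and the plain L1 loss are illustrative (practical systems typically add SSIM and smoothness terms).

```python
# Minimal sketch of a photometric-consistency loss for unsupervised
# stereo depth learning. `disp` (disparity in pixels, assumed to come
# from a network) warps the right image into the left view; where the
# disparity is correct, the warped image reconstructs the left image.
# Illustrative only, not the thesis's implementation.
import torch
import torch.nn.functional as F

def warp_right_to_left(right, disp):
    """right: (B, C, H, W) image; disp: (B, 1, H, W) disparity in pixels."""
    b, _, h, w = right.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=right.device),
        torch.linspace(-1, 1, w, device=right.device),
        indexing="ij",
    )
    # Shift normalized x-coordinates left by the predicted disparity.
    xs = xs.unsqueeze(0) - 2.0 * disp.squeeze(1) / (w - 1)
    grid = torch.stack([xs, ys.unsqueeze(0).expand_as(xs)], dim=-1)
    return F.grid_sample(right, grid, align_corners=True)

def photometric_loss(left, right, disp):
    """L1 reconstruction error; real systems usually add SSIM and
    disparity-smoothness terms."""
    return (left - warp_right_to_left(right, disp)).abs().mean()
```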
Endoscopic imaging has a limited field of view, making it difficult to track the camera pose and accurately align intraoperative point clouds captured from different perspectives. While hardware-based tracking systems offer potential solutions, they require additional instrumentation and increase invasiveness. To overcome these limitations, this work proposes a hybrid framework that combines learning-based dense depth estimation with visual odometry, enabling precise endoscope tracking and surgical scene reconstruction.
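As a rough illustration of how dense depth and visual odometry can combine for camera tracking, the sketch below back-projects matched keypoints from the previous frame to 3D using that frame's (e.g., network-predicted) depth map, then solves a RANSAC PnP problem for the relative camera pose. This is a generic depth-plus-odometry sketch in OpenCV under assumed inputs, not the thesis's specific pipeline.

```python
# Minimal sketch of frame-to-frame endoscope pose estimation combining
# a dense depth map with classical visual odometry via PnP.
# Illustrative only; all names are ours, not the thesis's.
import cv2
import numpy as np

def estimate_relative_pose(img_prev, img_curr, depth_prev, K):
    """Return (R, t) of the current frame w.r.t. the previous frame.
    depth_prev: (H, W) metric depth for img_prev; K: 3x3 intrinsics."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    pts3d, pts2d = [], []
    for m in matches:
        u, v = kp1[m.queryIdx].pt
        z = depth_prev[int(v), int(u)]
        if z <= 0:  # skip pixels with invalid depth
            continue
        # Back-project the previous-frame keypoint to 3D.
        pts3d.append([(u - cx) * z / fx, (v - cy) * z / fy, z])
        pts2d.append(kp2[m.trainIdx].pt)

    # Robustly solve for camera motion; RANSAC rejects outlier matches.
    ok, rvec, tvec, _ = cv2.solvePnPRansac(
        np.float32(pts3d), np.float32(pts2d), K, None)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec
```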
State-of-the-art methods for surgical instrument identification and segmentation depend on supervised learning with dense pixel-level annotations, which are labor-intensive to obtain. To reduce this annotation dependency, this thesis explores a scribble-based, weakly supervised approach that serves as a precursor to more efficient and scalable surgical instrument segmentation.
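The central mechanism of scribble supervision can be summarized in a few lines: the segmentation loss is evaluated only on the sparse scribbled pixels and ignores everything unlabeled. Below is a minimal sketch of such a partial cross-entropy loss; the `IGNORE` convention and tensor shapes are assumptions, and published scribble-supervised methods typically add regularization (e.g., consistency or CRF terms) over the unlabeled pixels.

```python
# Minimal sketch of partial cross-entropy for scribble supervision:
# only scribbled pixels contribute to the loss. Illustrative only.
import torch.nn.functional as F

IGNORE = 255  # assumed label value for unannotated pixels

def partial_cross_entropy(logits, scribbles):
    """logits: (B, C, H, W); scribbles: (B, H, W) long tensor with
    class ids on scribbled pixels and IGNORE everywhere else."""
    return F.cross_entropy(logits, scribbles, ignore_index=IGNORE)
```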
Lastly, existing image-guided surgery systems rely on manually performed rigid registration, which is both error-prone and time-consuming. To improve registration accuracy and efficiency, this work investigates learning-based feature descriptors for automatic rigid registration. Beyond rigid registration, non-rigid registration is critical for correcting tissue deformations and ensuring accurate mapping of preoperative structures, such as tumors and vessels, onto the intraoperative scene. To this end, we propose a biomechanical-model-based non-rigid registration method that offers a simplified formulation, straightforward hyperparameter selection, and improved accuracy, all without requiring manual interaction.
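For intuition on the rigid-registration stage: once learned descriptors yield point correspondences between the preoperative model and the intraoperative surface, the rigid transform has a closed-form solution via SVD (the Kabsch algorithm). The sketch below shows only that alignment step under the assumption of given, already-matched correspondences; a full system, including the thesis's, must also handle outlier matches (e.g., with RANSAC).

```python
# Minimal sketch of closed-form rigid alignment (Kabsch/SVD) given
# matched 3D point pairs. Illustrative of the rigid stage only.
import numpy as np

def rigid_align(src, dst):
    """Find R, t minimizing ||R @ src_i + t - dst_i|| over matched
    points. src, dst: (N, 3) arrays of corresponding points."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t
```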
This research advances the integration of vision intelligence into surgical navigation and addresses key challenges in surgical perception, paving the way for more accurate, cost-effective, and accessible image-guided surgery systems.

Intended Audience:
All are Welcome!

To request an interpreter, please visit myaccess.rit.edu


Contact
Lori Hyde
Event Snapshot
When and Where
May 12, 2025
9:00 am - 11:30 am
Room/Location: via Zoom
Who

This is an RIT Only Event

Interpreter Requested?

No

Topics
research