Methods for Bronchoscopic Video Analysis and Synchronization

Restricted (Penn State Only)
- Author:
- Chang, Qi
- Graduate Program:
- Electrical Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- December 19, 2024
- Committee Members:
- Madhavan Swaminathan, Program Head/Chair
- Robert Collins, Major Field Member
- William Higgins, Chair & Dissertation Advisor
- Necdet Aybat, Outside Unit & Field Member
- Vishal Monga, Major Field Member
- Keywords:
- Lung Cancer
- Bronchoscopy
- Early Detection
- Autofluorescence Bronchoscopy
- Bronchial Lesion
- Lesion Detection and Segmentation
- Multimodal Video Synchronization
- Image-Guided Intervention
- Abstract:
- Early detection of lung cancer is crucial, as it significantly improves survival rates by facilitating timely and effective treatment. Lung cancer often begins as bronchial lesions developing along the airway wall. Bronchoscopy is a minimally invasive and effective method for detecting such lesions. Currently, three complementary bronchoscopic video modalities are used for this purpose: white-light bronchoscopy (WLB), narrow-band imaging (NBI), and autofluorescence bronchoscopy (AFB). Among these, AFB is the primary modality used in this dissertation because of its high sensitivity in detecting bronchial lesions. However, manual inspection of AFB video is extremely tedious and error-prone, as no effective tools exist for identifying lesions within the large volume of video frames. Furthermore, combining multiple bronchoscopic modalities significantly enhances both sensitivity and specificity compared to using a single modality, yet no systems or inspection tools currently provide synchronized views across these modalities.

To address these problems, in this dissertation we first propose an interactive AFB video analysis subsystem for 1) real-time lesion segmentation and detection in AFB videos to report the lesions found and 2) lesion frame assessment that enables interactive selection of a representative frame for each lesion region. Subsequently, we propose a multimodal video synchronization subsystem that enables synchronized inspection of endoluminal surfaces across the three modalities. The interactive AFB video analysis subsystem employs a two-phase approach. In the first phase, a real-time lesion segmentation network, named ESFPNet, is developed; given a patient's AFB video, ESFPNet segments the lesions in each video frame. In the second phase, the segmented lesion regions are tracked and associated across the video sequence, and a keyframe measure is then computed to enable interactive selection of representative frames that capture the best view of each lesion. These selected frames are subsequently used for synchronized inspection of the lesion regions across the three video modalities via the multimodal video synchronization subsystem.

The multimodal video synchronization subsystem includes a multimodal registration and synchronization pipeline that aligns WLB, NBI, and AFB video frames with a reference CT-derived chest model. Within this pipeline, we present a landmark-enhanced video registration workflow for efficiently registering each video source to the chest model. Once the pipeline executes this workflow for each video modality, a synchronized data structure is generated, enabling synchronized visualization of the airway surface across modalities.

The performance of these subsystems was evaluated through a series of human studies. The interactive AFB video analysis subsystem was tested on AFB airway exam videos from 14 lung cancer patients, successfully identifying 37 lesions and their corresponding representative frames. The multimodal video synchronization subsystem was tested on airway examination videos spanning all three modalities (WLB, NBI, and AFB) from 5 lung cancer patients and successfully provided synchronized views of the airway surface, particularly at lesion sites.
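To make the two-phase AFB analysis concrete, the minimal Python sketch below shows one possible form of the pipeline: a per-frame segmenter (an ESFPNet-style model would fill this role) followed by overlap-based track association and a simple keyframe score. The function names, the IoU-based association rule, and the area-based "best view" score are illustrative assumptions, not the dissertation's actual implementation, which also keeps the final frame choice interactive.

```python
from dataclasses import dataclass, field
import numpy as np


@dataclass
class LesionTrack:
    """A lesion region associated across consecutive AFB frames."""
    frame_ids: list = field(default_factory=list)  # indices of frames containing the lesion
    masks: list = field(default_factory=list)      # per-frame boolean masks
    scores: list = field(default_factory=list)     # per-frame keyframe scores


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union between two boolean masks."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0


def analyze_afb_video(frames, segment_frame, iou_thresh=0.3):
    """Phase 1: segment lesions in each frame (e.g., with an ESFPNet-style model).
    Phase 2: associate segmented regions across frames and score candidate keyframes.
    Returns one candidate representative frame index per lesion track."""
    tracks: list[LesionTrack] = []
    for t, frame in enumerate(frames):
        for mask in segment_frame(frame):          # Phase 1: lesion masks for frame t
            # Phase 2a: greedily associate the mask with the best-overlapping track.
            best = max(tracks, key=lambda tr: mask_iou(tr.masks[-1], mask), default=None)
            if best is None or mask_iou(best.masks[-1], mask) < iou_thresh:
                best = LesionTrack()
                tracks.append(best)
            best.frame_ids.append(t)
            best.masks.append(mask)
            # Phase 2b: a simple keyframe measure -- lesion area as a proxy for
            # "best view"; the dissertation's actual criterion may differ.
            best.scores.append(int(mask.sum()))
    return {i: tr.frame_ids[int(np.argmax(tr.scores))] for i, tr in enumerate(tracks)}
```

Similarly, the synchronized data structure produced by the multimodal pipeline can be pictured, at its simplest, as a record keyed by an airway-surface site on the CT-derived chest model with one registered frame per modality; this is again an assumed sketch of the form such a structure might take, not the dissertation's actual design.

```python
@dataclass
class SyncRecord:
    """Hypothetical synchronized entry: one airway-surface site on the CT-derived
    chest model linked to the registered frame in each video modality."""
    airway_site: str                                    # e.g., an airway-branch or surface-patch label
    frame_index: dict = field(default_factory=dict)     # e.g., {"WLB": i, "NBI": j, "AFB": k}
```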