Robust Video Frame Classification in Bronchoscopy

Open Access
McTaggart, Matthew Ira
Graduate Program:
Electrical Engineering
Master of Science
Document Type:
Master's Thesis
Date of Defense:
July 18, 2018
Committee Members:
  • William E. Higgins, Thesis Advisor
  • William K. Jenkins, Committee Member
Keywords:
  • bronchoscopy
  • endobronchial video summarization
  • lung cancer
  • image-guided procedures
  • endoscopic procedures
  • machine learning
  • artificial intelligence
  • surgical application
  • interventional application
During bronchoscopy, a physician uses the endobronchial video to help navigate and observe the inner airways of a patient's lungs for lung cancer assessment. After the procedure is completed, the video typically contains a significant number of uninformative segments. Video frames become uninformative when the video is too dark, too blurry, or indistinguishable due to a build-up of mucus, blood, or water within the airways. In this thesis, we developed a robust and automatic method to classify each frame in an endobronchial video sequence as informative or uninformative. In addition, we assessed the efficacy and significance of quantitative measures for labeling uninformative frames. Finally, we implemented a method to generate an optional specular-reflection-masked video sequence to indicate regions of the video subject to specular reflection.

To develop our frame-classification system, we considered two approaches. Our first approach, referred to as the Classifier Approach, focused on using image-processing techniques, while our second approach, the Deep-Learning Approach, utilized deep learning for video frame classification. Each approach had four steps and processed the video frame by frame. The first step was to preprocess a video frame. Next, a feature vector was extracted. The video frame was then classified from the feature vector, using a support vector machine for the Classifier Approach and a softmax classifier for the Deep-Learning Approach. Finally, the optional specular-reflection mask was computed.

Using the Classifier Approach, we achieved an accuracy of 78.8%, a sensitivity of 93.9%, and a specificity of 62.8%. The Deep-Learning Approach gave improved performance, with an accuracy of 87.3%, a sensitivity of 87.1%, and a specificity of 87.6%.
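The four-step, frame-by-frame pipeline described above can be sketched in miniature. This is an illustrative outline only: the feature choices (mean brightness and a Laplacian-variance blur measure), the linear decision rule standing in for the trained SVM/softmax classifier, and the saturation threshold for the specular mask are all hypothetical stand-ins, not the thesis's actual implementation.

```python
import numpy as np

def preprocess(frame):
    # Step 1: convert an RGB frame (values 0-255) to normalized grayscale.
    # (Hypothetical preprocessing; the thesis's exact steps may differ.)
    gray = frame.mean(axis=2) if frame.ndim == 3 else frame
    return gray / 255.0

def extract_features(gray):
    # Step 2: build a small feature vector: mean brightness and the
    # variance of a finite-difference Laplacian (a common blur cue).
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return np.array([gray.mean(), lap.var()])

def classify(features, w=np.array([5.0, 50.0]), b=-1.0):
    # Step 3: linear decision rule standing in for the trained
    # SVM / softmax classifier; the weights here are illustrative only.
    return "informative" if features @ w + b > 0 else "uninformative"

def specular_mask(gray, thresh=0.95):
    # Step 4 (optional): flag near-saturated pixels as candidate
    # specular-reflection regions.
    return gray > thresh
```

For example, an all-black frame yields near-zero features and is labeled uninformative, while a bright, textured frame passes the decision rule.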
We concluded that edge-based image-processing techniques are not sufficient to discriminate between informative and uninformative video frames in bronchoscopy.
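The reported accuracy, sensitivity, and specificity follow the standard confusion-matrix definitions, with informative frames as the positive class and uninformative frames as the negative class. A quick sketch with illustrative counts (not the thesis's data):

```python
def metrics(tp, fn, tn, fp):
    # sensitivity = TP/(TP+FN): fraction of informative frames retained;
    # specificity = TN/(TN+FP): fraction of uninformative frames rejected.
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Illustrative counts: 90 true positives, 10 false negatives,
# 80 true negatives, 20 false positives.
acc, sens, spec = metrics(90, 10, 80, 20)  # → (0.85, 0.9, 0.8)
```

Note how a classifier can score high sensitivity yet low specificity, as with the Classifier Approach's 93.9% versus 62.8%: it keeps nearly all informative frames but lets many uninformative ones through.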