Deep Learning Improves Detection of Oropharyngeal Carcinoma using Multispectral Narrow-Band Endoscopic Imaging: Final Results from a Prospective Clinical Study

Presentation: A008
Topic: Cancer Engineering / Technology
Type: Poster
Authors: Chris Holsinger, Professor of Head and Neck Surgery; Nikhil V Raghuraman; Julia X Gong; Serena Yeung, Assistant Professor of Computer Science
Institution(s): Stanford University


Artificial intelligence (AI) is transforming clinical medicine. Computer vision, the field of AI focused on the automatic interpretation of visual data, is already being used in radiology, pathology, and dermatology, where standardized imaging methods and large datasets of annotated images are an essential element of the clinical workflow. Laryngopharyngeal endoscopy is likewise an essential element of tumor staging, yet the head and neck oncologist faces many visual tissue-discrimination tasks that remain challenging even with optical magnification, particularly in discerning oropharyngeal carcinoma (OPC). Deep learning may improve tumor staging and thus improve outcomes.


We hypothesize that clinically valuable and objective visual information can be discerned from multispectral imaging of human tissues. The purpose of this study was to demonstrate OPC discrimination by applying machine learning and computer vision algorithms to multispectral images of tissue acquired with Multispectral Narrow-Band Endoscopic Imaging.

Materials and Methods: Multispectral narrow-band imaging (msNBI) was used to examine the lymphoepithelial tissues of the oropharynx in a prospective study of 60 patients (30 with biopsy-proven OPC, 30 without OPC). From each high-definition video, recorded at 30 fps, an average of 8 standard images was extracted. An end-to-end pipeline for data ingestion, processing, training, and inference was created, and k-fold cross-validation (k = 5) was conducted on all steps of the training and inference pipeline for both the cropping and classification models. Experienced surgeons used a web-based annotation tool, along with computer tablets and styluses (Wacom Co., Ltd.; Saitama, Japan), to produce fine-grained manual segmentations. A classification model employing a ResNeXt50-32x4d convolutional neural network backbone was then used to predict tumor versus normal, with histologic ground truth from oropharyngeal biopsy.
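The 5-fold cross-validation described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the synthetic feature vectors, labels, and the logistic-regression stand-in for the ResNeXt50-32x4d network are all hypothetical, chosen only to show the fold structure.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Hypothetical stand-in data: ~8 images per patient across 60 patients,
# each image reduced to a 16-dimensional feature vector.
X = rng.normal(size=(480, 16))
y = rng.integers(0, 2, size=480)  # 1 = tumor, 0 = normal (synthetic labels)

# Stratified 5-fold split: each fold preserves the tumor/normal ratio.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_acc = []
for train_idx, test_idx in skf.split(X, y):
    # Stand-in classifier; the study used a ResNeXt50-32x4d CNN backbone.
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    fold_acc.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))

print(f"Mean 5-fold accuracy: {np.mean(fold_acc):.3f}")
```

In a patient-level study such as this one, splits would typically be made per patient rather than per image so that images from the same subject never appear in both the training and test folds.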

Results: When trained on both color and texture features, the neural network classified OPC under msNBI with 92% accuracy and 0.9497 area under the ROC curve (AUC), exceeding the performance of our prior model and a variety of other commonly used supervised machine learning algorithms (Naïve Bayes, support vector machine, etc.; accuracy ranging from 81 to 87%). Texture features were essential to the model's performance, as a neural network trained on color features alone detected OPC with only 63.3% accuracy.
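Accuracy and AUC as reported above are standard binary-classification metrics, and can be computed from per-image model scores as sketched below. The scores and labels here are synthetic, for illustration only; they do not reproduce the study's results.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(1)
# Synthetic per-image scores: the model outputs a higher score for tumor (label 1).
y_true = np.array([0] * 50 + [1] * 50)
y_score = np.concatenate([
    rng.normal(0.3, 0.15, 50),  # scores for normal tissue
    rng.normal(0.7, 0.15, 50),  # scores for tumor
]).clip(0.0, 1.0)

# Accuracy depends on a decision threshold (0.5 here);
# AUC summarizes ranking quality across all thresholds.
acc = accuracy_score(y_true, y_score >= 0.5)
auc = roc_auc_score(y_true, y_score)
print(f"accuracy={acc:.3f}, AUC={auc:.3f}")
```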

Conclusions and Relevance: Our study is the first to demonstrate the feasibility of deep learning to classify and objectively distinguish clinically significant differences between tumor and normal tissue under narrow-band imaging. We further hypothesize that tissues carry characteristic visual information, grounded in their molecular composition, that may aid physicians and surgeons in the clinical diagnosis and treatment of OPC.