Training deep neural networks for small and highly heterogeneous MRI datasets for cancer grading

Wodzinski, Marek (AGH University of Science and Technology) ; Banzato, Tommaso (University of Padova, Italy) ; Atzori, Manfredo (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)) ; Andrearczyk, Vincent (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)) ; Dicente Cid, Yashin (University of Warwick, UK) ; Müller, Henning (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis) ; University of Geneva, Switzerland)

Medical images recorded in clinical practice have the potential to be a game-changer in the application of machine learning to medical decision support. Thousands of medical images are produced in daily clinical activity, and the diagnoses that medical doctors make on these images are a source of knowledge for training machine learning algorithms for scientific research or computer-aided diagnosis. However, the need for manual data annotation and the heterogeneity of images and annotations make it difficult to develop algorithms that are effective on images from different centers or sources (scanner manufacturers, protocols, etc.). The objective of this article is to explore the opportunities and the limits of highly heterogeneous biomedical data, since many medical data sets are small and pose a challenge for machine learning techniques. In particular, we focus on a small data set targeting meningioma grading. Meningioma grading is crucial for patient treatment and prognosis. It is normally performed by histological examination, but recent articles have shown that it can also be performed non-invasively on magnetic resonance imaging (MRI). Our data set consists of 174 T1-weighted MRI scans of patients with meningioma, divided into 126 benign and 48 atypical/anaplastic cases, acquired using 26 different MRI scanners and 125 acquisition protocols, which shows the enormous variability in the data set. The preprocessing steps include tumor segmentation, spatial image normalization and data augmentation based on color and affine transformations. The preprocessed cases are passed to a carefully trained 2-D convolutional neural network. An accuracy above 74% was obtained, with a high-grade tumor recall above 74%. The results are encouraging considering the limited size and high heterogeneity of the data set. The proposed methodology can be useful for other problems involving classification of small and highly heterogeneous data sets.
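
The abstract reports accuracy and high-grade recall side by side. With 126 benign and 48 atypical/anaplastic cases the classes are imbalanced, so accuracy alone can be misleading. The following minimal Python sketch is not taken from the paper: it only uses the class counts stated above, and the helper `accuracy_and_recall` and the always-benign baseline are hypothetical illustrations. It shows why recall on the high-grade class is worth reporting separately.

```python
# Class counts are taken from the abstract; everything else is illustrative.
N_BENIGN = 126
N_HIGH_GRADE = 48  # atypical/anaplastic cases


def accuracy_and_recall(true_labels, predicted_labels, positive=1):
    """Return (overall accuracy, recall of the positive class)."""
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(1 for t, p in pairs if t == p)
    positives = sum(1 for t, _ in pairs if t == positive)
    true_positives = sum(1 for t, p in pairs if t == positive and p == positive)
    accuracy = correct / len(pairs)
    recall = true_positives / positives if positives else 0.0
    return accuracy, recall


# Hypothetical baseline: a classifier that always predicts "benign" (label 0).
true_labels = [0] * N_BENIGN + [1] * N_HIGH_GRADE
baseline_predictions = [0] * len(true_labels)
baseline_accuracy, baseline_recall = accuracy_and_recall(
    true_labels, baseline_predictions
)
print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"baseline high-grade recall: {baseline_recall:.3f}")
```

Under these assumptions, the always-benign baseline reaches an accuracy of 126/174 (about 72%) while detecting no high-grade tumor at all, which is why an accuracy figure on this data set is only informative when read together with the high-grade recall.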

Note: Due to the COVID-19 outbreak, the 42nd Annual International Conferences of the IEEE Engineering in Medicine and Biology Society venue in Montréal was cancelled. The proceedings of the online conference are however published according to the original schedule.

Conference Type: short paper
Economie et Services
Institut Informatique de gestion
Montréal, Canada, 20-24 July 2020
4 p.
Published in: Proceedings of the 42nd Annual International Conferences of the IEEE Engineering in Medicine and Biology Society
Note: The status of this file is: restricted

 Record created 2020-12-01, last modified 2020-12-04
