Exploiting biomedical literature to mine out a large multimodal dataset of rare cancer studies

Dhrangadhariya, Anjani (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)) ; Jiménez del Toro, Oscar Alfonso (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)) ; Andrearczyk, Vincent (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)) ; Atzori, Manfredo (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)) ; Müller, Henning (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis) ; University of Geneva, Switzerland)

The overall lower survival rate of patients with rare cancers can be explained, among other factors, by the scarcity of available information about these diseases. Large biomedical data repositories, such as PubMed Central Open Access (PMC-OA), are freely available to the scientific community and could be exploited to advance the clinical assessment of rare cancers. A multimodal approach combining visual deep learning and natural language processing methods was developed to mine 15,028 light microscopy images of human rare cancers. The resulting data set is expected to foster novel clinical research in this field and to help researchers build resources for machine learning.


Conference Type:
published full paper
Faculty:
Economie et Services
School:
HEG-VS
Institute:
Institut Informatique de gestion
Subject(s):
Informatique
Publisher:
Houston, USA, 15-20 February 2020
Date:
2020-02
Pagination:
11 p.
Published in:
Proceedings of Medical Imaging 2020: Imaging Informatics for Healthcare, Research, and Applications
Series Statement:
Proceedings of SPIE, vol. 11318
ISSN:
1605-7422
ISBN:
9781510634039

Note: The status of this file is: restricted


 Record created 2020-11-17, last modified 2020-11-20
