Machine learning for automatic encoding of French electronic medical records : is more data better?

Gobeill, Julien (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale ; SIB Text Mining group, Swiss Institute of Bioinformatics, Geneva, Switzerland) ; Ruch, Patrick (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale ; SIB Text Mining group, Swiss Institute of Bioinformatics, Geneva, Switzerland) ; Meyer, Rodolphe (University Hospitals of Geneva (HUG), Geneva, Switzerland)

The encoding of Electronic Medical Records is a complex and time-consuming task. We report on a machine learning model that proposes diagnosis and procedure codes, trained on a large, realistic dataset of 245,000 electronic medical records from the University Hospitals of Geneva. Our study focuses on the impact of training data quantity on the model's performance. We show that performance does not improve when encoded instances from previous years are added to the training data. Furthermore, supervised models are shown to be highly perishable: we observe a potential drop in performance of around 10% per year. Consequently, great and constant care must be exercised in designing and updating the knowledge bases exploited by machine learning.


Faculty:
Economie et Services
School:
HEG - Genève
Institute:
CRAG - Centre de Recherche Appliquée en Gestion
Subject(s):
Sciences de l'information
Publisher:
Amsterdam, The Netherlands, IOS Press
Date:
2020-01
Pagination:
Pp. 312-316
Published in:
Digital Personalized Health and Medicine
Series Statement:
Studies in Health Technology and Informatics, vol. 270
Author of the book:
Pape-Haugaard, Louise B. ; Aalborg University, Denmark
ISSN:
0926-9630
ISBN:
978-1-64368-082-8
 Record created 2020-07-20, last modified 2020-07-24
