BiTeM at CLEF eHealth Evaluation Lab 2016 Task 2: Multilingual Information Extraction

Mottin, Luc (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Gobeill, Julien (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Mottaz, Anaïs (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Gaudinat, Arnaud (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Ruch, Patrick (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale)

BiTeM/SIB Text Mining (http://bitem.hesge.ch/) is a university research group carrying out activities in semantic and text analytics applied to health and life sciences. This paper reports on our team's participation in the CLEF eHealth 2016 evaluation lab. The processing applied to each evaluation corpus (QUAERO and CépiDC) was originally very similar. Our method is based on an Automatic Text Categorization (ATC) system. First, the system is configured with a specific input ontology (French UMLS), and ATC assigns a ranked list of related concepts to each input document. Then, a second module locates all positive matches in the text and normalizes the extracted entities. For the CépiDC corpus, the system was loaded with the Swiss ICD-10 GM thesaurus. However, a last-minute data transformation issue forced us to implement an ad hoc solution based on simple pattern matching to comply with the constraints of the CépiDC challenge. We obtained an average precision of 62% on QUAERO entity extraction (averaged over MEDLINE/EMEA texts and exact/inexact matching), 48% on normalizing these entities, and 59% on the CépiDC subtask. Improving recall by expanding the coverage of the terminologies could be an interesting way to improve this system at moderate labour cost.
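The abstract does not describe how the pattern-matching fallback was implemented. As a rough illustration only, dictionary-based matching with normalization to terminology codes could look like the sketch below. The function names (`fold`, `extract_entities`), the toy terminology, and the codes `CODE_A` and `CODE_B` are all hypothetical placeholders; they are not taken from the paper, from UMLS, or from ICD-10 GM.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Lowercase and strip diacritics one character at a time.

    Working per character keeps the folded string aligned with the
    original for typical French text, so match offsets can be reused.
    """
    return "".join(unicodedata.normalize("NFD", ch)[0].lower() for ch in text)


def extract_entities(text: str, terminology: dict) -> list:
    """Return sorted (start, end, surface form, code) tuples.

    `terminology` maps a term to a code. This is a hypothetical toy
    structure, not the format of any real thesaurus.
    """
    folded = fold(text)
    hits = []
    for term, code in terminology.items():
        # Whole-word match on the accent- and case-folded text.
        pattern = r"\b" + re.escape(fold(term)) + r"\b"
        for m in re.finditer(pattern, folded):
            hits.append((m.start(), m.end(), text[m.start():m.end()], code))
    return sorted(hits)


# Toy terminology with placeholder codes (assumptions, not real identifiers).
TOY_TERMINOLOGY = {
    "infarctus du myocarde": "CODE_A",
    "pneumonie": "CODE_B",
}

for hit in extract_entities("Décès par infarctus du myocarde et pneumonie.",
                            TOY_TERMINOLOGY):
    print(hit)
```

A matcher of this kind can only find terms that appear in the terminology, which is consistent with the abstract's remark that expanding terminology coverage is a plausible way to raise recall.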


Conference Type:
full paper
Faculty:
Economie et Services
School:
HEG - Genève
Institute:
CRAG - Centre de Recherche Appliquée en Gestion
Subject(s):
Sciences de l'information
Publisher:
Evora, Portugal, 5 - 8 September, 2016
Date:
Evora, Portugal, 5 - 8 September, 2016
2016
Pagination:
9 p.
Published in:
CEUR Workshop Proceedings, vol. 1609 - Working Notes of CLEF 2016 - Conference and Labs of the Evaluation Forum
Numeration (vol. no.):
2016, Vol. 1609, pp. 94-102
 Record created 2016-10-03, last modified 2019-06-11
