BiTeM at CLEF eHealth Evaluation Lab 2016 Task 2: Multilingual Information Extraction

Mottin, Luc (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Gobeill, Julien (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Mottaz, Anaïs (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Gaudinat, Arnaud (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Ruch, Patrick (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale)

BiTeM/SIB Text Mining (http://bitem.hesge.ch/) is a university research group carrying out activities in semantic and text analytics applied to health and life sciences. This paper reports on our team's participation in the CLEF eHealth 2016 evaluation lab. The processing applied to each evaluation corpus (QUAERO and CépiDC) was originally very similar. Our method is based on an Automatic Text Categorization (ATC) system. First, the system is configured with a specific input ontology (French UMLS), and ATC assigns a ranked list of related concepts to each input document. Then, a second module locates all positive matches in the text and normalizes the extracted entities. For the CépiDC corpus, the system was loaded with the Swiss ICD-10 GM thesaurus. However, a last-minute data transformation issue forced us to implement an ad hoc solution based on simple pattern matching to comply with the constraints of the CépiDC challenge. We obtained an average precision of 62% on QUAERO entity extraction (averaged over MEDLINE/EMEA texts and exact/inexact matching), 48% on normalizing these entities, and 59% on the CépiDC subtask. Improving recall by expanding the coverage of the terminologies could be an interesting way to improve the system at moderate labour cost.
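The abstract mentions locating matches in the text, normalizing them to concepts, and falling back on simple pattern matching. The following is only a minimal sketch of that general idea, not the authors' system: the terminology, the concept codes (`C-0001` and so on), and the function name `extract_entities` are all hypothetical placeholders and do not come from the French UMLS or ICD-10 GM.

```python
import re
from typing import Dict, List, NamedTuple

# Hypothetical toy terminology: surface term -> concept code.
# The codes are placeholders for illustration, not real UMLS or ICD-10 entries.
TERMINOLOGY: Dict[str, str] = {
    "infarctus du myocarde": "C-0001",
    "diabète": "C-0002",
    "insuffisance rénale": "C-0003",
}


class Entity(NamedTuple):
    start: int  # character offset where the match begins
    end: int    # character offset just past the match
    text: str   # matched surface form
    code: str   # normalized concept code


def extract_entities(text: str, terminology: Dict[str, str]) -> List[Entity]:
    """Locate terminology terms in the text and attach their concept code."""
    entities: List[Entity] = []
    # Try longer terms first so they win over shorter terms they contain.
    for term in sorted(terminology, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            # Skip matches that overlap an already accepted entity.
            if any(m.start() < e.end and e.start < m.end() for e in entities):
                continue
            entities.append(
                Entity(m.start(), m.end(), m.group(0), terminology[term])
            )
    # Return entities in document order (sorted by start offset).
    return sorted(entities)


doc = "Patient avec diabète et infarctus du myocarde."
for entity in extract_entities(doc, TERMINOLOGY):
    print(entity)
```

Exact string matching of this kind tends to favour precision over recall, which is consistent with the abstract's remark that broader terminology coverage could improve recall.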


Keywords:
Conference type:
full paper
Faculty:
Economie et Services
School:
HEG GE Haute école de gestion de Genève
Institute:
CRAG - Centre de Recherche Appliquée en Gestion
Classification:
Information sciences
Bibliographic address:
Evora, Portugal, 5 - 8 September, 2016
Date:
2016
Pagination:
9 p.
Published in
CEUR Workshop Proceedings, vol. 1609 - Working Notes of CLEF 2016 - Conference and Labs of the Evaluation forum
Numbering (vol. no.):
2016, Vol. 1609, pp. 94-102
Record created 2016-10-03, last modified 2018-08-31
