BiTeM at WNUT 2020 shared task-1: named entity recognition over wet lab protocols using an ensemble of contextual language models

Knafou, Julien (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale ; Swiss Institute of Bioinformatics, Geneva, Switzerland ; University of Geneva, Switzerland) ; Naderi, Nona (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale ; Swiss Institute of Bioinformatics, Geneva, Switzerland) ; Copara, Jenny (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale ; Swiss Institute of Bioinformatics, Geneva, Switzerland ; University of Geneva, Switzerland) ; Teodoro, Douglas (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale ; Swiss Institute of Bioinformatics, Geneva, Switzerland) ; Ruch, Patrick (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale ; Swiss Institute of Bioinformatics, Geneva, Switzerland)

Recent improvements in machine-reading technologies have drawn much attention to automation problems and their possibilities. In this context, WNUT 2020 introduces a Named Entity Recognition (NER) task based on wet laboratory procedures. In this paper, we present a 3-step method based on deep neural language models that achieved the best overall exact-match F1-score (77.99%) of the competition. By fine-tuning each of 10 different pretrained language models 10 times, this work shows the advantage of having more models in an ensemble based on a majority-vote strategy. In addition, having 100 different models allowed us to analyse combinations of ensembles, which demonstrated the impact of having multiple pretrained models versus fine-tuning a single pretrained model multiple times.
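
The majority-vote ensembling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the per-token voting granularity, the fallback to the outside label "O" when no label wins a strict majority, and the example tags are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: str = "O") -> List[str]:
    """Combine per-token label sequences from several models.

    A label is kept for a token only if strictly more than half of the
    models predict it; otherwise the (assumed) fallback label is used.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_tokens = len(predictions[0])
    if any(len(p) != n_tokens for p in predictions):
        raise ValueError("all models must label the same number of tokens")

    n_models = len(predictions)
    voted = []
    # zip(*predictions) yields, for each token, the labels given by every model
    for token_labels in zip(*predictions):
        label, count = Counter(token_labels).most_common(1)[0]
        voted.append(label if count > n_models / 2 else fallback)
    return voted


# Hypothetical predictions from three fine-tuned models for four tokens
model_a = ["B-Action", "B-Amount", "I-Amount", "B-Reagent"]
model_b = ["B-Action", "B-Amount", "I-Amount", "O"]
model_c = ["O", "B-Amount", "O", "B-Reagent"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

With 100 fine-tuned models, as in the abstract, the same function would simply receive a list of 100 label sequences; how ties and entity boundaries are handled in the actual system is not stated in this record.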


Conference Type:
published full paper
Faculty:
Economie et Services
School:
HEG - Genève
Institute:
CRAG - Centre de Recherche Appliquée en Gestion
Subject(s):
Informatique
Publisher:
Virtual conference, 19 November 2020
Date:
2020-11
Pagination:
Pp. 305-313
Published in:
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
 Record created 2021-01-19, last modified 2021-02-05
