Abstract
This paper presents an overview of the Medical Visual Question
Answering task (VQA-Med) at ImageCLEF 2019. Participating systems
were tasked with answering medical questions based on the visual
content of radiology images. In this second edition of VQA-Med, we focused
on four categories of clinical questions: Modality, Plane, Organ
System, and Abnormality. These categories are designed with varying
degrees of difficulty, leveraging both classification and text generation
approaches. We also ensured that all questions can be answered from
the image content without requiring additional medical knowledge or
domain-specific inference. We created a new dataset of 4,200 radiology
images and 15,292 question-answer pairs following these guidelines. The
challenge was well received, with 17 participating teams applying a
wide range of approaches such as transfer learning, multi-task learning,
and ensemble methods. The best team achieved a BLEU score of 64.4%
and an accuracy of 62.4%. In future editions, we will consider designing
more goal-oriented datasets and tackling new aspects such as contextual
information and domain-specific inference.