ORBDA: an openEHR benchmark dataset for performance assessment of electronic health record servers

Teodoro, Douglas (Universidade do Estado do Rio de Janeiro, Brazil ; Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Sundvall, Erik (Linköping University, Sweden) ; João Junior, Mario (Universidade do Estado do Rio de Janeiro, Brazil) ; Ruch, Patrick (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale) ; Miranda Freire, Sergio (Universidade do Estado do Rio de Janeiro, Brazil)

The openEHR specifications are designed to support implementation of flexible and interoperable Electronic Health Record (EHR) systems. Despite the increasing number of solutions based on the openEHR specifications, it is difficult to find publicly available healthcare datasets in the openEHR format that can be used to test, compare and validate different data persistence mechanisms for openEHR. To foster research on openEHR servers, we present the openEHR Benchmark Dataset, ORBDA, a very large healthcare benchmark dataset encoded using the openEHR formalism. To construct ORBDA, we extracted and cleaned a de-identified dataset from the Brazilian National Healthcare System (SUS) containing information on hospitalisations and high-complexity procedures, and formalised it using a set of openEHR archetypes and templates. Then, we implemented a tool to enrich the raw relational data and convert it into the openEHR model using the openEHR Java reference model library. The ORBDA dataset is available in composition, versioned composition and EHR openEHR representations, in XML and JSON formats. In total, the dataset contains more than 150 million composition records. We describe the dataset and provide means to access it. Additionally, we demonstrate the use of ORBDA for evaluating the insertion throughput and query latency of some NoSQL database management systems. We believe that ORBDA is a valuable asset for assessing storage models for openEHR-based information systems during the software engineering process. It may also be a suitable component in future standardised benchmarking of available openEHR storage platforms.


Article type:
scientific
Faculty:
Economie et Services
School:
HEG GE Haute école de gestion de Genève
Institute:
CRAG - Centre de Recherche Appliquée en Gestion
Classification:
Information sciences
Date:
2018
Pagination:
22 p.
Published in:
PLOS ONE
DOI:



Record created 2018-03-20, last modified 2018-04-09
