Speaker
Description
Douala General Hospital, a first-class healthcare facility in Cameroon, serves thousands of patients yearly through its multidisciplinary medical teams. The hospital holds numerous patient records with significant potential for public health research. However, most records remain paper-based, which limits their accessibility and reuse. In departments such as pulmonology, patient data are often stored in heterogeneous data sheets that lack a uniform structure, which constrains their use for clinical research, care management, and evidence-based decision-making. The absence of standardisation also hinders data integration within broader health systems, restricting secure sharing and interoperability.
To address these challenges, we implemented a complete Extract, Transform, and Load (ETL) pipeline aligned with the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) version 5.4, an internationally recognised framework for health data standardisation. The objective was to transform and integrate patient data from the tuberculosis department into a database compliant with FAIR (Findable, Accessible, Interoperable, Reusable) principles, thereby enhancing data quality, interoperability, and reusability for research and clinical monitoring.
The dataset comprised more than 80 clinical and administrative variables, including sociodemographic data, medical history, symptoms, laboratory results, and diagnoses. These data were extracted from varied paper-based sources that differed in completeness and structure. For standardisation, several Observational Health Data Sciences and Informatics (OHDSI) tools were employed: WhiteRabbit for data profiling, Usagi for vocabulary mapping, and Rabbit-in-a-Hat for defining table mappings to the OMOP CDM structure. The populated tables were person, care_site, measurement, visit_occurrence, condition_occurrence, observation_period, and observation. Source terms were mapped to concepts from SNOMED CT, LOINC, and RxNorm, with contextual adaptations to local data.
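To illustrate the vocabulary-mapping step, the following is a minimal Python sketch of a source-to-concept lookup for a single field. It is not the mapping produced in this work: the source spellings and the function name `map_gender` are hypothetical. The concept IDs 8507 (male) and 8532 (female) are the standard OMOP gender concepts, and 0 is the OMOP convention for a value with no matching concept.

```python
from typing import Optional

# Hypothetical source-to-concept map, in the spirit of a reviewed mapping export.
# 8507 (MALE) and 8532 (FEMALE) are standard OMOP gender concepts;
# the source spellings on the left are invented for illustration.
GENDER_MAP = {
    "m": 8507,
    "male": 8507,
    "f": 8532,
    "female": 8532,
}

UNMAPPED_CONCEPT_ID = 0  # OMOP convention for "no matching concept"


def map_gender(source_value: Optional[str]) -> int:
    """Return the OMOP gender concept ID for a raw source value."""
    if source_value is None:
        return UNMAPPED_CONCEPT_ID
    # Normalise whitespace and case before the lookup.
    key = source_value.strip().lower()
    return GENDER_MAP.get(key, UNMAPPED_CONCEPT_ID)
```

Values that cannot be mapped fall back to 0 instead of being dropped, so the original text can still be kept in the corresponding source-value column and reviewed later.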
The ETL pipeline was developed using SQL scripts generated from Rabbit-in-a-Hat and executed in pgAdmin for PostgreSQL. The OMOP tables were created using scripts from the OHDSI GitHub repository, and the transformed data were loaded accordingly.
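The load step can be sketched as follows. This is an illustrative Python example under stated assumptions, not the SQL used in this work: it uses an in-memory SQLite database as a stand-in for PostgreSQL, creates only a reduced subset of the person table's columns, and assumes a hypothetical staging format of (source identifier, sex, year of birth) tuples. The function name `load_person_rows` is likewise hypothetical.

```python
import sqlite3


def load_person_rows(conn, staged_rows):
    """Load staged (source_id, sex, year_of_birth) tuples into a person table.

    Returns the number of rows inserted.
    """
    # Reduced subset of the OMOP person table, for illustration only.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS person (
            person_id            INTEGER PRIMARY KEY,
            gender_concept_id    INTEGER NOT NULL,
            year_of_birth        INTEGER NOT NULL,
            race_concept_id      INTEGER NOT NULL,
            ethnicity_concept_id INTEGER NOT NULL,
            person_source_value  TEXT,
            gender_source_value  TEXT
        )
        """
    )
    # Standard OMOP gender concepts; anything else maps to 0 (no matching concept).
    gender_concepts = {"M": 8507, "F": 8532}
    rows = [
        # Race and ethnicity are set to 0 because the staging data does not carry them.
        (pid, gender_concepts.get(sex, 0), yob, 0, 0, source_id, sex)
        for pid, (source_id, sex, yob) in enumerate(staged_rows, start=1)
    ]
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

A usage example with invented records: `load_person_rows(sqlite3.connect(":memory:"), [("TB-001", "F", 1985)])`. Keeping the raw identifier and sex in the source-value columns preserves traceability back to the paper record.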
Data quality was evaluated using the Achilles tool, which automatically assessed completeness, conformance, and plausibility. The pipeline achieved an overall score of 99%, which supports its reliability. This work represents a pioneering effort in applying the OMOP CDM within the African context, promoting collaboration, interoperability, and data-driven decision-making to strengthen tuberculosis care and research in Cameroon.
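As a rough illustration of how an overall quality score can be summarised, the sketch below computes pass rates per check category and overall. It is an assumption for explanatory purposes: the abstract does not describe how the 99% figure was computed, this is not the tool's internal method, and the function name `summarise_checks` and the input format are hypothetical.

```python
from collections import defaultdict


def summarise_checks(results):
    """Summarise (category, passed) pairs as pass rates per category and overall."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        passes[category] += int(passed)
    # Pass rate per category, e.g. completeness, conformance, plausibility.
    summary = {cat: passes[cat] / totals[cat] for cat in totals}
    n_checks = sum(totals.values())
    # Overall rate is passed checks over all checks; 0.0 if no checks were run.
    summary["overall"] = sum(passes.values()) / n_checks if n_checks else 0.0
    return summary
```

Reporting the per-category rates alongside the overall figure shows whether a high score hides a weak category.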