Unraveling COVID-19: A Large-Scale Characterization of 4.5 Million COVID-19 Cases Using CHARYBDIS.
Kostka K., Duarte-Salles T., Prats-Uribe A., Sena AG., Pistillo A., Khalid S., Lai LYH., Golozar A., Alshammari TM., Dawoud DM., Nyberg F., Wilcox AB., Andryc A., Williams A., Ostropolets A., Areia C., Jung CY., Harle CA., Reich CG., Blacketer C., Morales DR., Dorr DA., Burn E., Roel E., Tan EH., Minty E., DeFalco F., de Maeztu G., Lipori G., Alghoul H., Zhu H., Thomas JA., Bian J., Park J., Martínez Roldán J., Posada JD., Banda JM., Horcajada JP., Kohler J., Shah K., Natarajan K., Lynch KE., Liu L., Schilling LM., Recalde M., Spotnitz M., Gong M., Matheny ME., Valveny N., Weiskopf NG., Shah N., Alser O., Casajust P., Park RW., Schuff R., Seager S., DuVall SL., You SC., Song S., Fernández-Bertolín S., Fortin S., Magoc T., Falconer T., Subbian V., Huser V., Ahmed W-U-R., Carter W., Guan Y., Galvan Y., He X., Rijnbeek PR., Hripcsak G., Ryan PB., Suchard MA., Prieto-Alhambra D.
PURPOSE: Routinely collected real-world data (RWD) have great utility in aiding the novel coronavirus disease (COVID-19) pandemic response. Here we present the international Observational Health Data Sciences and Informatics (OHDSI) Characterizing Health Associated Risks and Your Baseline Disease In SARS-CoV-2 (CHARYBDIS) framework for standardisation and analysis of COVID-19 RWD.
PATIENTS AND METHODS: We conducted a descriptive retrospective database study using a federated network of data partners in the United States, Europe (the Netherlands, Spain, the UK, Germany, France and Italy) and Asia (South Korea and China). The study protocol and analytical package were released on 11th June 2020 and are iteratively updated via GitHub. We identified three non-mutually exclusive cohorts: 4,537,153 individuals with a clinical COVID-19 diagnosis or positive test, 886,193 hospitalized with COVID-19, and 113,627 hospitalized with COVID-19 requiring intensive services.
RESULTS: We aggregated over 22,000 unique characteristics describing patients with COVID-19. All comorbidities, symptoms, medications, and outcomes are described by cohort in aggregate counts and are readily available online. Globally, we observed similarities between the USA and Europe: more women than men were diagnosed, but more men than women were hospitalized; most diagnosed cases were between 25 and 60 years of age, whereas most hospitalized cases were between 60 and 80 years of age. South Korea differed, with more women than men hospitalized. Common comorbidities included type 2 diabetes, hypertension, chronic kidney disease and heart disease. Common presenting symptoms were dyspnea, cough and fever. Symptom data were more often available in the hospitalized cohorts than in the diagnosed cohort.
CONCLUSION: We constructed a global, multi-centre view to describe trends in COVID-19 progression, management and evolution over time. By characterising baseline variability across patients and geography, our work provides critical context for differences that might otherwise be misconstrued as data quality issues. This is important as we perform studies on adverse events of special interest in COVID-19 vaccine surveillance.