Understanding pediatric long COVID using a tree-based scan statistic approach: an EHR-based cohort study from the RECOVER Program

JAMIA Open. 2023 Mar 14;6(1):ooad016. doi: 10.1093/jamiaopen/ooad016. eCollection 2023 Apr.

Abstract

Objectives: Post-acute sequelae of SARS-CoV-2 infection (PASC) is not well defined in pediatrics, given the heterogeneity of its presentation and severity in this population. The aim of this study is to use novel methods that rely on data mining approaches, rather than clinical experience, to detect conditions and symptoms associated with pediatric PASC.

Materials and methods: We used a propensity-matched cohort design comparing children identified using the new PASC ICD-10-CM diagnosis code (U09.9) (N = 1309) to children with (N = 6545) and without (N = 6545) SARS-CoV-2 infection. We used a tree-based scan statistic to identify potential condition clusters that co-occur more frequently in cases than in controls.
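
As a rough illustration of how a tree-based scan statistic scores condition clusters, the sketch below computes an unconditional Bernoulli log-likelihood ratio for every node of a small diagnosis hierarchy and reports the highest-scoring node. It is a minimal sketch under stated assumptions, not the study's implementation: the abstract does not say which probability model or software was used, and the hierarchy (`PARENT`), the leaf counts, and all function names here are hypothetical. The only figures taken from the text are the cohort sizes (1309 cases, 6545 controls per comparison), which give a null case probability of 1/6. A full analysis would also assess significance by Monte Carlo simulation under the null, which is omitted here.

```python
import math

# Null probability that a matched observation is a case, from the cohort
# sizes stated in the abstract (1309 cases vs 6545 controls, i.e. 1:5).
P_CASE = 1309 / (1309 + 6545)

# Hypothetical diagnosis hierarchy (child -> parent), for illustration only.
PARENT = {
    "dyspnea": "respiratory",
    "cough": "respiratory",
    "respiratory": "root",
    "fatigue": "general",
    "general": "root",
}


def bernoulli_llr(cases, total, p):
    """Unconditional Bernoulli log-likelihood ratio for one tree node.

    Returns 0 unless the observed case fraction exceeds p, so only
    excess risk among cases is scored.
    """
    if total == 0:
        return 0.0
    rate = cases / total
    if rate <= p:
        return 0.0
    llr = cases * math.log(rate / p)
    if total > cases:
        llr += (total - cases) * math.log((1 - rate) / (1 - p))
    return llr


def aggregate(leaf_counts):
    """Propagate (cases, controls) counts from each leaf to all ancestors."""
    totals = {}
    for leaf, (cases, controls) in leaf_counts.items():
        node = leaf
        while node is not None:
            c, k = totals.get(node, (0, 0))
            totals[node] = (c + cases, k + controls)
            node = PARENT.get(node)
    return totals


def scan(leaf_counts, p=P_CASE):
    """Score every node and return the best-scoring node with all scores."""
    totals = aggregate(leaf_counts)
    scores = {n: bernoulli_llr(c, c + k, p) for n, (c, k) in totals.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Made-up (cases, controls) counts per leaf diagnosis.
example_counts = {"dyspnea": (40, 20), "cough": (5, 55), "fatigue": (30, 30)}
best_node, node_scores = scan(example_counts)
print(best_node, round(node_scores[best_node], 2))
```

Because every internal node is scored on the pooled counts of its descendants, the scan can flag either a single diagnosis or a broader branch, whichever departs most from the null case fraction.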

Results: We found significant enrichment among children with PASC in the cardiac, respiratory, neurologic, psychological, endocrine, gastrointestinal, and musculoskeletal systems. The most significant findings related to the circulatory and respiratory systems, such as dyspnea, difficulty breathing, and fatigue and malaise.

Discussion: Our study addresses methodological limitations of prior studies that rely on prespecified clusters of potential PASC-associated diagnoses driven by clinician experience. Future studies are needed to identify patterns of diagnoses and their associations to derive clinical phenotypes.

Conclusion: We identified multiple conditions and body systems associated with pediatric PASC. Because we relied on a data-driven approach, we detected several new or under-reported conditions and symptoms that warrant further investigation.

Keywords: COVID-19; long COVID; post-acute sequelae of SARS-CoV-2 infection.