Sarah McGough, PhD
San Francisco, California, United States
1K followers
500+ connections
Activity
-
I’m thrilled that the Genentech Foundation is renewing its commitment to San Francisco State University to support programs that provide students…
Shared by Sarah McGough, PhD
-
Our latest research "Zika emergence, persistence, and transmission rate in Colombia: a nationwide application of a space-time Markov switching model"…
Liked by Sarah McGough, PhD
-
Glad to make the highlights! CogX Festival was a multidisciplinary feast for the minds.
Shared by Sarah McGough, PhD
Experience
Education
-
Harvard University
-
Activities and Societies: Presidential Scholar, Dudley House Public Service Fellow
Dissertation: "Anticipating Outbreaks: Predictive Modeling to Improve Infectious Disease Surveillance"
-
Activities and Societies: Phi Beta Kappa, Amnesty International, International Development Research Council, Social Concerns Committee
Volunteer Experience
-
Mentor - Gene Academy
Genentech
- Present 3 years 8 months
Education
STEM mentor for elementary school students in underserved South San Francisco public schools.
-
Mentor/Coach
Futurelab
- Present 2 years 8 months
Education
High school biotech curriculum coach preparing students for careers in biotechnology and STEM. The program is designed to reach 2 million students by 2026.
-
Mentor - Data Science and Machine Learning for Biotechnology Certificate Program
SFSU PINC
- Present 2 years 8 months
Education
Interview preparation for students enrolled in the SFSU Data Science and Machine Learning for Biotechnology Certificate under the Promoting Inclusivity in Computing (PINC) program.
Publications
-
Learning from data with structured missingness
Nature Machine Intelligence
Missing data are an unavoidable complication in many machine learning tasks. When data are ‘missing at random’ there exist a range of tools and techniques to deal with the issue. However, as machine learning studies become more ambitious, and seek to learn from ever-larger volumes of heterogeneous data, an increasingly encountered problem arises in which missing values exhibit an association or structure, either explicitly or implicitly. Such ‘structured missingness’ raises a range of challenges that have not yet been systematically addressed, and presents a fundamental hindrance to machine learning at scale. Here we outline the current literature and propose a set of grand challenges in learning from data with structured missingness.
-
Penalized regression for left-truncated and right-censored survival data
Statistics in Medicine
High-dimensional data are becoming increasingly common in the medical field as large volumes of patient information are collected and processed by high-throughput screening, electronic health records, and comprehensive genomic testing. Statistical models that attempt to study the effects of many predictors on survival typically implement feature selection or penalized methods to mitigate the undesirable consequences of overfitting. In some cases survival data are also left-truncated which can give rise to an immortal time bias, but penalized survival methods that adjust for left truncation are not commonly implemented. To address these challenges, we apply a penalized Cox proportional hazards model for left-truncated and right-censored survival data and assess implications of left truncation adjustment on bias and interpretation. We use simulation studies and a high-dimensional, real-world clinico-genomic database to highlight the pitfalls of failing to account for left truncation in survival modeling.
-
A dynamic, ensemble learning approach to forecast dengue fever epidemic years in Brazil using weather and population susceptibility cycles
Journal of the Royal Society Interface
Transmission of dengue fever depends on a complex interplay of human, climate and mosquito dynamics, which often change in time and space. It is well known that its disease dynamics are highly influenced by multiple factors including population susceptibility to infection as well as by microclimates: small-area climatic conditions which create environments favourable for the breeding and survival of mosquitoes. Here, we present a novel machine learning dengue forecasting approach, which, dynamically in time and space, identifies local patterns in weather and population susceptibility to make epidemic predictions at the city level in Brazil, months ahead of the occurrence of disease outbreaks. Weather-based predictions are improved when information on population susceptibility is incorporated, indicating that immunity is an important predictor neglected by most dengue forecast models. Given the generalizability of our methodology to any location or input data, it may prove valuable for public health decision-making aimed at mitigating the effects of seasonal dengue outbreaks in locations globally.
-
Nowcasting for real-time COVID-19 tracking in New York City: An evaluation using reportable disease data from early in the pandemic
JMIR Public Health and Surveillance
Objective:
To support real-time COVID-19 situational awareness, the New York City Department of Health and Mental Hygiene used nowcasting to account for testing and reporting delays. We conducted an evaluation to determine which implementation details would yield the most accurate estimated case counts.
Methods:
A time-correlated Bayesian approach called Nowcasting by Bayesian Smoothing (NobBS) was applied in real time to line lists of reportable disease surveillance data, accounting for the delay from diagnosis to reporting and the shape of the epidemic curve. We retrospectively evaluated nowcasting performance for confirmed case counts among residents diagnosed during the period from March to May 2020, a period when the median reporting delay was 2 days.
Results:
Nowcasts with a 2-week moving window and a negative binomial distribution had lower mean absolute error, lower relative root mean square error, and higher 95% prediction interval coverage than nowcasts conducted with a 3-week moving window or with a Poisson distribution. Nowcasts conducted toward the end of the week outperformed nowcasts performed earlier in the week, given fewer patients diagnosed on weekends and lack of day-of-week adjustments. When estimating case counts for weekdays only, metrics were similar across days when the nowcasts were conducted, with Mondays having the lowest mean absolute error of 183 cases in the context of an average daily weekday case count of 2914.
Conclusions:
Nowcasting using NobBS can effectively support COVID-19 trend monitoring. Accounting for overdispersion, shortening the moving window, and suppressing diagnoses on weekends (when fewer patients submitted specimens for testing) improved the accuracy of estimated case counts. Nowcasting ensured that recent decreases in observed case counts were not overinterpreted as true declines and supported officials in anticipating the magnitude and timing of hospitalizations and deaths and allocating resources geographically.
-
Rates of increase of antibiotic resistance and ambient temperature in Europe: a cross-national analysis of 28 countries between 2000 and 2016
Eurosurveillance
Background
The rapid increase of bacterial antibiotic resistance could soon render our most effective method to address infections obsolete. Factors influencing pathogen resistance prevalence in human populations remain poorly described, though temperature is known to contribute to mechanisms of spread.
Aim
To quantify the role of temperature, spatially and temporally, as a mechanistic modulator of transmission of antibiotic resistant microbes.
Methods
An ecologic analysis was performed on country-level antibiotic resistance prevalence in three common bacterial pathogens across 28 European countries, collectively representing over 4 million tested isolates. Associations of minimum temperature and other predictors with change in antibiotic resistance rates over 17 years (2000–2016) were evaluated with multivariable models. The effects of predictors on the antibiotic resistance rate change across geographies were quantified.
Results
During 2000–2016, for Escherichia coli and Klebsiella pneumoniae, European countries with 10°C warmer ambient minimum temperatures compared to others, experienced more rapid resistance increases across all antibiotic classes. Increases ranged between 0.33%/year (95% CI: 0.2 to 0.5) and 1.2%/year (95% CI: 0.4 to 1.9), even after accounting for recognised resistance drivers including antibiotic consumption and population density. For Staphylococcus aureus a decreasing relationship of −0.4%/year (95% CI: −0.7 to 0.0) was found for meticillin resistance, reflecting widespread declines in meticillin-resistant S. aureus across Europe over the study period.
Conclusion
We found evidence of a long-term effect of ambient minimum temperature on antibiotic resistance rate increases in Europe. Ambient temperature might considerably influence antibiotic resistance growth rates, and explain geographic differences observed in cross-sectional studies. Rising temperatures globally may hasten resistance spread, complicating mitigation efforts.
-
Modeling COVID-19 mortality in the US: Community context and mobility matter
medRxiv
The United States has become an epicenter for the coronavirus disease 2019 (COVID-19) pandemic. However, communities have been unequally affected and evidence is growing that social determinants of health may be exacerbating the pandemic. Furthermore, the impact and timing of social distancing at the community level have yet to be fully explored. We investigated the relative associations between COVID-19 mortality and social distancing, sociodemographic makeup, economic vulnerabilities, and comorbidities in 24 counties surrounding 7 major metropolitan areas in the US using a flexible and robust time series modeling approach. We found that counties with poorer health and less wealth were associated with higher daily mortality rates compared to counties with fewer economic vulnerabilities and fewer pre-existing health conditions. Declines in mobility were associated with up to 15% lower mortality rates relative to pre-social distancing levels of mobility, but effects were lagged by 25 to 30 days. While we cannot estimate causal impact, this study provides insight into the association of social distancing with community mortality while accounting for key community factors. For full transparency and reproducibility, we provide all data and code used in this study.
-
Nowcasting by Bayesian Smoothing: A flexible, generalizable model for real-time epidemic tracking
PLOS Computational Biology
Achieving accurate, real-time estimates of disease activity ('nowcasts') is challenged by delays in case reporting. However, approaches that seek to estimate cases in spite of reporting delays often do not consider the temporal relationship between cases during an outbreak, and may not generalize to surveillance contexts with very different reporting delays. This study describes a smooth Bayesian nowcasting approach that produces accurate estimates that capture the time evolution of the epidemic curve. We assess the performance for two diseases and show that relating cases between sequential time points contributes to NobBS’s performance and robustness across surveillance settings.
-
NobBS: Nowcasting by Bayesian Smoothing (R package)
CRAN: Comprehensive R Archive Network (R programming language)
A Bayesian approach to estimate the number of occurred-but-not-yet-reported cases from incomplete, time-stamped reporting data for disease outbreaks. 'NobBS' learns the reporting delay distribution and the time evolution of the epidemic curve to produce smoothed nowcasts in both stable and time-varying case reporting settings, as described in McGough et al. (2019) <doi:10.1101/663823>.
-
Antibiotic resistance increases with local temperature
Nature Climate Change
We explored the role of climate (local minimum temperature) and additional factors on the distribution of antibiotic resistance across the United States, and show that increasing local temperature as well as population density are associated with increasing antibiotic resistance (percent resistant) in common pathogens. We found that an increase in temperature of 10 °C across regions was associated with increases in antibiotic resistance of 4.2%, 2.2%, and 2.7% for the common pathogens Escherichia coli, Klebsiella pneumoniae and Staphylococcus aureus. The associations between temperature and antibiotic resistance in this ecological study are consistent across most classes of antibiotics and pathogens and may be strengthening over time. These findings suggest that current forecasts of the burden of antibiotic resistance could be significant underestimates in the face of a growing population and climate change.
-
Forecasting Zika Incidence in the 2016 Latin America Outbreak Combining Traditional Disease Surveillance with Search, Social Media, and News Report Data
PLoS Neglected Tropical Diseases
In the absence of access to real-time government-reported Zika case counts, we demonstrate the ability of Internet-based data sources to track the outbreak. We combined information from Zika-related Google searches, Twitter microblogs, and the HealthMap digital surveillance system with historical Zika suspected case counts to track and predict estimates of suspected weekly Zika cases during the 2015–2016 Latin American outbreak, up to three weeks ahead of the publication of official case data. Given the significant delay in the release of official government-reported Zika case counts, we show that these Internet-based data streams can be used as timely and complementary ways to assess the dynamics of the outbreak.
Honors & Awards
-
Harvard University Distinction in Teaching Award
Harvard University
Acknowledges a special contribution to the teaching of undergraduates in Harvard College.
-
Presidential Scholar
Harvard University
-
Defeating Malaria: From the Genes to the Globe Student Fellowship
Harvard University
Grant recipient for global malaria research
-
Michael Anderson Award for Academic Excellence and Social Responsibility
Glynn Family Honors Program, University of Notre Dame
Awarded to the graduating honors senior who has demonstrated high academic achievement, commitment to social justice, and vision for global change.
-
Phi Beta Kappa
Epsilon of Indiana
Most distinguished university academic honor society for the liberal arts and sciences.
Invited for membership by the Phi Beta Kappa faculty panel as 1 of 100 seniors selected from Notre Dame's College of Arts and Letters and College of Science on the basis of top academic distinction.
-
Raymond W. Murray, C.S.C. Award in Anthropology
University of Notre Dame Department of Anthropology
Given to the top graduating senior for exemplary work in anthropology.
-
George Monteiro Prize
Kellogg Institute for International Studies, University of Notre Dame
Awarded for the best paper written in the Portuguese language, for the essay "Espiritismo e etnobotânica do Candomblé no desenvolvimento da identidade dos escravos afro-brasileiros" ("The spiritualism and ethnobotany of Candomblé in the development of the Afro-Brazilian slave identity").
-
Glynn Family Honors Program Undergraduate Research Grant
University of Notre Dame
Awarded funds to conduct research in Brazil for summer 2012.
-
Hesburgh-Yusko Scholars Program
University of Notre Dame
$100,000 merit-based scholarship offered by the University of Notre Dame in recognition of “distinguished academic accomplishment, exemplary moral character, demonstrated leadership ability, and a sincere commitment to service.” Selected as 1 of 25 scholars in the inaugural class of 2014 from a pool of more than 400 applicants.
Languages
-
English
Native or bilingual proficiency
-
Portuguese
Professional working proficiency
-
Spanish
Professional working proficiency
More activity by Sarah
-
An honor to get up with these greats on stage at CogX Festival and our panel on AI in Precision Medicine. Really enjoyed offering the biotech…
Shared by Sarah McGough, PhD