Afrah Shafquat

New York City Metropolitan Area

About

I am passionate about developing methods and models to get insights from high dimensional…

Publications

  • Identifying novel associations in GWAS by hierarchical Bayesian latent variable detection of differentially misclassified phenotypes

    BMC Bioinformatics

    Heterogeneity in definition and measurement of complex diseases in Genome-Wide Association Studies (GWAS) may lead to misdiagnoses and misclassification errors that can significantly impact discovery of disease loci. While well appreciated, almost all analyses of GWAS data consider reported disease phenotype values as is without accounting for potential misclassification. Here, we introduce Phenotype Latent variable Extraction of disease misdiagnosis (PheLEx), a GWAS analysis framework that learns and corrects misclassified phenotypes using structured genotype associations within a dataset. PheLEx consists of a hierarchical Bayesian latent variable model, where inference of differential misclassification is accomplished using filtered genotypes while implementing a full mixed model to account for population structure and genetic relatedness in study populations. Through simulations, we show that the PheLEx framework dramatically improves recovery of the correct disease state when considering realistic allele effect sizes compared to existing methodologies designed for Bayesian recovery of disease phenotypes. We also demonstrate the potential of PheLEx for extracting new candidate loci from existing GWAS data by analyzing epilepsy and bipolar disorder phenotypes available from the UK Biobank dataset, where we identify new candidate disease loci not previously reported for these datasets that have biological connections to the disease phenotypes and/or were identified in independent GWAS. In the discussion, we consider both the broader consequences and importance of careful interpretation of misclassification correction in GWAS phenotypes, as well as potential of PheLEx for re-analyzing existing GWAS data to make novel discoveries.

  • XX Disorder of Sex Development is Associated with an Insertion on Chromosome 9 and Downregulation of RSPO1 in Dogs (Canis lupus familiaris)

    PLOS One

    Remarkable progress has been achieved in understanding the mechanisms controlling sex determination, yet the cause for many Disorders of Sex Development (DSD) remains unknown. Of particular interest is a rare XX DSD subtype in which individuals are negative for SRY, the testis determining factor on the Y chromosome, yet develop testes or ovotestes, and both of these phenotypes occur in the same family. This is a naturally occurring disorder in humans (Homo sapiens) and dogs (C. familiaris). Phenotypes in the canine XX DSD model are strikingly similar to those of the human XX DSD subtype. The purposes of this study were to identify 1) a variant associated with XX DSD in the canine model and 2) gene expression alterations in canine embryonic gonads that could be informative to causation. Using a genome wide association study (GWAS) and whole genome sequencing (WGS), we identified a variant on C. familiaris autosome 9 (CFA9) that is associated with XX DSD in the canine model and in affected purebred dogs. This is the first marker identified for inherited canine XX DSD. It lies upstream of SOX9 within the canine ortholog for the human disorder, which resides on 17q24. Inheritance of this variant indicates that XX DSD is a complex trait in which breed genetic background affects penetrance. Furthermore, the homozygous variant genotype is associated with embryonic lethality in at least one breed. Our analysis of gene expression studies (RNA-seq and PRO-seq) in embryonic gonads at risk of XX DSD from the canine model identified significant RSPO1 downregulation in comparison to XX controls, without significant upregulation of SOX9 or other known testis pathway genes. Based on these data, a novel mechanism is proposed in which molecular lesions acting upstream of RSPO1 induce epigenomic gonadal mosaicism.

    Other authors
    • Vicki N. Meyers-Wallen
    • Adam Boyko
    • Charles Danko
    • Jason Mezey
    • Jessica J. Hayward
    • Laura Shannon
    • Chuan Gao
    • Edward Rice
    • Thomas Ohnesorg
    • Andrew H. Sinclair
  • Urban Transit System Microbial Communities Differ by Surface Type and Interaction with Humans and the Environment

    mSystems

    Public transit systems are ideal for studying the urban microbiome and interindividual community transfer. In this study, we used 16S amplicon and shotgun metagenomic sequencing to profile microbial communities on multiple transit surfaces across train lines and stations in the Boston metropolitan transit system. The greatest determinant of microbial community structure was the transit surface type. In contrast, little variation was observed between geographically distinct train lines and stations serving different demographics. All surfaces were dominated by human skin and oral commensals such as Propionibacterium, Corynebacterium, Staphylococcus, and Streptococcus. The detected taxa not associated with humans included generalists from alphaproteobacteria, which were especially abundant on outdoor touchscreens. Shotgun metagenomics further identified viral and eukaryotic microbes, including Propionibacterium phage and Malassezia globosa. Functional profiling showed that Propionibacterium acnes pathways such as propionate production and porphyrin synthesis were enriched on train holding surfaces (holds), while electron transport chain components for aerobic respiration were enriched on touchscreens and seats. Lastly, the transit environment was not found to be a reservoir of antimicrobial resistance and virulence genes. Our results suggest that microbial communities on transit surfaces are maintained from a metapopulation of human skin commensals and environmental generalists, with enrichments corresponding to local interactions with the human body and environmental exposures.

  • Sequencing and beyond: integrating molecular 'omics' for microbial community profiling

    Nature Reviews Microbiology

    High-throughput DNA sequencing has proven invaluable for investigating diverse environmental and host-associated microbial communities. In this Review, we discuss emerging strategies for microbial community analysis that complement and expand traditional metagenomic profiling. These include novel DNA sequencing strategies for identifying strain-level microbial variation and community temporal dynamics; measuring multiple 'omic' data types that better capture community functional activity, such as transcriptomics, proteomics and metabolomics; and combining multiple forms of omic data in an integrated framework. We highlight studies in which the 'multi-omics' approach has led to improved mechanistic models of microbial community structure and function.

  • A reproducible approach to high-throughput biological data acquisition and integration

    PeerJ

    Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker discovery and for computational inference of biomolecular mechanisms, identifying, acquiring, and integrating relevant experimental results from multiple sources for a given study can be time-consuming and error-prone. To enable efficient and reproducible integration of diverse experimental results, we developed a novel approach for standardized acquisition and analysis of high-throughput and heterogeneous biological data. This allowed, first, novel biomolecular network reconstruction in human prostate cancer, which correctly recovered and extended the NFκB signaling pathway. Next, we investigated host-microbiome interactions. In less than an hour of analysis time, the system retrieved data and integrated six germ-free murine intestinal gene expression datasets to identify the genes most influenced by the gut microbiota, which comprised a set of immune-response and carbohydrate metabolism processes. Finally, we constructed integrated functional interaction networks to compare connectivity of peptide secretion pathways in the model organisms Escherichia coli, Bacillus subtilis, and Pseudomonas aeruginosa.

  • Determining microbial products and identifying molecular targets in the human microbiome

    Cell Metabolism

    Human-associated microbes are the source of many bioactive microbial products (proteins and metabolites) that play key functions both in human host pathways and in microbe-microbe interactions. Culture-independent studies now provide an accelerated means of exploring novel bioactives in the human microbiome; however, intriguingly, a substantial fraction of the microbial metagenome cannot be mapped to annotated genes or isolate genomes and is thus of unknown function. Meta’omic approaches, including metagenomic sequencing, metatranscriptomics, metabolomics, and integration of multiple assay types, represent an opportunity to efficiently explore this large pool of potential therapeutics. In combination with appropriate follow-up validation, high-throughput culture-independent assays can be combined with computational approaches to identify and characterize novel and biologically interesting microbial products. Here we briefly review the state of microbial product identification and characterization and discuss possible next steps to catalog and leverage the large uncharted fraction of the microbial metagenome.

  • Functional and phylogenetic assembly of microbial communities in the human microbiome

    Trends in Microbiology (Vol. 22, Issue 5, pp. 261-266)

    Microbial communities associated with the human body, that is, the human microbiome, are complex ecologies critical for normal development and health. The taxonomic and phylogenetic composition of these communities tends to significantly differ among individuals, precluding the definition of a simple, shared set of 'core' microbes. Here, we review recent evidence and ecological theory supporting the assembly of host-associated microbial communities in terms of functional traits rather than specific organisms. That is, distinct microbial species may be responsible for specific host-associated functions and phenotypes in distinct hosts. We discuss how ecological processes (selective and stochastic forces) governing the assembly of metazoan communities can be adapted to describe microbial ecologies in host-associated environments, resulting in both niche-specific and 'core' metabolic and other pathways maintained throughout the human microbiome. The extent to which phylogeny and functional traits are linked in host-associated microbes, as opposed to unlinked by mechanisms, such as lateral transfer, remains to be determined. However, the definition of these functional assembly rules within microbial communities using controlled model systems and integrative 'omics' represents a fruitful opportunity for molecular systems ecology.

  • Automatic Ventricle Chamber Segmentation Using a Regression Neural Network Initialization Based Active Shape Model

    AMIA Clinical Research
