In the News

Graciela Gonzalez Hernandez, PhD, Associate Professor of Informatics

Growing up in Juarez, Mexico, Graciela Gonzalez Hernandez spent her free time climbing walls, leaping from roof to roof, and competing in every academic challenge that came her way, including the national science fair. She did everything her seven brothers did, and then some. Computers held an early fascination: She learned to program in fourth grade and excelled at it in high school, where her calculus teacher dubbed her “Amazing Grace”—a reference to computer scientist and US Navy Rear Admiral Grace Hopper. To pursue her Bachelor’s, MS and PhD in computer science at the University of Texas in nearby El Paso, she took the fast way across the border—via bicycle. “There were very few women in the classroom,” she remembers, “but frankly, I hardly noticed.”

Perhaps those “adrenaline-fueled” early years provided a running start for several boundaries she would later jump. She became one of the first informaticians using natural language processing (NLP) methods to mine social media data for mentions of adverse drug reactions, and she secured one of the first R01s on the topic from the NIH. She also helped launch the Biomedical Informatics Department at Arizona State University (ASU). In 2016, she moved with her husband and five children across the nation to join Penn as an Associate Professor of Informatics.

Dr. Gonzalez Hernandez became interested in NLP by necessity, shortly after joining ASU in 2005. She had planned to work with Dr. Chitta Baral on a logic network to represent protein pathways, the building blocks of life, in such a way that the mechanisms of disease and healthy function could be simulated. By simply introducing a “faulty connection” (a protein over- or under-expressed, or not present), the program would show the expected disease state and, possibly, also explain what caused it. “But that was easier said than done,” she says. “Most of this information is not in any database; even now, most of it is only in published papers.

“So we went to another plan: find a program that extracts the information from those papers. But there wasn’t one that would do the job. We wanted the cake and the icing too, but we had to step back and forget about the icing—and then step back again and work on getting the right ingredients, those pieces of information in millions of published papers.”

With her PhD students, she participated in NLP competitions, coming in ahead of many established labs, and released a gene-name tagger, BANNER, that is still the most widely used around the world. She thought she would return to working in artificial intelligence, but the challenges in NLP kept coming. In particular, the rise of social media offered all kinds of new insights about people’s health. Just this fall, the NIH/NLM renewed Dr. Gonzalez Hernandez’s R01 “Social Media Mining for Pharmacovigilance” for $2.4 million, covering work through 2022. At the same time, she continues work on her other R01 from the NIH/NIAID, with colleague Matthew Scotch, finding geographic location mentions in published literature and linking them to GenBank to help precisely model viral spread. She also has funding from the National Board of Medical Examiners to develop technologies for automatic assessment.

She still encounters skeptics who question how solid information can possibly come from the fluff of social media. Her answer: “We see that the trends in social media match the known side effects of certain medications, which validates our approach. If we can demonstrate that with NLP methods we can find what we already know, then we may also be able to find something that is not known and turns out to be accurate.

“We start with something easy: Can we distinguish someone who has Alzheimer’s from someone who doesn’t, by analyzing their writing? Sure, there are other ways to diagnose the disease, with greater accuracy. But the important thing is to discover the features of Alzheimer’s in the writing: When you can’t see the patient, just the writing, can you tell?” Those questions lead right into her next challenge: “Could you possibly tell before that person gets the diagnosis?”

Michael Z. Levy, PhD, Associate Professor of Epidemiology

Once they’d seen Michael Levy’s answer to the question “How did you get into science?” the PhD admissions committee at Emory University had to say yes. After graduating from Amherst College, Dr. Levy taught English in Santiago, Chile. Things took a turn when he befriended a clown and traveled with a circus. At the tip of Chile he spent time reading up on evolutionary biology and decided to go to graduate school. His PhD focus (Population Biology, Ecology and Evolution) was decided. But that still didn’t make his path predictable.

He would come to be best known for work on Chagas disease, an insect-borne disease common in South America, Central America and Mexico. Along the way, however, he would intern for the Smithsonian in Panama; segue back to his hometown for courses at Penn; take organic chemistry at the University of Panama (for only $26); and work on tuberculosis in Peru. During his PhD work, he would return to Peru and finally address Chagas.

He started work on Chagas in 2004, and the rest is history. “In Arequipa, as in many other areas, a whole generation has been protected from Chagas disease. The Ministry of Health and our team are now in the ‘surveillance stage,’ trying to ensure transmission isn’t allowed to start up again,” Dr. Levy says.

In 2011 he began engaging Penn students in fighting a local plague: bed bugs. Then bed bugs hit Peru, and Dr. Levy’s lab in Arequipa found that they, too, are very capable of transmitting the Chagas parasite.

Multiple next steps beckon. Should bed bugs be on the Chagas agenda? “In Peru there is a potential haystack of Chagas disease, and the bed bugs are still a needle,” he comments. “In the States, we have a haystack of bed bugs, and the Chagas is the needle.” And then there’s the “last mile” question: Should he aim to eliminate the triatomine bug, the usual Chagas suspect, from Arequipa? Stay tuned for the next chapter in his story.

Qi Long, PhD, Professor of Biostatistics

Qi Long, PhD, is a man constantly in motion, both figuratively and, as often as possible, literally. Early on he gravitated both toward biology (he studied biochemistry at the University of Science and Technology of China) and toward all things numeric. A PhD in biostatistics at the University of Michigan proved a great fit, not just because the field married those two loves, but because the Big Ten school got him started on a new analytical passion: football.

He’s accrued a huge range of research subject matter—cancer, cardiovascular disease, diabetes, kidney disease, mental health, stroke—not always by design but because over time at Emory University, his home base before he came to Penn in March, he successfully cultivated collaborations with so many different biomedical researchers. His research group works with numerous types of “big” biomedical data. Yet on any given day, as he walks from Center City to West Philly (so as not to get stuck in traffic; he hates that), he might be thinking about the methodological Holy Grail he is still reaching for.

“We use omics data to build a prediction model for disease risk in one project. Then we use electronic health records (EHRs) or mobile-device data for certain others,” he says. “But ultimately I want to integrate big data from all those domains to advance precision medicine and population health, and that presents substantial analytical challenges.”

The Perelman School of Medicine’s collaborative culture and the DBEI’s three-discipline structure attracted him here, as did his appointment directing the Biostatistics Core at the Abramson Cancer Center. “I also appreciate Penn’s attention to diversity in its broadest sense: both the breadth of the research and the range of cultural backgrounds. I have always cherished that,” he says. “I find it enriches scientific inquiry overall.”

Biostatistics Alum Arman Oganisian: To Show Healthcare Costs, Make Few Assumptions

Hometown: Born in Yerevan, Armenia; grew up in Worcester, MA

Education: BA in Quantitative Economics, Providence College
MS in Biostatistics, University of Pennsylvania
Pursuing PhD in Biostatistics

From the 2008 financial crash to biostatistics: After college, I worked for a consulting firm in financial practice, using statistical models to assess the economic risk of derivatives that went toxic during the 2008 crisis. When that work started drying up, I moved to the healthcare practice, where we used statistical models to assess the risk and effectiveness of drugs our pharmaceutical clients developed. (That work wasn’t so different, it turned out. I realized what I cared about most was statistical modeling.)

I developed a real passion for biostatistics—to the point where I was reading textbooks for fun. Most of my managers had PhDs in biostatistics, and they encouraged me to apply to programs.

Why Penn biostatistics: I was interested in causal inference, and I knew Penn had the Center for Causal Inference, a causal inference reading group with members from the biostatistics and Wharton statistics faculties, and many pioneers in causal inference. I felt I’d have really good mentors, and I do. My advisors, [Professor] Nandita Mitra and [Adjunct Professor] Jason Roy, have been immensely valuable and supportive.

Economics? Statistics? How about both: Economics taught me a principled, methodical framework for thinking about prices and costs. If I think healthcare service prices are “too high,” for example, I can ask myself which assumptions are being violated in the real world that are causing them to deviate from the ideal.

From statistics I’ve learned, especially, the importance of defining a study question precisely. Say we want to estimate the healthcare costs associated with a particular treatment. We should be very precise with what we mean by “costs.” Costs for the insurer? The patient? The provider? Costs accumulated over what follow-up period? What types of costs: inpatient, outpatient, prescriptions? If we don’t understand precisely what we’re measuring, our study is doomed from the beginning.

Why we need a method that doesn’t assume much: Right now my colleagues and I are comparing healthcare costs of different treatments for endometrial cancer, and also investigating hospitalization costs specifically. These are challenging problems. The sticker cost of one treatment may be much higher, but if it’s associated with fewer follow-up hospitalizations, it may be associated with lower total costs overall. Patients assigned to a given treatment may just be sicker than those assigned to another. There can be many patients with little-to-no costs and clusters of very high-cost patients.

Modeling these costs becomes very difficult. If we assumed, for example, that costs have a bell-shaped distribution, we would be very wrong. Bayesian nonparametric methods address this by making very few assumptions about cost distributions. So they can capture the complexities of cost data, yielding better estimates of causal effect—which is what we care about in health policy.

M. Elle Saine: Murder Mysteries, 9/11, and the Epidemiologist’s Toolbox

Hometown: Lives in Philly’s Manayunk neighborhood and considers it her hometown.

Education: BA in Anthropology: Human Biology Track, Temple University
MA in Skeletal Biology, New York University
Pursuing MD and PhD in Epidemiology

How murder mysteries inspired her early interests: I’ve been fascinated by forensics for as long as I can remember. When I was a kid, my mom wrote murder mysteries. I was sick a lot, and when I stayed home, we watched true crime shows and she shared stories about her research. When I got older, I wanted to investigate human rights violations and genocides, and help to identify remains and return them to families for burial. I see myself engaging in humanitarian outreach when I become a physician.

Her connection to 9/11: During my master’s studies a decade after the tragedy, I interned with the New York Office of the Chief Medical Examiner. I spent a couple of weeks one summer helping prepare unidentified remains from the World Trade Center for analysis with a new DNA method. It is still surreal to think about.

The remains were in white tents within a warehouse-like building. There was so much white, between the tents and the body bags. It was pristine and cold, even in August. I had a very small role, but I got to know archaeologists who were still excavating and the scientists who had prepared every round of DNA resequencing. These individuals were the guardians of the remains, dedicated to helping families and friends get a small piece of closure.

Her inspiration for med school: For the second part of that internship, I worked with forensic pathologists. I accompanied one to testify in court, and I could see how much her work meant to the parents of the victim. I began thinking of becoming a forensic pathologist and anthropologist.

Then last year, I was on a Medicine Service at HUP; I had been there for less than a week when I knew that I would go into medicine. I love working with patients and their families in the hospital; there is nothing so fulfilling as knowing that during one of the hardest moments someone will ever face, you can do something to make it a little easier.

Her Leonard Davis Institute pilot grant and why she’s in epi: Through my grant, I am evaluating a scale for measuring how patients with HIV perceive stigma and modifying (and hopefully validating) it for patients with hepatitis C. I was inspired by my work with [Associate Professor] Vin Lo Re, [Professor] Fran Barg, and [Associate Professor] Robert Gross on the experiences of people who have only hepatitis C vs. those of people who have both hepatitis C and HIV. I am fascinated with how we can measure stigma, something so important to people’s lives, yet so intangible.

I love the endless diversity of questions that epidemiology can investigate. When people ask what the field is, I tell them any medical research involving people that doesn’t take place in a lab likely contains an aspect of epidemiology.

Over time, my toolbox has gotten a lot bigger. When I think of my future, I see myself as an anthropologist with large sample sizes.

Jordan Dworkin: What We Talk About When We Talk About Neuroimaging

Hometown: Rochester, NY

Education: BS in Psychology (minor in Mathematics and Statistics), Haverford College
Pursuing PhD in Biostatistics

Interest that took him by surprise: Coming into the biostatistics program I knew almost nothing about neuroimaging and was fairly convinced that I wanted to work on methods for clinical trials. Then a professor at another university recommended that I get in touch with [Assistant Professor of Biostatistics] Taki Shinohara. I went out of my way to meet with Taki and ended up working with him for two of my first three lab rotations—and every semester since then.

Many of the statistical methods we’ve been developing aim to detect subtle features of brain lesions. These types of methods can potentially help clinicians make accurate diagnoses and prognoses for multiple sclerosis patients.

How he’s making an impact (now there IS an app for that): Established investigators in a field tend to know intuitively which topics are up-and-coming or declining, and which topics are usually studied in tandem. But it can be quite daunting for those who are new to get even a surface-level picture of the field; I experienced this first-hand. I decided to create a place where new graduate students or young researchers could go to get a feel for the neuroimaging landscape.

I also wanted to help people visualize new connections they could draw, between topics that historically have been unrelated in neuroimaging research.

What his project taught him: The topics that have spiked most dramatically in the past few years are resting state and functional connectivity. This research studies not only individual regions of the brain, but the relationships between regions. Thanks to researchers like Penn’s Danielle Bassett, we now have valuable new insights about the brain in health and disease.

In addition, on the methods side, researchers are working tirelessly to address the issue of motion in the data, specifically the poor image quality that results from patients moving in the scanner. And prediction has become popular, implying a greater focus on obtaining accurate diagnoses and prognoses from the data.

It’s interesting to me that structural connectivity isn’t yet among the most popular topics. This suggests there might be room for more people to investigate and interpret the physical connections between brain regions.

What statistics can teach you about Republican candidates and scary TV: Many people outside of scientific research tend to be bored or intimidated by statistics. As a future statistician, I believe strongly that some statistical literacy is an important tool for everyone. So in my free time I’ve been trying to apply statistical methods to questions that appeal to a broader audience. In summer 2016, I performed a simulation of the first 15 Republican primaries to determine if a different method of voting would have changed their outcomes. And more recently I analyzed the emotional patterns in episodes of the sci-fi show Stranger Things and what they suggest about the show’s popularity.

About Us

To understand health and disease today, we need new thinking and novel science, the kind we create when multiple disciplines work together from the ground up. That is why this department has put forward a bold vision in population-health science: a single academic home for biostatistics, epidemiology and informatics.

© 2023 Trustees of the University of Pennsylvania. All rights reserved.