Genetic Epidemiology of Diverse Populations

The first 3 principal components of the PAGE Study, highlighting the continuous spectrum of diversity found within admixed populations.

The majority of disease burden within the United States is found in minority populations, yet the overwhelming majority of genetic studies have been conducted in European-ancestry populations. To address this knowledge gap, large-scale, well-powered genetic association studies of complex traits are needed in diverse global populations. I recently led the analysis of one of the largest non-European genome-wide association studies (GWAS) to date, covering 26 common traits in 50,000 multi-ethnic individuals. As part of the Population Architecture using Genomics and Epidemiology (PAGE)-II Study, these participants, who self-identified primarily as Asian, Native Hawaiian, Hispanic/Latino, or African-American, were genotyped on a novel genotyping array.

In addition to leading development of the GWAS scaffold for the novel Multi-Ethnic Genotyping Array (MEGA) with Illumina and an international academic consortium of researchers, I am currently leading efforts to characterize admixture, the extensive genetic mixture that occurs when previously isolated populations meet, within these populations at both the continental and sub-continental levels.


Genomics currently runs the risk of exacerbating existing health disparities, which in the United States disproportionately impact admixed populations. Our work ranges from addressing the frequent conflation of genetic ancestry with race/ethnicity, a contributing factor to the poor transferability of both GWAS findings and polygenic risk scores (PRS), to better understanding the role of gene-environment interactions in population-specific epidemiology.


Genetic association signal for cryptosporidiosis in the first year of life among Bangladeshi infants (Wojcik et al., 2020).

By far the most influential force on the human genome has been strong selective pressure from pathogens and infectious disease. The Red Queen hypothesis describes this competitive arms race between a pathogen and its host, in which each must out-evolve the other to survive. The footprints of this conflict can be seen across the human genome, and they are not identical across individuals. Exposure to a pathogen is necessary for infection, but it is not always sufficient. This heterogeneity in infection, and in its sequelae, is documented in the literature.

I am currently working to elucidate genetic susceptibility to infectious disease through numerous international collaborations, including studies of enteric infection in Bangladeshi infants and of HIV disease progression in a community in Rakai, Uganda. By integrating epidemiological data with varied genomic information, such as genome-wide genotyping array data, viral genomics, or antibody repertoires, we can build a more complete picture of the underlying biology.