
The Lin Lab, led by Dr. Xihong Lin at the Harvard T.H. Chan School of Public Health, advances genomics and human disease research using innovative statistical and machine learning methods. Our team analyzes large-scale genetic, genomic, and health data to study complex diseases, focusing on areas like whole genome sequencing, functional variant annotation, polygenic risk prediction, and gene-environment interactions. We develop scalable tools, including FAVOR and STAAR, and prioritize improving prediction accuracy for underrepresented populations.

Who We Are

Directed by Dr. Xihong Lin at the Harvard T.H. Chan School of Public Health, the Lin Lab strives to advance our understanding of genomics and human disease. Our interdisciplinary team, comprising research scientists, postdoctoral fellows, software developers, and doctoral students, brings diverse expertise to this mission.

Our research focuses on developing and applying scalable statistical and machine learning methods to explore massive genetic, genomic, epidemiological, and health datasets. Key areas of study include whole genome sequencing, biobanks, electronic health records, gene-environment interactions, multiple phenotype analysis, polygenic risk prediction, and heritability estimation.

We also tackle causal inference challenges, such as Mendelian randomization and mediation analysis, while advancing methods like federated learning and integrative analysis of whole-genome and single-cell sequencing data. Additionally, we analyze epidemiological and COVID-19 data to inform public health initiatives.

In phenotypic analysis, we develop nonlinear machine learning and latent variable methods for phenotypic refinement. Our team has also pioneered scalable algorithms to construct polygenic risk scores, improving prediction accuracy, particularly in underrepresented populations. Together, these efforts contribute to our overarching goal of advancing human health through data-driven research.

To support our work, we have developed software, including FAVOR, STAAR, and STAARpipeline, alongside websites for our COVID-19 spread mapper and Functional Annotation of Variants projects. These tools enhance our ability to analyze and interpret complex genomic and health data.
