
The Lin Lab, led by Dr. Xihong Lin at the Harvard T.H. Chan School of Public Health, advances genomics and human disease research using innovative statistical and machine learning methods. Our team analyzes large-scale genetic, genomic, and health data to study complex diseases, focusing on areas like whole genome sequencing, functional variant annotation, polygenic risk prediction, and gene-environment interactions. We develop scalable tools, including FAVOR and STAAR, to enhance the accuracy and efficiency of genetic analysis.

Location

655 Huntington Ave, Boston, MA 02115

Who We Are

The Lin Lab, directed by Dr. Xihong Lin at the Harvard T.H. Chan School of Public Health, conducts cutting-edge research at the intersection of genomics, biostatistics, and public health. Our interdisciplinary team of research scientists, postdoctoral fellows, software developers, and doctoral students brings a wide range of expertise to this mission.

We focus on developing and applying scalable statistical and machine learning methods to analyze large-scale genetic, genomic, epidemiological, and health datasets. Core research areas include whole genome sequencing, biobanks, electronic health records, gene-environment interactions, multiple phenotype analysis, polygenic risk prediction, and heritability estimation.

Our team also addresses complex challenges in causal inference, such as Mendelian randomization and mediation analysis, and advances methods for integrative analysis of whole-genome and single-cell sequencing data. In addition, we analyze epidemiological and COVID-19 data to support evidence-based public health initiatives.

In the area of phenotypic analysis, we are developing nonlinear machine learning and latent variable methods to refine phenotype definitions. We have also created scalable algorithms for polygenic risk score construction, enhancing prediction accuracy across a range of complex traits.

To support these efforts, we have developed several software tools, including FAVOR, STAAR, and STAARpipeline, as well as interactive platforms for our COVID-19 spread mapper and functional variant annotation projects. These resources enable robust and reproducible genomic and health data analysis.
