
The Lin Lab, led by Dr. Xihong Lin at the Harvard T.H. Chan School of Public Health, advances genomics and human disease research using innovative statistical and machine learning methods. Our team analyzes large-scale genetic, genomic, and health data to study complex diseases, focusing on areas like whole genome sequencing, functional variant annotation, polygenic risk prediction, and gene-environment interactions. We develop scalable tools, including FAVOR and STAAR, and prioritize improving prediction accuracy for underrepresented populations.

Research

The Lin Lab is at the forefront of developing and applying scalable statistical and machine learning methods for analyzing large datasets encompassing the genome, exome, exposome, and phenome. Our research spans diverse areas, including large and complex genetic and genomic studies, variant functional annotation, gene-environment interactions, multi-phenotype analysis, polygenic risk prediction, and heritability estimation. We also work on integrative analysis of multiple data types, Mendelian randomization, causal mediation analysis, federated and transfer learning, single-cell genomics, complex observational studies, and the analysis of COVID-19 epidemic data.

The Lin Lab recently developed methods for functionally informed rare variant association testing in whole-genome sequencing (WGS) and biobank datasets (STAAR, STAARpipeline), including meta-analysis (metaSTAAR) and multi-trait analysis (multiSTAAR). Current extensions of this work include integrating single-cell sequencing data into such analyses to boost statistical power and interpretability (cellSTAAR), as well as methods for analyzing gene-environment interactions. As part of these efforts, the FAVOR database (favor.genohub.org) was created to provide comprehensive genome-wide annotations. We also helped develop improved methods for polygenic risk scores in diverse populations (CT-SLEB). In addition, the Lin Lab contributes to large-scale whole-genome and whole-exome analysis by developing genetics-based ethnicity prediction methods and rigorous quality control protocols.

The Lin Lab is also dedicated to researching the genetic, environmental, and lifestyle factors that contribute to lung cancer. We develop methods to identify and interpret key genetic variants relevant to squamous cell lung cancer, lung adenocarcinoma, and non-small cell lung cancers (NSCLCs) (multiomic annotation, smoking history).

  • WGS association studies: UK Biobank, TOPMed, GSP, All of Us, etc.
  • Integration of single-cell & multi-omics data in WGS analysis
  • Prioritizing causal variants with functional annotations (IGVF consortium)
  • Single-cell RNA-sequencing & functional annotation tool development
  • Quality control for large-scale WGS/WES data & rare variant analysis
  • Scalable methods for polygenic risk score construction & improving risk prediction accuracy
  • Lung cancer epidemiology (ILCCO), cardiovascular diseases, & sleep apnea research
  • Statistical genetics/genomics, causal inference, and Mendelian randomization
  • Pathway/network analysis and integrative data analysis
  • Focus on common diseases, genes, environment, epigenetics
  • Nonparametric/semiparametric regression, mixed models, correlated data analysis
  • Measurement error in genetic epidemiology/environmental genetics/genomics research