My research interests lie in using large-scale genomics technologies, together with computational and statistical tools, to systematically study medical and population genetics and genomics. In medical genetics, I am particularly interested in understanding the genetic etiology of common complex human diseases. In population genetics, I am interested in studying human evolutionary history as revealed by genetic signatures in the human genome.
Development of 1000 Genomes data processing tools
Reliable identification of genetic variants in re-sequencing data is the essential goal of the 1000 Genomes Project and is particularly crucial in the exon-region sequencing endeavor. In the full-scale 1000 Genomes Project, the exonic regions are planned to be sequenced at much higher coverage than the rest of the genome. This presents both a unique opportunity and a challenge for bioinformatics pipeline development, as there is as yet no software designed specifically for processing high-coverage targeted sequencing data. Building on our extensive experience in analyzing the high-coverage data from the 1000 Genomes Pilot 3 project, we aim to develop an integrated data processing pipeline and a set of metrics for identifying genomic variations for downstream analysis.
Medical genetics of common complex diseases
Common complex diseases such as cardiovascular disease, cerebrovascular disease, cancer, and diabetes account for most of the mortality and morbidity in modern societies. Studies suggest that genetic variants strongly influence disease susceptibility. Recent large-scale association studies have uncovered many predisposing genomic regions for most of the common diseases. These efforts are paving the way toward pinpointing the causal genetic variants and understanding the pathogenesis of complex diseases.
Population genomics: signatures of recent positive selection in the human genome
In modern terms, natural selection operates on genetic variation, which provides both the evidence supporting the mechanism of natural selection and the material for it to act upon. Selection pressure acts on individual phenotypes, but ultimately the targets of selection reside in DNA variation.
This Week in Genome Research, December 23, 2009
... Meanwhile, a group of researchers from the Baylor College of Medicine, Rice University, and Washington University report that they have come up with a way to sift through large amounts of high-throughput re-sequencing data and pick out genetic variants without getting duped by sequencing errors. Their computational tool — called Atlas-SNP2 — takes into account sequence context in training datasets to help distinguish between errors and authentic SNPs with a less than 10 percent false-positive error rate and a false-negative error rate of five percent or so. ...