The general theme of our research is to tackle biological problems with advanced computational and statistical methods. We develop algorithms for sequence alignment (e.g. bwa, minimap2 and chromap), sequence assembly (e.g. miniasm, hifiasm and hifiasm-meta) and pangenome analysis (e.g. minigraph). We design widely used formats (e.g. SAM and GFA) and also work on data compression (e.g. ropebwt2 and bgt), population genetics (e.g. PSMC) and single-cell sequencing (e.g. lianti). Some of our tools are essential to applications of high-throughput sequencing data and are among the most widely used in bioinformatics. Please see our GitHub portal for more tools we have developed in recent years.
We are part of the Department of Biomedical Informatics at Harvard Medical School and the Department of Data Science at Dana-Farber Cancer Institute.