Selected Publications
(See also Google Scholar; * marks equal contribution; † marks corresponding authors; bold marks lab members and interns)
- Zhou Y, Song L, Li H† (2024) Full resolution HLA and KIR gene annotations for human genome assemblies. Genome Res.
[PMID: 38839374]
[Citations]
- Chu J, Rong J, Feng X, Li H† (2024) ntsm: an alignment-free, ultra-low-coverage, sequencing technology agnostic, intraspecies sample comparison tool for sample swap detection. Gigascience, 13:giae024.
[PMID: 38832466]
[Citations]
- Cheng H, Asri M, Lucas J, Koren S, Li H† (2024) Scalable telomere-to-telomere assembly for diploid and polyploid genomes with double graph. Nat Methods.
[PMID: 38730258]
[Citations]
- Li H†, Durbin R† (2024) Genome assembly in the telomere-to-telomere era. Nat Rev Genet.
[PMID: 38649458]
[Citations]
- Feng X, Li H† (2024) Evaluating and improving the representation of bacterial contents in long-read metagenome assemblies. Genome Biol, 25:92.
[PMID: 38605401]
[Citations]
- Zhang Y, Chu J, Cheng H, Li H† (2023) De novo reconstruction of satellite repeat units from sequence data. Genome Res, 33:1994-2001.
[PMID: 37918962]
[Citations]
- Guo Y, Feng X, Li H† (2023) Evaluation of haplotype-aware long-read error correction with hifieval. Bioinformatics, 39:btad631.
[PMID: 37851384]
[Citations]
- Huang N, Li H† (2023) Compleasm: a faster and more accurate reimplementation of BUSCO. Bioinformatics, 39:btad595.
[PMID: 37758247]
[Citations]
- Song L, Bai G, Liu XS, Li B†, Li H† (2023) Efficient and accurate KIR and HLA genotyping with next-generation sequencing data. Genome Res, 33:923-931.
[PMID: 37169596]
[Citations]
- Liao W-W*, Asri M*, Ebler J*, …, Garrison E†, Marschall T†, Hall IM†, Li H†, Paten B† (2023) A draft human pangenome reference. Nature, 617:312-324.
[PMID: 37165242]
[Citations]
- Deorowicz S†, Danek A, Li H† (2023) AGC: compact representation of assembled genomes with fast queries and updates. Bioinformatics, 39:btad097.
[PMID: 36864624]
[Citations]
- Li H (2023) Protein-to-genome alignment with miniprot. Bioinformatics, 39:btad014.
[PMID: 36648328]
[Citations]
- Zhang H, Wu S, Aluru S†, Li H† (2022) Fast sequence to graph alignment using the graph wavefront algorithm. arXiv:2206.13574 (preprint).
[Citations]
- Tan K-T, Slevin MK, Meyerson M†, Li H† (2022) Identifying and correcting repeat-calling errors in nanopore sequencing of telomeres. Genome Biol, 23:180.
[PMID: 36028900]
[Citations]
- Feng X, Cheng H, Portik D, Li H† (2022) Metagenome assembly of high-fidelity long reads with hifiasm-meta. Nat Methods, 19:671-674.
[PMID: 35534630]
[Citations]
- Cheng H, Jarvis ED, Fedrigo O, Koepfli K-P, Urban L, Gemmell NJ, Li H† (2022) Haplotype-resolved assembly of diploid individuals without parental data. Nat Biotechnol, published online.
[PMID: 35332338]
[Citations]
- Kokot M, Gudyś A, Li H†, Deorowicz S† (2022) CoLoRd: compressing long reads. Nat Methods, 19:441-444.
[PMID: 35347321]
[Citations]
- Zhang H*, Song L*, Wang X, Cheng H, Wang C, Meyer CA, Liu T, Tang M, Aluru S, Yue F, Liu XS†, Li H† (2021) Fast alignment and preprocessing of chromatin profiles with Chromap. Nat Commun, 12:6566.
[PMID: 34772935]
[Citations]
- Li H (2021) New strategies to improve minimap2 alignment accuracy. Bioinformatics, 37:4572-4574.
[PMID: 34623391]
[Citations]
- Zhang H, Li H, Jain C, Cheng H, Au KF†, Li H†, Aluru S† (2021) Real-time mapping of nanopore raw signals. Bioinformatics, 37:i477-i483.
[PMID: 34252938]
[Citations]
- Feng X, Li H† (2021) Higher rates of processed pseudogene acquisition in humans and three great apes revealed by long read assemblies. Mol Biol Evol, 38:2958-2966.
[PMID: 33681998]
[Citations]
- Li H†, Rong J (2021) Bedtk: finding interval overlap with implicit interval tree. Bioinformatics, 37:1315-1316.
[PMID: 32966548]
[Citations]
- Garg S†, Fungtammasan A, Carroll A, Chou M, Schmitt A, …, Chin C-S†, Church GM†, Li H† (2021) Chromosome-scale haplotype-resolved assembly of human genomes. Nat Biotechnol, 39:309-312.
[PMID: 33288905]
[Citations]
- Xing D*, Tan L*, Chang C-H, Li H†, Xie XS† (2021) Accurate SNV detection in single cells by transposon-based whole-genome amplification of complementary strands. Proc Natl Acad Sci, 118:e2013106118.
[PMID: 33593904]
[Citations]
- Cheng H, Concepcion GT, Feng X, Zhang H, Li H† (2021) Haplotype-resolved de novo assembly with phased assembly graphs with hifiasm. Nat Methods, 18:170-175.
[PMID: 33526886]
[Citations]
- Li H†, Feng X, Chu C (2020) The design and construction of reference pangenome graphs with minigraph. Genome Biol, 21:265.
[PMID: 33066802]
[Citations]
- Ruan J†, Li H† (2020) Fast and accurate long-read assembly with wtdbg2. Nat Methods, 17:155-158.
[PMID: 31819265]
[Citations]
- Wenger AM*, Peluso P*, Rowell WJ, Chang PC, Hall RJ, …, Li H, …, Rank DR†, Hunkapiller MW† (2019) Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol, 37:1155-1162.
[PMID: 31406327]
[Citations]
- Li H (2019) Identifying centromeric satellites with dna-brnn. Bioinformatics, 35:4408-4410.
[PMID: 30989183]
[Citations]
- Li H (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34:3094-3100.
[PMID: 29750242]
[Citations]
- Tan L*, Xing D*, Chang CH, Li H, Xie XS† (2018) Three-dimensional genome structures of single diploid human cells. Science, 361:924-928.
[PMID: 30166492]
[Citations]
- Li H†, Bloom JM, Farjoun Y, Fleharty M, Gauthier L, Neale B†, MacArthur D† (2018) A synthetic-diploid benchmark for accurate variant-calling evaluation. Nat Methods, 15:595-597.
[PMID: 30013044]
[Citations]
- Chen C*, Xing D*, Tan L*, Li H*, Zhou G, Huang L, Xie XS† (2017) Single-cell whole-genome analyses by Linear Amplification via Transposon Insertion (LIANTI). Science, 356:189-194.
[PMID: 28408603]
[Citations]
- Mallick S*, Li H*, Lipson M*, Mathieson I*, Gymrek M, Racimo F, …, Reich D† (2016) The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature, 538:201-206.
[PMID: 27654912]
[Citations]
- Li H (2016) Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics, 32:2103-2110.
[PMID: 27153593]
[Citations]
- Li H (2016) BGT: efficient and flexible genotype query across many samples. Bioinformatics, 32:590-592.
[PMID: 26500154]
[Citations]
- The 1000 Genomes Project Consortium (2015) A global reference for human genetic variation. Nature, 526:68-74.
[PMID: 26432245]
[Citations]
- Li H (2015) FermiKit: assembly-based variant calling for Illumina resequencing data. Bioinformatics, 31:3694-3696.
[PMID: 26220959]
[Citations]
- Li H (2015) BFC: correcting Illumina sequencing errors. Bioinformatics, 31:2885-2887.
[PMID: 25953801]
[Citations]
- Palkopoulou E, Mallick S, Skoglund P, Enk J, Rohland N, Li H, Omrak A, Vartanyan S, Poinar H, Götherström A, …, Dalén L† (2015) Complete genomes reveal signatures of demographic and genetic declines in the woolly mammoth. Curr Biol, 25:1395-1400.
[PMID: 25913407]
[Citations]
- Do R, Balick D, Li H, Adzhubei I, Sunyaev S†, Reich D† (2014) No evidence that selection has been less effective at removing deleterious mutations in Europeans than in Africans. Nat Genet, 47:126-131.
[PMID: 25581429]
[Citations]
- Fu Q†, Li H, Moorjani P, Jay F, Slepchenko SM, …, Reich D†, Kelso J†, Viola TB†, Pääbo S (2014) Genome sequence of a 45,000-year-old modern human from western Siberia. Nature, 514:445-449.
[PMID: 25341783]
[Citations]
- Li H (2014) Fast construction of FM-index for long sequence reads. Bioinformatics, 30:3274-3275.
[PMID: 25107872]
[Citations]
- Li H (2014) Towards better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics, 30:2843-2851.
[PMID: 24974202]
[Citations]
- Prüfer K, Racimo F, Patterson N, Jay F, …, Li H, …, Slatkin M†, Reich D†, Kelso J, Pääbo S† (2014) The complete genome sequence of a Neanderthal from the Altai Mountains. Nature, 505:43-49.
[PMID: 24352235]
[Citations]
- Li H (2012) Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly. Bioinformatics, 28:1838-1844.
[PMID: 22569178]
[Citations]
- Li H (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics, 27:2987-2993.
[PMID: 21903627]
[Citations]
- Li H†, Durbin R† (2011) Inference of human population history from individual whole-genome sequences. Nature, 475:493-496.
[PMID: 21753753]
[Citations]
- Li H (2011) Improving SNP discovery by base alignment quality. Bioinformatics, 27:1157-1158.
[PMID: 21320865]
[Citations]
- Li H (2011) Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics, 27:718-719.
[PMID: 21208982]
[Citations]
- Li H†, Homer N (2010) A survey of sequence alignment algorithms for next-generation sequencing. Brief Bioinform, 11:473-483.
[PMID: 20460430]
[Citations]
- Green RE*,†, Krause J*, Briggs AW*, Maricic T*, Stenzel U*, Kircher M*, Patterson N*, Li H, …, Reich D†, Pääbo S† (2010) A draft sequence of the Neandertal genome. Science, 328:680-684.
[PMID: 20448178]
[Citations]
- Li H, Durbin R† (2010) Fast and accurate long-read alignment with Burrows-Wheeler Transform. Bioinformatics, 26:589-595.
[PMID: 20080505]
[Citations]
- Li H, Durbin R† (2009) Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics, 25:1754-1760.
[PMID: 19451168]
[Citations]
- Li H*, Handsaker B*, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R† (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25:2078-2079.
[PMID: 19505943]
[Citations]
- Li H, Ruan J, Durbin R (2008) Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res, 18:1851-1858.
[PMID: 18714091]
[Citations]
- Ruan J*, Li H*, Chen Z, Coghlan A, Coin LJ, …, Wang J†, Durbin R† (2008) TreeFam: 2008 update. Nucleic Acids Res, 36:D735-740.
[PMID: 18056084]
[Citations]
- Li H*, Guan L*, Liu T*, Zheng W, Wong G†, Wang J† (2007) A cross-species alignment tool (CAT). BMC Bioinformatics, 8:439.
[PMID: 17880681]
[Citations]
- Li H, Coghlan A, Ruan J, Coin LJ, Hériché JK, …, Durbin R† (2006) TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res, 34:D572-580.
[PMID: 16381935]
[Citations]
- Li H*, Liu J*, Xu Z*, Jin J, Fang L, …, Hao B-L† (2005) Test data sets and evaluation of gene prediction programs on the rice genome. J Comput Sci & Technol, 20:446-453.
[Citations]