Faculty Profile

Tandy Warnow

Computer Science
Associate Head of Department
  • Department Head Office
  • Department Office
3235 Siebel Center for Comp Sci
201 N. Goodwin Ave.
Urbana Illinois 61801
(217) 300-3087
Founder Professor of Engineering

Affiliation

  • Department Head Office
  • Department Office

Primary Research Area

  • Bioinformatics and Computational Biology

Education

  • BA Mathematics, The University of California at Berkeley, 1984
  • PhD Mathematics, The University of California at Berkeley, 1991

Biography

Tandy received her PhD in Mathematics at UC Berkeley (1991) under the direction of Gene Lawler, and did postdoctoral training with Simon Tavare and Michael Waterman at USC (1991-1992). After spending a year in the Discrete Algorithms Group at Sandia National Laboratories in Albuquerque, NM, she joined the Computer and Information Sciences Department faculty at the University of Pennsylvania. Tandy joined the faculty at the University of Texas in 1998, where she was the David Bruton Jr. Centennial Professor of Computer Science. She is now a member of two departments at the University of Illinois, Computer Science and Bioengineering, where she is the Founder Professor of Engineering. She received the National Science Foundation Young Investigator Award in 1994, the David and Lucile Packard Foundation Award in Science and Engineering in 1996, a Radcliffe Institute Fellowship in 2006, and a Guggenheim Foundation Fellowship for 2011. She was elected a Fellow of the Association for Computing Machinery (ACM) in 2016 and of the International Society for Computational Biology (ISCB) in 2017.

Academic Positions

  • The University of Illinois at Urbana-Champaign, Founder Professor of Computer Science, 2014-present. Associate Chair for Computer Science 2017-present. Affiliate in the departments of Mathematics, Electrical and Computer Engineering, Statistics, Plant Biology, Animal Biology, and Entomology. Affiliate in the National Center for Supercomputing Applications (NCSA), Coordinated Sciences Laboratory (CSL), and member of the Carl R. Woese Institute for Genomic Biology (IGB). Affiliate in the PEEC program.
  • The University of Texas at Austin, David Bruton Jr. Professor of Computer Science, 2003-2014
  • The University of Texas at Austin, Associate Professor of Computer Science, 1999-2003
  • The University of Pennsylvania, Associate Professor of Computer and Information Sciences, 1998-1999
  • The University of Pennsylvania, Assistant Professor of Computer and Information Sciences, 1993-1998

Teaching Statement

At the undergraduate level, I teach courses in discrete mathematics and algorithm design and analysis, and use problems from computational biology to demonstrate how these skills and techniques apply to real-world problems. At the graduate level, I teach CS/BioE 598AGB: Algorithmic Genomic Biology. The main focus of 598AGB is phylogeny (evolutionary tree) estimation, but the course also covers the related problems of computing multiple sequence alignments, genome assembly, and analyzing microbiomes. Students will learn the mathematical and computational foundations in these areas, read the current literature, and do a team research project. The techniques involved include discrete algorithms, graph theory, simulations, and probabilistic analysis of algorithms. Course website: http://tandy.cs.illinois.edu/598-2016.html

Course Development

  • BIO 540/ CS 581 Algorithmic Genomic Biology. Course website: http://tandy.cs.illinois.edu/581-2017.html. Course description: The purpose of the course is to give each student enough background and training in the area of algorithmic genomic biology so that you will be able to do research in this area, and publish papers. Every year, two or more students from this course have done final projects that were subsequently published in major scientific journals; you can be one of them! The main focus of the course is on phylogeny (evolutionary tree) estimation, multiple sequence alignment, and genome-scale phylogenetics, which are problems that present very interesting challenges from a computational and statistical standpoint. Time permitting, we will also discuss computational problems in microbiome analysis, protein function and structure prediction, genome assembly, and even historical linguistics. Students will learn the mathematical and computational foundations in these areas, read the current literature, and do a team research project. The course is designed for doctoral students in computer science, computer engineering, bioengineering, mathematics, and statistics, and does not depend on any prior background in biology. The technical material will depend on discrete algorithms, graph theory, simulations, and probabilistic analysis of algorithms.

Research Statement

My research combines mathematics, computer science, probability, and statistics to develop algorithms with improved accuracy for large-scale and complex estimation problems in phylogenomics and metagenomics. My major interests include multiple sequence alignment, phylogeny estimation (both gene trees and species trees), and metagenomic analysis, but I also work in historical linguistics. My current work aims to develop methods for ultra-large datasets (anywhere from 10,000 to 1,000,000 sequences), including datasets that are highly fragmentary and present other real-world challenges. We use real data and perform massive simulations to evaluate the performance of the methods we develop, and we collaborate closely with biologists and linguists in data analysis.

Graduate Research Opportunities

My research is currently focused on four topics, and all have multiple open questions where graduate students (MS or PhD) would be helpful. Although the topics are described in terms of biological or linguistic data, the research is to develop novel computational methods that provide excellent accuracy and scalability. All these problems have deep mathematical foundations, and almost all involve NP-hard statistical estimation problems. There are opportunities to prove theorems (if you are mathematically inclined), develop and implement heuristics for NP-hard problems, develop parallel implementations of methods, and analyze datasets. Graduate students should be strong programmers, and mathematical intuition is very helpful as well.

No background is needed in biology or linguistics!!

1. Phylogenomics - estimation of species trees and/or phylogenetic networks from multiple loci. The main focus is on combining gene trees that exhibit conflict due to incomplete lineage sorting or horizontal gene transfer (or even duplication and loss).

2. Metagenomics - taxonomic characterization of very short sequences sampled from environmental samples.

3. Multiple sequence alignment - especially for very large datasets (up to one million sequences) that are highly fragmentary.

4. Historical linguistics - analysis of Indo-European and other language families.

Research Interests

  • Big Data
  • Computational Historical Linguistics
  • Metagenomics
  • Phylogenomics
  • Multiple sequence alignment
  • Phylogenetic tree estimation
  • Discrete and graph-theoretic algorithms

Books Authored or Co-Authored (Original Editions)

  • Computational Phylogenetics: An introduction to designing methods for phylogeny estimation (in press, Cambridge University Press)

Selected Articles in Journals

  • P. Vachaspati and T. Warnow (2016). FastRFS: Fast and Accurate Robinson-Foulds Supertrees using Constrained Exact Optimization. Bioinformatics 2016; doi: 10.1093/bioinformatics/btw600. (Special issue for papers from RECOMB-CG)
  • L. Uricchio, T. Warnow, and N. Rosenberg (2016). An analytical upper bound on the number of loci required for all splits of a species tree to appear in a set of gene trees. BMC Bioinformatics, 17 (Suppl 14): 1266, special issue for RECOMB-CG.
  • M. Nute and T. Warnow (2016). Scaling statistical multiple sequence alignment to large datasets. BMC Genomics 17(Suppl 10): 764, special issue for RECOMB-CG.
  • N. Nguyen, M. Nute, S. Mirarab, and T. Warnow (2016). HIPPI: Highly accurate protein family classification with ensembles of HMMs. BMC Genomics 17 (Suppl 10):765, special issue for RECOMB-CG.
  • T. Hansen, S. Mollerup, N. Nguyen, L. Vinner, N. White, M. Coghlan, D. Alquezar-Planas, T. Joshi, R. Jensen, H. Fridholm, K. Kjaransdottir, T. Mourier, T. Warnow, G. Belsham, T. Gilbert, L. Orlando, M. Bunce, E. Willerslev, L. Nielsen, and A. Hansen (2016). High diversity of picornaviruses in rats from different continents revealed by deep sequencing. Emerging Microbes & Infections 5, e90, doi:10.1038/emi.2016.90.
  • N. Nguyen, T. Warnow, M. Pop, and B. White. A perspective on 16S rRNA operational taxonomic unit clustering using sequence similarity. npj Biofilms and Microbiomes 2, Article number 16004 (2016), doi:10.1038/npjbiofilms.2016.4.
  • J.E. Tarver, M. d. Reis, S. Mirarab, R. J. Moran, S. Parker, J.E. O'Reilly, B.L. King, M.J. O'Connell, R.J. Asher, T. Warnow, K. J. Peterson, P.C.J. Donoghue, and D. Pisani. The interrelationships of placental mammals and the limits of phylogenetic inference. Genome Biology and Evolution, doi:10.1093/gbe/evv261.
  • S. Mirarab, Md. S. Bayzid, B. Boussau, and T. Warnow. Response to Comment on "Statistical binning enables an accurate coalescent-based estimation of the avian tree". Science, 2015, volume 350, number 6257, p. 171, DOI: 10.1126/science.aaa7719.
  • P. Vachaspati and T. Warnow. ASTRID: Accurate Species TRees from Internode Distances. BMC Genomics, 2015, 16 (Suppl 10): S3.
  • J. Chou, A. Gupta, S. Yaduvanshi, R. Davidson, M. Nute, S. Mirarab and T. Warnow. A comparative study of SVDquartets and other coalescent-based species tree estimation methods. BMC Genomics, 2015, 16 (Suppl 10): S2.
  • R. Davidson, P. Vachaspati, S. Mirarab, and T. Warnow. Phylogenomic species tree estimation in the presence of incomplete lineage sorting and horizontal gene transfer. BMC Genomics, 2015, 16 (Suppl 10): S1.
  • T. Warnow. "Concatenation analyses in the presence of incomplete lineage sorting", PLOS Currents: Tree of Life 2015 May 22. Edition 1. doi: 10.1371/currents.tol.8d41ac0f13d1abedf4c4a59f5d17b1f7
  • S. Roch and T. Warnow. "On the robustness to gene tree estimation error (or lack thereof) of coalescent-based species tree methods", Systematic Biology, 64(4):663-676, 2015
  • S. Mirarab and T. Warnow. "ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes", Proceedings Intelligent Systems for Molecular Biology (ISMB) 2015, and Bioinformatics 2015 31 (12): i44-i52 doi: 10.1093/bioinformatics/btv234
  • M. S. Bayzid, S. Mirarab, B. Boussau, and T. Warnow. "Weighted Statistical Binning: enabling statistically consistent genome-scale phylogenetic analyses", PLOS One, 2015, DOI: 10.1371/journal.pone.0129183.
  • N. Nguyen, S. Mirarab, K. Kumar, and T. Warnow, "Ultra-large alignments using phylogeny aware profiles". Genome Biology (2015) 16:124 doi: 10.1186/s13059-015-0688-z
  • E. D. Jarvis, S. Mirarab, A. J. Aberer, B. Li, P. Houde, C. Li, S. Y. W. Ho, B. C. Faircloth, B. Nabholz, J. T. Howard, A. Suh, C. C. Weber, R. R. da Fonseca, J. Li, F. Zhang, H. Li, L. Zhou, N. Narula, L. Liu, G. Ganapathy, B. Boussau, Md. S. Bayzid, V. Zavidovych, S. Subramanian, T. Gabaldon, S. Capella-Gutierrez, J. Huerta-Cepas, B. Rekepalli, K. Munch, M. Schierup, B. Lindow, W. C. Warren, D. Ray, R. E. Green, M. W. Bruford, X. Zhan, A. Dixon, S. Li, N. Li, Y. Huang, E. P. Derryberry, M. F. Bertelsen, F. H. Sheldon, R. T. Brumfield, C. V. Mello, P. V. Lovell, M. Wirthlin, M. P. C. Schneider, F. Prosdocimi, J. A. Samaniego, A. M. V. Velazquez, A. Alfaro-Nunez, P. F. Campos, B. Petersen, T. Sicheritz-Ponten, A. Pas, T. Bailey, P. Scofield, M. Bunce, D. M. Lambert, Q. Zhou, P. Perelman, A. C. Driskell, B. Shapiro, Z. Xiong, Y. Zeng, S. Liu, Z. Li, B. Liu, K. Wu, J. Xiao, X. Yinqi, Q. Zheng, Y. Zhang, H. Yang, J. Wang, L. Smeds, F. E. Rheindt, M. Braun, J. Fjeldsa, L. Orlando, F. K. Barker, K. A. Jonsson, W. Johnson, K.-P. Koepfli, S. O'Brien, D. Haussler, O. A. Ryder, C. Rahbek, E. Willerslev, G. R. Graves, T. C. Glenn, J. McCormack, D. Burt, H. Ellegren, P. Alstrom, S. V. Edwards, A. Stamatakis, D. P. Mindell, J. Cracraft, E. L. Braun, T. Warnow, W. Jun, M. T. P. Gilbert, and G. Zhang. "Whole-genome analyses resolve early branches in the tree of life of modern birds." Science 12 December 2014: 1320-1331
  • S. Mirarab, Md. S. Bayzid, B. Boussau, and T. Warnow. "Statistical binning enables an accurate coalescent-based estimation of the avian tree". Science, 12 December 2014: 1250463.
  • S. Mirarab, N. Nguyen, S. Guo, L.-S. Wang, J. Kim, and T. Warnow. "PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences." Journal of Computational Biology. December 2014 (ahead of print)
  • N. Nguyen, S. Mirarab, B. Liu, M. Pop, and T. Warnow. "TIPP: Taxonomic Identification and Phylogenetic Profiling." Bioinformatics, 2014; doi: 10.1093/bioinformatics/btu721
  • N. Wickett, S. Mirarab, N. Nguyen, T. Warnow, et al. (37 authors). "Phylotranscriptomic analysis of the origin and diversification of land plants." Proceedings of the National Academy of Sciences (PNAS), doi: 10.1073/pnas.1323926111
  • S. Mirarab, Md S. Bayzid, and T. Warnow. "Evaluating summary methods for multi-locus species tree estimation in the presence of incomplete lineage sorting". Systematic Biology, doi: 10.1093/sysbio/syu063
  • T. Zimmermann, S. Mirarab and T. Warnow. "BBCA: Improving the scalability of *BEAST using random binning". BMC Genomics 2014, 15(Suppl 6): S11
  • Md S. Bayzid, T. Hunt, and T. Warnow. "Disk Covering Methods Improve Phylogenomic Analyses". BMC Genomics 2014, 15(Suppl 6): S7
  • S. Mirarab, R. Reaz, Md. S. Bayzid, T. Zimmermann, M.S. Swenson, and T. Warnow. "ASTRAL: Genome-Scale Coalescent-Based Species Tree Estimation." Bioinformatics 2014 30 (17): i541-i548. doi: 10.1093/bioinformatics/btu462.

Journal Editorships

  • Associate Editor, Journal of Graph Algorithms and Applications

Conferences Organized or Chaired

  • Co-organizer, Next Generation Sequencing - Algorithms, and Software For Biomedical Applications, Dagstuhl Seminar, August 28 to September 2, 2016

Service on Department Committees

  • Area Chair, Bioinformatics and Computational Biology group, CS department, 2015-2017

Honors

  • Fellow of the International Society for Computational Biology (2017)
  • Fellow of the ACM (Association for Computing Machinery) (2016)
  • John Simon Guggenheim Memorial Fellowship (2010)
  • Radcliffe Institute for Advanced Study (2003)
  • David and Lucile Packard Foundation Fellowship in Science and Engineering (1996)
  • National Science Foundation Young Investigator Award (1994)