Abstract
We systematically study the computational complexity of a broad class of problems in phylogenetic reconstruction. The class contains, for example, the rooted triple consistency problem, forbidden subtree problems, the quartet consistency problem, and many other problems studied in the bioinformatics literature. The studied problems can be described as constraint satisfaction problems where the constraints have a first-order definition over the rooted triple relation. We show that every such phylogeny problem can be solved in polynomial time or is NP-complete. On the algorithmic side, we generalize a well-known polynomial-time algorithm of Aho, Sagiv, Szymanski, and Ullman for the rooted triple consistency problem. Our algorithm repeatedly solves linear equation systems to construct a solution in polynomial time. We then show that every phylogeny problem that cannot be solved by our algorithm is NP-complete. Our classification establishes a dichotomy for a large class of infinite structures that we believe is of independent interest in universal algebra, model theory, and topology. The proof of our main result combines results and techniques from various research areas: a recent classification of the model-complete cores of the reducts of the homogeneous binary branching C-relation, Leeb’s Ramsey theorem for rooted trees, and universal algebra.
