TreeRipper web application: Towards a fully automated optical tree recognition software

8Citations
Citations of this article
43Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: Relationships between species, genes and genomes have been printed as trees for over a century. Whilst this may have been the best format for exchanging and sharing phylogenetic hypotheses during the 20thcentury, the worldwide web now provides faster and automated ways of transferring and sharing phylogenetic knowledge. However, novel software is needed to defrost these published phylogenies for the 21stcentury.Results: TreeRipper is a simple website for the fully-automated recognition of multifurcating phylogenetic trees (http://linnaeus.zoology.gla.ac.uk/~jhughes/treeripper/). The program accepts a range of input image formats (PNG, JPG/JPEG or GIF). The underlying command line c++ program follows a number of cleaning steps to detect lines, remove node labels, patch-up broken lines and corners and detect line edges. The edge contour is then determined to detect the branch length, tip label positions and the topology of the tree. Optical Character Recognition (OCR) is used to convert the tip labels into text with the freely available tesseract-ocr software. 32% of images meeting the prerequisites for TreeRipper were successfully recognised, the largest tree had 115 leaves.Conclusions: Despite the diversity of ways phylogenies have been illustrated making the design of a fully automated tree recognition software difficult, TreeRipper is a step towards automating the digitization of past phylogenies. We also provide a dataset of 100 tree images and associated tree files for training and/or benchmarking future software. TreeRipper is an open source project licensed under the GNU General Public Licence v3. © 2011 Hughes; licensee BioMed Central Ltd.

Cite

CITATION STYLE

APA

Hughes, J. (2011). TreeRipper web application: Towards a fully automated optical tree recognition software. BMC Bioinformatics, 12. https://doi.org/10.1186/1471-2105-12-178

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free