How to use phyloT
phyloT automatically generates phylogenetic trees based on the full NCBI taxonomy. After parsing the complete NCBI taxonomy database, phyloT will generate a pruned tree in the selected format, based on the tree elements you provide. Tree elements can be typed or pasted into the provided input box, or uploaded in a plain text file. You can use any combination of the following element types, separated by commas or newlines:
- NCBI taxonomy IDs: they will appear as leaves in the generated tree, regardless of their taxonomic class. Classes can be mixed freely, i.e. one ID can represent a species and another a phylum.
- NCBI scientific names: make sure you provide the proper NCBI scientific name (including proper capitalization). For example, human should be specified as "Homo sapiens". Otherwise, they function exactly the same as NCBI taxonomy IDs. Please note that names containing special characters are excluded from our database; use numeric taxonomy IDs for those instead. You can also use the taxonomy names search box, described below.
- UniProt species identification codes: codes should be in capital letters, exactly as listed in UniProt speclist.txt. They will be replaced by the corresponding NCBI taxonomy ID or scientific name.
- UniProt protein IDs or ACCs: proteins will appear as leaves in the generated tree, while the clades will be NCBI taxonomy IDs. If multiple IDs map to the same species, they will all be grouped into a multifurcating clade.
- NCBI GenBank nucleotide ACCs: they function the same as UniProt IDs.
Click any of the 'Example' buttons below the main input field to see various example tree element combinations.
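If you prepare your input programmatically, the element rules above can be sketched in a few lines of Python. This is only an illustration: the helper name build_tree_elements is hypothetical and not part of phyloT, and dropping blank or repeated entries is an assumption made for tidiness, not documented phyloT behavior.

```python
def build_tree_elements(elements, separator="\n"):
    """Join tree elements into a single input string.

    Hypothetical helper: blank entries and exact duplicates are dropped,
    which is a convenience assumption, not a phyloT requirement.
    """
    if separator not in (",", "\n"):
        raise ValueError("elements must be separated by commas or newlines")
    seen = set()
    cleaned = []
    for element in elements:
        text = str(element).strip()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return separator.join(cleaned)


# Mixed element types: a taxonomy ID, a scientific name and a UniProt species code.
elements = [9606, "Mus musculus", "DROME", 9606]
input_text = build_tree_elements(elements)
```

The resulting string can be pasted into the input box or saved to a plain text file for upload.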
Searching for an NCBI scientific name
Use the 'Search taxonomy' box to quickly search the full database of NCBI taxonomy names. Each result will show the full scientific name with its taxonomic class and NCBI taxonomy ID. Clicking on any result will append its tax ID to the current tree elements.
Including full clades
phyloT supports generation of full clade trees for any NCBI taxonomy ID or scientific name. To include the complete sub-tree for an NCBI taxonomy ID (or scientific name), simply append a vertical line followed by the keyword "subtree" to the input element. For example:
Mammalia|subtree
7147|subtree
This example input would generate a tree containing 2 clades: Mammalia and Insecta (NCBI Tax ID 7147). All NCBI nodes belonging to these clades would be included.
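When input is generated by a script, the suffix can be added with a small helper. The sketch below is illustrative only; the name as_full_clade is hypothetical, and the only thing taken from this guide is the "|subtree" keyword itself.

```python
SUBTREE_SUFFIX = "|subtree"


def as_full_clade(element):
    """Return the element marked for full clade inclusion (hypothetical helper)."""
    text = str(element).strip()
    if text.endswith(SUBTREE_SUFFIX):
        return text  # already marked, do not append the keyword twice
    return text + SUBTREE_SUFFIX


# Works for scientific names and numeric taxonomy IDs alike.
clade_elements = [as_full_clade(e) for e in ("Mammalia", 7147)]
```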
In the example above, the final tree would contain ~33 000 nodes, making it hard to manipulate or visualize. phyloT offers two mechanisms for filtering nodes from generated trees:
Interrupting at a specific class
Using the "Interrupt at" selector, you can specify a taxonomic class at which the tree generation will be stopped. For example, if "Genus" is selected, leaves of the included clades will correspond to nodes with taxonomic class genus. Note that this applies only to full clades (i.e. elements appended with the keyword "|subtree"). You can still include additional elements of more specific taxonomic classes (below the selected class), and these will be present in the tree. For example:
Mammalia|subtree
7147|subtree
Escherichia_coli
In the example above, with "Interrupt at" set to Genus, Escherichia coli would still appear as a regular leaf in the tree, even though its class is species (i.e. more specific than genus).
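The effect of interruption can be illustrated on a toy taxonomy. The sketch below is not phyloT's implementation: the dictionary layout, the function name leaves_after_interrupt and the deliberately incomplete Hominidae data are all assumptions chosen for illustration.

```python
# Toy taxonomy (incomplete on purpose): name -> (taxonomic class, child names).
TOY_TAXONOMY = {
    "Hominidae": ("family", ["Homo", "Pan"]),
    "Homo": ("genus", ["Homo sapiens"]),
    "Pan": ("genus", ["Pan troglodytes", "Pan paniscus"]),
    "Homo sapiens": ("species", []),
    "Pan troglodytes": ("species", []),
    "Pan paniscus": ("species", []),
}


def leaves_after_interrupt(taxonomy, root, stop_class):
    """Collect the leaves of a full clade when descent stops at stop_class."""
    node_class, children = taxonomy[root]
    if node_class == stop_class or not children:
        return [root]  # this node becomes a leaf; nothing below it is visited
    leaves = []
    for child in children:
        leaves.extend(leaves_after_interrupt(taxonomy, child, stop_class))
    return leaves


# Full clade interrupted at genus, plus one explicitly listed species.
clade_leaves = leaves_after_interrupt(TOY_TAXONOMY, "Hominidae", "genus")
explicit_elements = ["Escherichia coli"]
all_leaves = clade_leaves + explicit_elements
```

Explicitly listed elements bypass the interruption in this sketch, mirroring the Escherichia coli case described above.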
Removing nodes matching a text pattern
You can remove all nodes whose scientific names match a certain pattern. Simply type the words and phrases which should be filtered into the Filtering field, separated with commas.
For example, the full Ascomycota phylum tree (Ascomycota|subtree) would contain ~55 000 nodes. However, if various unclassified species and environmental samples are removed, only ~3500 nodes remain. This can be accomplished by entering, for example, "environmental_sample,unclassified" into the Filtering field.
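The filtering idea can be approximated offline as follows. This is a sketch under stated assumptions: phyloT's exact matching rules are not described in this guide, so the code assumes case-insensitive substring matching and treats underscores in a pattern as spaces. The function name remove_matching is hypothetical.

```python
def remove_matching(names, filter_text):
    """Drop names that contain any of the comma-separated patterns.

    Assumptions (not confirmed phyloT behavior): matching is a
    case-insensitive substring test, and "_" in a pattern stands for a space.
    """
    patterns = [
        part.strip().replace("_", " ").lower()
        for part in filter_text.split(",")
        if part.strip()
    ]
    kept = []
    for name in names:
        lowered = name.lower()
        if not any(pattern in lowered for pattern in patterns):
            kept.append(name)
    return kept


names = [
    "Saccharomyces cerevisiae",
    "unclassified Ascomycota",
    "environmental samples",
]
remaining = remove_matching(names, "environmental_sample,unclassified")
```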
When the identifier format is set to "NCBI Taxonomy IDs", all internal nodes of the tree will be prefixed with the keyword "INT" (for example, node 7147 will be labeled as INT7147). This is done to prevent various tree parsers from misidentifying these IDs as tree support values (bootstraps).
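If downstream code needs the numeric IDs of internal nodes, the prefix can be handled with a regular expression. The sketch below assumes internal labels look exactly like "INT" followed by digits, as in the INT7147 example; the Newick string and both function names are made up for illustration.

```python
import re

# Matches labels such as INT7147 and captures the numeric taxonomy ID.
INTERNAL_ID = re.compile(r"\bINT(\d+)\b")


def internal_tax_ids(newick):
    """Return the taxonomy IDs of all INT-prefixed internal nodes."""
    return [int(tax_id) for tax_id in INTERNAL_ID.findall(newick)]


def strip_internal_prefix(newick):
    """Remove the INT prefix, leaving bare numeric internal labels."""
    return INTERNAL_ID.sub(r"\1", newick)


# Illustrative tree: two leaf IDs grouped under internal node 7147.
example_newick = "(7227,7165)INT7147;"
ids = internal_tax_ids(example_newick)
stripped = strip_internal_prefix(example_newick)
```

Note that stripping the prefix reintroduces the ambiguity it was meant to avoid, so only do this when your parser is known not to treat internal labels as support values.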
Collapsing internal nodes
Due to the nature of the underlying NCBI taxonomy data and depending on the provided tree elements, generated trees will often have many internal nodes which have only one child. phyloT therefore offers an option to remove such nodes, by setting the "Internal nodes" option to "Collapsed".
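Collapsing can be pictured with a short recursive sketch. It is not phyloT's code: the (label, children) tuple representation and the function name collapse_single_child are assumptions, and whether phyloT also collapses a single-child root is not stated in this guide.

```python
def collapse_single_child(node):
    """Remove internal nodes that have exactly one child.

    A node is a (label, children) tuple; leaves have an empty child list.
    In this sketch, a single-child node is replaced by its only child.
    """
    label, children = node
    children = [collapse_single_child(child) for child in children]
    if len(children) == 1:
        return children[0]
    return (label, children)


# "A" and "C" each have a single child and disappear after collapsing.
tree = ("root", [
    ("A", [("B", [("x", []), ("y", [])])]),
    ("C", [("z", [])]),
])
collapsed = collapse_single_child(tree)
```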
Forcing creation of binary trees
Many nodes in NCBI taxonomy are highly polytomic (have many child branches). If your tree visualization or analysis software requires a strictly binary tree, where each node must have exactly two children, you can set the "Polytomy" option to "No". phyloT will then randomly combine multifurcating nodes into separate bifurcating structures and introduce additional ("fake") internal nodes as required, producing a proper binary tree.
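One simple way to resolve polytomies at random is to repeatedly pick two children and group them under a new internal node until only two remain. The sketch below illustrates that general idea and should not be read as phyloT's actual procedure; the tuple representation, the "_fake" label scheme and the function names are hypothetical.

```python
import random


def resolve_polytomies(node, rng):
    """Randomly resolve nodes with more than two children into bifurcations.

    A node is a (label, children) tuple. New internal nodes receive
    made-up labels of the form "<parent>_fakeN".
    """
    label, children = node
    children = [resolve_polytomies(child, rng) for child in children]
    fake_count = 0
    while len(children) > 2:
        first = children.pop(rng.randrange(len(children)))
        second = children.pop(rng.randrange(len(children)))
        fake_count += 1
        children.append((f"{label}_fake{fake_count}", [first, second]))
    return (label, children)


def is_binary(node):
    """True if every node has either zero or exactly two children."""
    _, children = node
    return len(children) in (0, 2) and all(is_binary(c) for c in children)


def leaf_labels(node):
    """Return the set of leaf labels below a node."""
    label, children = node
    if not children:
        return {label}
    return set().union(*(leaf_labels(c) for c in children))


polytomy = ("root", [("a", []), ("b", []), ("c", []), ("d", []), ("e", [])])
binary_tree = resolve_polytomies(polytomy, random.Random(0))
```

Passing a seeded random.Random instance makes the otherwise random grouping reproducible. The leaves are preserved; only the internal structure changes.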
Tree format and file name
In addition to the commonly used Newick format, you can also download the generated trees in NEXUS or phyloXML formats. If a file name is not provided when generating a tree, a randomly generated string will be used.
Visualizing generated trees in iTOL
If you only want to visualize the generated tree, phyloT offers a direct link to iTOL: interactive Tree Of Life. Simply click the "Visualize in iTOL" button. iTOL is an online phylogenetic tree visualization tool, offering powerful annotation features. Check the iTOL website for more details.
The phyloT tree generation engine is also available through iTOL's batch access API, allowing simple programmatic access. Check the iTOL batch access help for more details.