phyloT help

phyloT automatically generates phylogenetic trees based on the NCBI taxonomy or the Genome Taxonomy Database (GTDb). The NCBI taxonomy attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources, and phyloT trees that use NCBI as a source simply represent the current taxonomic structure of the NCBI taxonomy database. They are not "proper" phylogenetic trees, and do not contain branch lengths or clade support values. phyloT trees that use GTDb as a source, on the other hand, are pruned versions of the GTDb reference trees (either Bacterial or Archaeal), and therefore contain branch lengths and clade support values.

Please check the data source web sites for detailed information on their methodology:

How to use phyloT

After parsing the complete NCBI taxonomy or GTDb, phyloT will generate a pruned tree in the selected format, based on the tree elements you provide. Tree elements can be typed or pasted into the provided input box ("Tree elements"), or uploaded in a plain text file. Depending on the taxonomy source selected, you can use any combination of the following element types, separated by commas or newlines:

NCBI Taxonomy:

Genome Taxonomy Database:

Click any of the Example buttons below the main input field to see various example tree element combinations.
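
The comma-or-newline separation of tree elements described above can be illustrated with a short Python sketch. This is not phyloT's own parser; the function name split_tree_elements is hypothetical, and the sketch assumes that surrounding whitespace and empty entries are ignored.

```python
import re


def split_tree_elements(text):
    """Split raw input on commas or newlines, dropping empty entries.

    Assumption: whitespace around an element is not significant.
    """
    parts = re.split(r"[,\n]", text)
    return [part.strip() for part in parts if part.strip()]


raw_input_text = "Mammalia|subtree, 7147|subtree\nEscherichia_coli\n"
print(split_tree_elements(raw_input_text))
# ['Mammalia|subtree', '7147|subtree', 'Escherichia_coli']
```

A list prepared this way could be pasted into the "Tree elements" box or saved to a plain text file for upload.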

Searching for a taxonomic name

Use the 'Search taxonomy' box to quickly search the full database of NCBI taxonomy or GTDb names. Each result will show the full scientific name with its taxonomic class and taxonomy ID. Clicking on any result will append its tax ID to the current tree elements.

Including full clades

phyloT supports generation of full clade trees for any taxonomy ID or scientific name. To include the complete sub-tree for a taxonomy ID (or scientific name), simply append a vertical bar (|) followed by the keyword "subtree" to the input element.

For example:

Mammalia|subtree
7147|subtree

The example input above would generate a tree containing 2 clades: Mammalia and Insecta (NCBI Tax ID 7147). All NCBI nodes belonging to these clades would be included.
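
To make the idea of a full clade concrete, here is a minimal Python sketch that collects a node and all of its descendants from a parent-to-children map. The toy taxonomy and the function name subtree_nodes are hypothetical and only illustrate the concept; they say nothing about how phyloT stores or traverses the real taxonomy.

```python
def subtree_nodes(children, root):
    """Return root plus all of its descendants, depth-first."""
    nodes = []
    stack = [root]
    while stack:
        node = stack.pop()
        nodes.append(node)
        # reversed() keeps siblings in their listed order in the output
        stack.extend(reversed(children.get(node, [])))
    return nodes


# Toy parent -> children map (a tiny, made-up slice of a taxonomy)
toy_taxonomy = {
    "Mammalia": ["Primates", "Rodentia"],
    "Primates": ["Homo"],
    "Homo": ["Homo sapiens"],
}

print(subtree_nodes(toy_taxonomy, "Mammalia"))
# ['Mammalia', 'Primates', 'Homo', 'Homo sapiens', 'Rodentia']
```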

Filtering nodes

In the example above, the final tree would contain ~33 000 nodes, making it hard to manipulate or visualize. phyloT offers two mechanisms for filtering nodes from generated trees:

  1. Interrupting at a specific class

    Using the "Interrupt at" selector, you can specify a taxonomic class where the tree generation will be stopped. For example, if "Genus" is selected, leaves of the included clades will correspond to nodes with taxonomic class genus. Note that this applies only to full clades (i.e. nodes appended with the keyword "|subtree"). You can still include additional elements corresponding to "higher" (i.e. more specific) taxonomic classes, and these will be present in the tree.

    For example:

    Mammalia|subtree
    7147|subtree
    Escherichia_coli
    

    In the example above, with "Interrupt at" set to Genus, Escherichia coli would still appear as a regular leaf in the tree, even though its class is species (i.e. 'higher', or more specific, than genus).

  2. Removing nodes matching a text pattern

    You can remove all nodes whose scientific names match a certain pattern. Simply type the words and phrases which should be filtered into the Filtering field, separated with commas.

    For example:

    Ascomycota|subtree
    

    In the example above, the full Ascomycota phylum tree would contain ~55 000 nodes. However, if various unclassified species and environmental samples are removed, only ~3500 nodes remain. This can be accomplished by entering, for example, "environmental_sample,unclassified" into the Filtering field.
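
The text-pattern filter can be sketched in Python as follows. The matching rules here are assumptions made for illustration only: the sketch treats patterns as case-insensitive substrings and treats underscores as spaces, which may differ from phyloT's actual behaviour. The function name filter_names and the sample names are hypothetical.

```python
def filter_names(names, filter_field):
    """Drop names that contain any of the comma-separated patterns.

    Assumptions: matching is a case-insensitive substring test, and
    underscores in patterns or names are equivalent to spaces.
    """
    patterns = [
        pattern.strip().lower().replace("_", " ")
        for pattern in filter_field.split(",")
        if pattern.strip()
    ]
    kept = []
    for name in names:
        normalized = name.lower().replace("_", " ")
        if not any(pattern in normalized for pattern in patterns):
            kept.append(name)
    return kept


sample_names = [
    "Saccharomyces cerevisiae",
    "unclassified Ascomycota",
    "Ascomycota environmental sample",
]
print(filter_names(sample_names, "environmental_sample,unclassified"))
# ['Saccharomyces cerevisiae']
```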

Genome Taxonomy Database specific options

The GTDb taxonomy covers Bacteria and Archaea only, and phyloT uses their respective reference trees when creating a pruned version. Since these trees are independent, you have to specify which one to use by selecting the correct entry under Source taxonomy.

Branch lengths and node support values

Unlike the NCBI taxonomy, the GTDb trees are proper phylogenetic reference trees, containing branch lengths and clade support values (bootstraps). If you want to include these values in the phyloT generated tree, set the Support/BRL option to Yes.

Including genome IDs

If you use genome IDs (RefSeq or GenBank accession numbers) to create the tree, these will normally be mapped to their corresponding species name (which will be shown as a leaf in the tree). If you select the option to include the genome IDs in the tree, these will be added as leaves, while their parent node will be the species name.

Output options

Node identifiers (only for NCBI taxonomy)

Nodes of the generated tree can be represented by four identifier types:

When setting the identifier format to "NCBI Taxonomy IDs", all internal nodes of the tree will be prefixed with the keyword "INT" (for example, node 7147 will be labeled as INT7147). This is done to prevent various tree parsers from misidentifying these IDs as tree support values (bootstraps).
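
The effect of the "INT" prefix on Newick output can be shown with a small Python sketch. A purely numeric label after a closing parenthesis, such as (7227,7165)7147, is read by many parsers as a support value, whereas INT7147 is unambiguously a name. The tree representation (label, children) and the function to_newick are hypothetical, and the tax IDs are used only as illustrative labels.

```python
def to_newick(node, internal_prefix="INT"):
    """Serialize a (label, children) tree, prefixing internal node labels."""
    label, children = node
    if not children:
        # Leaves keep their plain ID
        return str(label)
    inner = ",".join(to_newick(child, internal_prefix) for child in children)
    return f"({inner}){internal_prefix}{label}"


# One internal node with two leaves
tree = (7147, [(7227, []), (7165, [])])
print(to_newick(tree) + ";")
# (7227,7165)INT7147;
```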

Collapsing internal nodes (only for NCBI taxonomy)

Due to the nature of the underlying NCBI taxonomy data and depending on the provided tree elements, generated trees will often have many internal nodes which have only one child. phyloT therefore offers an option to remove such nodes, by setting the "Internal nodes" option to "Collapsed".
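
Collapsing can be sketched as a recursive pass that replaces every internal node having exactly one child with that child. This is a minimal Python illustration on a hypothetical (label, children) representation, not phyloT's implementation; note that under this sketch a single-child root would be collapsed as well.

```python
def collapse_single_children(node):
    """Remove internal nodes that have exactly one child."""
    label, children = node
    children = [collapse_single_children(child) for child in children]
    if len(children) == 1:
        # Skip this node and promote its only child
        return children[0]
    return (label, children)


uncollapsed = (
    "root",
    [
        ("A", [("B", [("leaf1", []), ("leaf2", [])])]),
        ("C", [("leaf3", [])]),
    ],
)
print(collapse_single_children(uncollapsed))
# ('root', [('B', [('leaf1', []), ('leaf2', [])]), ('leaf3', [])])
```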

Forcing creation of binary trees (only for NCBI taxonomy)

Many nodes in NCBI taxonomy are highly polytomic (have many child branches). If your tree visualization or analysis software requires a strictly binary tree, where each node must have exactly two children, you can set the "Polytomy" option to "No". phyloT will then randomly combine multifurcating nodes into separate bifurcating structures and introduce additional ("fake") internal nodes as required, producing a proper binary tree.
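
One way to resolve polytomies randomly is to repeatedly pick two children of a multifurcating node and join them under a new unnamed internal node until only two children remain. The Python sketch below illustrates that idea on a hypothetical (label, children) representation; it is an assumption about the general approach, not a description of phyloT's exact procedure.

```python
import random


def resolve_polytomies(node, rng):
    """Randomly turn every multifurcation into a series of bifurcations."""
    label, children = node
    children = [resolve_polytomies(child, rng) for child in children]
    while len(children) > 2:
        # Pick two distinct children and join them under a "fake" node
        i, j = rng.sample(range(len(children)), 2)
        pair = ("", [children[i], children[j]])
        children = [c for k, c in enumerate(children) if k not in (i, j)]
        children.append(pair)
    return (label, children)


def leaf_labels(node):
    """Return the labels of all leaves below (or at) this node."""
    label, children = node
    if not children:
        return [label]
    return [name for child in children for name in leaf_labels(child)]


def is_binary(node):
    """True if every internal node has exactly two children."""
    _, children = node
    return not children or (
        len(children) == 2 and all(is_binary(child) for child in children)
    )


polytomy = ("Hominidae", [("Homo", []), ("Pan", []), ("Gorilla", []), ("Pongo", [])])
binary = resolve_polytomies(polytomy, random.Random(0))
print(is_binary(binary))
print(sorted(leaf_labels(binary)))
```

Because the pairing is random, two runs with different seeds can produce different binary topologies for the same input, while the set of leaves stays the same.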

Tree format and file name

In addition to the commonly used Newick format, you can also download the generated trees in NEXUS or phyloXML formats. If a file name is not provided when generating a tree, a randomly generated string will be used.

Visualizing generated trees in iTOL

If you only want to visualize the generated tree, phyloT offers a direct link to iTOL: interactive Tree Of Life. Simply click the "Visualize in iTOL" button. iTOL is an online phylogenetic tree visualization tool, offering powerful annotation features. Check the iTOL website for more details.

User account

Trees created from more than 10 elements require either an active phyloT subscription, institutional access or a tree generation token. Please create a phyloT account first, and then visit your personal info page to view all available options.

Tree generation tokens:

If you don't use phyloT often, tree generation tokens are the simplest solution. They do not expire, and can be purchased directly, or simply obtained by using our sponsor's adverts. Each generated tree requires one token, but you can freely change the phyloT tree options and file format, as long as the tree elements remain the same. Each tree (defined by the tree elements used to generate it) remains freely accessible for 6 months. You can also share our sponsor advert links with your friends. Any tokens generated through those links will be credited to your phyloT account.

Subscription:

phyloT offers monthly or annual subscriptions for unrestricted access. Click the 'Start/extend subscription' button on your personal info page to display the available options. Note that all phyloT subscriptions are non-recurring and will never be extended automatically. Once your subscription expires, you will have to manually reactivate it.

Institutional access:

If your institution currently has a valid phyloT license, you either have direct unrestricted access to phyloT via your IP address, or you were given a phyloT access key. To use an access key, click the 'Provide access key' button on your personal info page to activate your account. If you are interested in this mode of access, please ask your librarian or IT department to obtain a license.