Using classes and methods in phylotreelib.py it is possible to read and write treefiles and to analyze and manipulate the trees in various ways.
The phylotreelib.py module is available on GitHub: https://github.com/agormp/phylotreelib and on PyPI: https://pypi.org/project/phylotreelib/
python3 -m pip install phylotreelib
Upgrading to latest version:
python3 -m pip install --upgrade phylotreelib
To cite phylotreelib: use the link in the right sidebar under About --> Cite this repository.
The code below will import phylotreelib, open a NEXUS tree file, read one Tree object from the file, perform minimum-variance rooting, find the node ID for the new root node, and finally print out the name and root-to-tip distance (measured along the branches) for all leaves in the tree. (Note that pt.Nexustreefile has been implemented as a context manager, and can be used with the with statement):
import phylotreelib as pt
with pt.Nexustreefile("mytreefile.nexus") as treefile:
    mytree = treefile.readtree()
mytree.rootminvar()
rootnode = mytree.root
for tip in mytree.leaves:
    dist = mytree.nodedist(rootnode, tip)
    print(f"{tip:<10s} \t {dist:.2f}")
Output:
nitrificans 1879.84
Is79A3 1878.95
GWW4 1877.47
.
.
.
A2 1879.84
communis 1878.95
The code below constructs a Tree object from a Newick formatted string and then prints the string representation of the tree (using the Tree object's __str__() method).
import phylotreelib as pt
mytree = pt.Tree.from_string("(Gorilla:3, (Human:2, (Chimpanzee:1, Bonobo:1):1):1);")
print(mytree)
Output:
|----------------------------------------------|
| Node | Child | Distance | Label |
|----------------------------------------------|
| 0 | 1 | 1 | |
| 0 | Gorilla | 3 | |
| 1 | 2 | 1 | |
| 1 | Human | 2 | |
| 2 | Bonobo | 1 | |
| 2 | Chimpanzee | 1 | |
|----------------------------------------------|
4 Leafs:
----------
Bonobo
Chimpanzee
Gorilla
Human
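The child-list table printed above maps naturally onto a nested dictionary. The sketch below is plain Python and is not phylotreelib's internal representation: the names child_list and root_to_tip are hypothetical and used only for illustration. It copies the table for the four-primate tree and sums branch lengths from the root to each leaf.

```python
# Illustrative stand-in for the printed table: parent -> {child: branch length}.
# Internal nodes are integers and leaves are strings, as in the table.
child_list = {
    0: {1: 1.0, "Gorilla": 3.0},
    1: {2: 1.0, "Human": 2.0},
    2: {"Bonobo": 1.0, "Chimpanzee": 1.0},
}

def root_to_tip(child_list, root=0):
    """Return {leaf: distance from root}, summing branch lengths along each path."""
    distances = {}
    stack = [(root, 0.0)]
    while stack:
        node, dist = stack.pop()
        if node in child_list:
            # Internal node: descend to each child, accumulating branch length
            for child, length in child_list[node].items():
                stack.append((child, dist + length))
        else:
            # Leaf: record the accumulated distance
            distances[node] = dist
    return distances

for leaf, dist in sorted(root_to_tip(child_list).items()):
    print(f"{leaf:<12s}{dist:.2f}")
```

In this particular tree every leaf ends up at the same distance from the root, because the branch lengths in the Newick string were chosen that way.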
The code below constructs a TreeSummary object, opens a Nexus-formatted file with multiple trees, and then extracts all Tree objects from the file by iterating over the file while adding the trees to the TreeSummary object. After this a majority rule consensus tree is computed from the TreeSummary object, the tree is midpoint rooted, and the resulting tree is finally written in Newick format to the output file "contree.newick".
import phylotreelib as pt
treesummary = pt.TreeSummary()
with pt.Nexustreefile("BEAST_samples.trees") as beastfile:
    for tree in beastfile:
        treesummary.add_tree(tree)
consensus_tree = treesummary.contree()
consensus_tree.rootmid()
with open("contree.newick", "w") as outfile:
    outfile.write(consensus_tree.newick())
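To illustrate the idea behind a majority rule consensus, the sketch below counts how often each group of leaves occurs across a set of trees and keeps the groups found in more than half of them. It is a simplified illustration in plain Python, not phylotreelib's implementation: trees are represented as nested tuples, rooted clades are counted rather than bipartitions, and the names clades and majority_clades are hypothetical.

```python
from collections import Counter

def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a nested-tuple tree."""
    found = set()

    def visit(node):
        if isinstance(node, str):          # a leaf is just its name
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found

def majority_clades(trees, cutoff=0.5):
    """Return {clade: frequency} for clades present in more than `cutoff` of the trees."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree))
    return {c: n / len(trees) for c, n in counts.items() if n / len(trees) > cutoff}

# Three hypothetical sampled trees; the third disagrees about Human and Gorilla
sample = [
    ("Gorilla", ("Human", ("Chimpanzee", "Bonobo"))),
    ("Gorilla", ("Human", ("Chimpanzee", "Bonobo"))),
    ("Human", ("Gorilla", ("Chimpanzee", "Bonobo"))),
]
kept = majority_clades(sample)
for clade, freq in kept.items():
    print(sorted(clade), f"{freq:.2f}")
```

The clade grouping Gorilla with the two chimpanzees occurs in only one of three trees and is therefore dropped.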
The code below opens a Nexus format file, reads one Tree object from the file, prunes the tree such that 50 leaves remain, and writes the resulting tree, in Nexus format, to a new file. The leaves are chosen such that they are maximally representative in the sense that they span the maximum possible percentage of the original tree length (i.e., there is no other subset of 50 leaves that would result in a tree with a larger sum of branch lengths). This can be used e.g. for reducing the size of a tree prior to computationally costly downstream analyses, or to simplify visualization (especially useful if there are many closely related leaves).
import phylotreelib as pt
with pt.Nexustreefile("SARSCoV2_all.tree") as treefile:
    bigtree = treefile.readtree()
smalltree = bigtree.prune_maxlen(nkeep=50)
with open("SARSCoV2_50.tree", "w") as outfile:
    outfile.write(smalltree.nexus())
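The selection criterion can be made concrete with a brute-force sketch: compute the total branch length of the subtree connecting a candidate set of leaves, and pick the set for which that length is largest. This is only an illustration of the criterion on a tiny, made-up tree. It tries every subset, so it is not practical for real data, and it says nothing about how prune_maxlen is actually implemented. The names child_list, leaves_below, spanned_length and best_subset are hypothetical.

```python
from itertools import combinations

# Hypothetical four-leaf tree: parent -> {child: branch length}
child_list = {
    0: {1: 1.0, "A": 3.0},
    1: {2: 1.0, "B": 2.0},
    2: {"C": 1.0, "D": 0.5},
}

def leaves_below(child_list, node):
    """Return the set of leaf names descending from node (a leaf returns itself)."""
    if node not in child_list:
        return {node}
    result = set()
    for child in child_list[node]:
        result |= leaves_below(child_list, child)
    return result

def spanned_length(child_list, keep):
    """Sum of branch lengths on the subtree connecting the leaves in `keep`."""
    keep = set(keep)
    total = 0.0
    for children in child_list.values():
        for child, length in children.items():
            below = leaves_below(child_list, child)
            # A branch is needed only if kept leaves lie on both sides of it
            if keep & below and keep - below:
                total += length
    return total

def best_subset(child_list, nkeep, root=0):
    """Try every subset of nkeep leaves and return the one spanning the most length."""
    all_leaves = sorted(leaves_below(child_list, root))
    return max(combinations(all_leaves, nkeep),
               key=lambda subset: spanned_length(child_list, subset))

chosen = best_subset(child_list, 3)
print(chosen, spanned_length(child_list, chosen))
```

Here leaf D sits on a short branch next to C, so dropping D loses the least tree length.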
The code below opens a Newick file, reads one Tree object from the file, and then finds the 5 leaves that are closest (measured along the branches) to the leaf labeled "nitrificans".
import phylotreelib as pt
with pt.Newicktreefile("Comammox.newick") as treefile:
    tree = treefile.readtree()
print(tree.nearest_n_leaves("nitrificans", 5))
Output:
{'A2', 'AAUMBR1', 'CG24B', 'inopinata', 'nitrosa'}
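"Closest along the branches" refers to patristic distance: the sum of branch lengths on the path connecting two leaves. The sketch below shows one way to compute it on a small parent-to-children dictionary by walking the tree as an undirected graph. It is an illustration under assumed names (child_list, patristic_from, nearest_leaves), not the library's code.

```python
# Hypothetical tree: parent -> {child: branch length}
child_list = {
    0: {1: 1.0, "Gorilla": 3.0},
    1: {2: 1.0, "Human": 2.0},
    2: {"Bonobo": 1.0, "Chimpanzee": 1.0},
}

def patristic_from(child_list, start):
    """Return {leaf: distance along branches from `start`} for all other leaves."""
    # Build an undirected adjacency map so we can walk up as well as down
    neighbours = {}
    for parent, children in child_list.items():
        for child, length in children.items():
            neighbours.setdefault(parent, {})[child] = length
            neighbours.setdefault(child, {})[parent] = length
    distances = {}
    stack = [(start, None, 0.0)]
    while stack:
        node, came_from, dist = stack.pop()
        if node != start and node not in child_list:
            distances[node] = dist       # reached another leaf
        for nxt, length in neighbours[node].items():
            if nxt != came_from:         # never walk back along the same branch
                stack.append((nxt, node, dist + length))
    return distances

def nearest_leaves(child_list, leaf, n):
    """Return the set of n leaves with the smallest patristic distance to leaf."""
    distances = patristic_from(child_list, leaf)
    return set(sorted(distances, key=distances.get)[:n])

print(nearest_leaves(child_list, "Human", 2))
```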
The code below opens a fasta file containing a set of aligned DNA sequences and reads the aligned sequences (using classes and methods from the sequencelib library), constructs a nested dictionary containing all pairwise sequence distances, constructs a Distmatrix object from this dictionary, and computes a neighbor joining tree from the distance matrix.
import phylotreelib as pt
import sequencelib as seqlib
seqfile = seqlib.Seqfile("myalignment.fasta")
seqs = seqfile.read_alignment()
distdict = seqs.distdict()
dmat = pt.Distmatrix.from_distdict(distdict)
mytree = dmat.nj()
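The nested dictionary has the form distdict[name1][name2] = distance. As a rough illustration of that structure, the sketch below builds such a dictionary from a toy alignment using the simple proportion of differing sites (p-distance). The distance measure used by sequencelib is not stated here, so p_distance, build_distdict and the example sequences are assumptions made purely for illustration.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned positions at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("Sequences must be aligned (same length)")
    mismatches = sum(a != b for a, b in zip(seq1, seq2))
    return mismatches / len(seq1)

def build_distdict(alignment):
    """Return nested dict of pairwise distances: distdict[name1][name2] = dist."""
    return {
        name1: {
            name2: p_distance(seq1, seq2)
            for name2, seq2 in alignment.items()
            if name2 != name1
        }
        for name1, seq1 in alignment.items()
    }

# Toy alignment with hypothetical sequence names
alignment = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACCTACGA"}
distdict = build_distdict(alignment)
print(distdict["s1"]["s3"])
```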
The code below constructs a random, bifurcating tree with 50 tips and random branch lengths, creates a copy of that tree, performs 5 random Subtree Pruning and Regrafting (SPR) moves, and finally computes three different measures of tree-distance between the original tree and the SPR-transformed tree:
import phylotreelib as pt
tree1 = pt.Tree.randtree(ntips=50, randomlen=True)
tree2 = tree1.copy_treeobject()
for i in range(5):
    tree2.spr()
rf = tree1.treedist_RF(tree2)
rfnorm = tree1.treedist_RF(tree2, normalise=True)
rfsimnorm = 1 - rfnorm
pd = tree1.treedist_pathdiff(tree2)
print(f"Robinson-Foulds distance: {rf}")
print(f"Normalised similarity (based on RF distance): {rfsimnorm:.2f}")
print(f"Path difference distance: {pd:.2f}")
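For readers unfamiliar with the Robinson-Foulds distance: it counts the bipartitions (splits of the leaf set induced by internal branches) that are present in one tree but not in the other. The sketch below is a minimal plain-Python version operating on nested tuples. The function names are hypothetical, and it is not the code behind treedist_RF, whose normalisation is not described in this section.

```python
def bipartitions(tree):
    """Return the non-trivial bipartitions of a nested-tuple tree as a set."""
    def leafset(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(leafset(child) for child in node))

    all_leaves = leafset(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        rest = all_leaves - below
        # Skip trivial splits (a single leaf versus everything else)
        if len(below) > 1 and len(rest) > 1:
            found.add(frozenset([below, rest]))   # unordered pair of sides
        return below

    visit(tree)
    return found

def rf_distance(tree1, tree2):
    """Number of bipartitions found in exactly one of the two trees."""
    return len(bipartitions(tree1) ^ bipartitions(tree2))

t1 = (("A", "B"), ("C", ("D", "E")))
t2 = (("A", "C"), ("B", ("D", "E")))
print(rf_distance(t1, t2))
```

Both trees share the split separating D and E from the rest, and each has one split the other lacks.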
The code below creates a TreeSummary object that will keep track of clades and topologies in the input tree-set, opens a file containing tree samples from a BEAST run, discards the first 500 as burnin, adds the remaining trees to the TreeSummary object, computes a maximum clade credibility (MCC) tree from the tree-summary, sets the branch lengths based on the common ancestor heights (based on original tree file), and writes the result to a nexus file (branch labels will correspond to clade credibility values).
NOTE: The command-line program sumt exposes all phylotreelib's functionality related to consensus trees, and allows the user to create consensus, MCC, and MBC trees with various options for branch lengths and rooting, without having to write scripts.
import phylotreelib as pt
treesummary = pt.BigTreeSummary(trackbips=False, trackclades=True, trackroot=True)
treefile = pt.Nexustreefile("SARS-CoV-2.trees")
burnin = 500
for i in range(burnin):
    treefile.readtree(returntree=False)
for tree in treefile:
    treesummary.add_tree(tree)
mcctree = treesummary.max_clade_cred_tree()
weight = 1.0
treecount = 2001
mcctree = set_ca_node_depths(mcctree, [weight, treecount, burnin, "SARS-CoV-2.trees"])
with open("SARS-CoV-2.mcc", "w") as outfile:
    outfile.write(mcctree.nexus())
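The principle behind a maximum clade credibility tree can be sketched in a few lines: after discarding burn-in, compute the frequency of every clade in the sample, score each sampled tree by the sum of the log frequencies of its clades, and return the tree with the highest score. The code below is a simplified illustration on nested tuples with hypothetical names (clades, mcc_tree). It ignores weights, branch lengths and node depths, and is not phylotreelib's implementation.

```python
import math
from collections import Counter

def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a nested-tuple tree."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found

def mcc_tree(trees):
    """Return the sampled tree with the highest summed log clade frequency."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree))

    def log_credibility(tree):
        return sum(math.log(counts[c] / len(trees)) for c in clades(tree))

    return max(trees, key=log_credibility)

tree_a = ("Gorilla", ("Human", ("Chimpanzee", "Bonobo")))
tree_b = ("Human", ("Gorilla", ("Chimpanzee", "Bonobo")))
samples = [tree_b, tree_a, tree_a, tree_b]   # hypothetical sample
burnin = 1
best = mcc_tree(samples[burnin:])            # discard the first sample as burn-in
print(best)
```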
Typically, phylotreelib will be used for analysing (or manipulating) one or more trees that have been read from a text file in Newick or Nexus format. Reading a tree from file will return a Tree object, which has methods for interrogating or altering itself (e.g. mytree.rootmid() will midpoint root the Tree object mytree).
To open a Newick format file:
treefile = phylotreelib.Newicktreefile(filename)
To open a Nexus format file:
treefile = phylotreelib.Nexustreefile(filename)
These commands will return a file object with methods for reading the contents of the file.
The classes phylotreelib.Newicktreefile and phylotreelib.Nexustreefile have been implemented as context managers, so it is possible to use them with the with statement:
with phylotreelib.Nexustreefile(filename) as treefile:
    <read trees and do other stuff with treefile>
To read one tree from an opened treefile:
tree = treefile.readtree()
This returns a Tree object, which has methods for analysing and manipulating the tree (itself). By calling readtree() repeatedly, you can read additional trees from the file. The readtree() method returns None when all trees have been read.
The readtrees() (plural) method returns all the trees in the file in the form of a TreeSet object. A TreeSet object contains a list of Tree objects and has methods for rooting and outputting all trees in the collection. Iterating over a TreeSet object returns Tree objects.
treeset = treefile.readtrees()
It is also possible to retrieve all the trees from an open treefile one at a time by iterating directly over the file (useful for minimizing memory consumption when handling files with many trees):
for tree in treefile:
    <do something with tree>
As mentioned, phylotreelib.Newicktreefile and phylotreelib.Nexustreefile have been implemented as context managers, and it is therefore possible (and probably safer) to read trees using the with statement:
with phylotreelib.Nexustreefile(filename) as treefile:
    tree = treefile.readtree()
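The reading patterns described above (single reads that eventually return None, iteration, and use with the with statement) can be mimicked by a small toy class. The sketch below is not phylotreelib code: MiniTreeFile is a hypothetical stand-in that assumes one Newick string per line and returns plain strings rather than Tree objects. It only illustrates how the three patterns fit together.

```python
import io

class MiniTreeFile:
    """Toy stand-in (hypothetical) that yields Newick strings, one per line."""

    def __init__(self, handle):
        self.handle = handle

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.handle.close()      # always release the underlying file handle
        return False             # do not swallow exceptions

    def readtree(self):
        """Return the next tree string, or None when the file is exhausted."""
        for line in self.handle:
            line = line.strip()
            if line:
                return line
        return None

    def __iter__(self):
        return self

    def __next__(self):
        tree = self.readtree()
        if tree is None:
            raise StopIteration
        return tree

content = "(A:1,B:1);\n((A:1,B:1):1,C:2);\n"
with MiniTreeFile(io.StringIO(content)) as treefile:
    first = treefile.readtree()       # read a single tree
    remaining = list(treefile)        # iterate over whatever is left
    after_end = treefile.readtree()   # None: nothing left to read
print(first, remaining, after_end)
```

Using the with statement guarantees that the handle is closed even if an exception is raised while processing a tree.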
Instead of reading a tree from a file, you can also construct Tree objects using one of the several alternative constructors in the Tree class.
Tree objects can be constructed directly from a string (where the string is a Newick formatted tree):
tree = phylotreelib.Tree.from_string(mystring)
Tree objects (with a star topology) can be constructed from a list of leaf names:
tree = phylotreelib.Tree.from_leaves(leaflist)
The constructor Tree.from_branchinfo() constructs a tree from information about all individual branches in the tree. Specifically the input is a list of parent node IDs and a list of child node IDs (of the same length), such that each pairing of parentnode and childnode corresponds to a branch in the tree. It is possible to add extra lists containing the corresponding branch lengths and branch labels. Using this constructor allows the specific naming of internal nodes (which are otherwise set automatically based on e.g. the order in which a newick string is parsed). NOTE: internal node IDs have to be integers, while leaf IDs have to be strings.
parentlist = [100, 100, 101, 101, 102, 102]
childlist = [101, 102, "A", "B", "C", "D"]
blenlist = [1,1,2,3,2,3]
tree = phylotreelib.Tree.from_branchinfo(parentlist,childlist,blenlist)
would result in this tree:
print(tree)
|-----------------------------------------|
| Node | Child | Distance | Label |
|-----------------------------------------|
| 100 | 101 | 1 | |
| 100 | 102 | 1 | |
| 101 | A | 2 | |
| 101 | B | 3 | |
| 102 | C | 2 | |
| 102 | D | 3 | |
|-----------------------------------------|
4 Leafs:
-----
A
B
C
D
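To make the branch-list input format concrete, the sketch below turns the same three lists into a parent-to-children dictionary that mirrors the printed table, and identifies the root as the only parent that never occurs as a child. This is a plain-Python illustration. The helper name child_list_from_branchinfo and the default branch length of 0.0 are assumptions, and the sketch does not show how Tree.from_branchinfo works internally.

```python
def child_list_from_branchinfo(parentlist, childlist, blenlist=None):
    """Pair up parents, children and branch lengths into {parent: {child: length}}."""
    if blenlist is None:
        blenlist = [0.0] * len(parentlist)     # assumed default for this sketch
    if not (len(parentlist) == len(childlist) == len(blenlist)):
        raise ValueError("All lists must have the same length")
    child_list = {}
    for parent, child, blen in zip(parentlist, childlist, blenlist):
        child_list.setdefault(parent, {})[child] = blen
    return child_list

parentlist = [100, 100, 101, 101, 102, 102]
childlist = [101, 102, "A", "B", "C", "D"]
blenlist = [1, 1, 2, 3, 2, 3]

table = child_list_from_branchinfo(parentlist, childlist, blenlist)
# The root is the only parent that never appears as a child
root = (set(parentlist) - set(childlist)).pop()
print(root, table)
```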
It is possible to construct Tree objects with random tree topology using the randtree constructor:
tree = phylotreelib.Tree.randtree(ntips=35, randomlen=True, name_prefix="s")
Either a list of names (leaflist) or the number of tips (ntips) can be specified as a way of setting the size of the tree. If the function argument randomlen is True, then branches will get random lengths drawn from a lognormal distribution.
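One simple way to generate a random bifurcating topology is to start from the leaves and repeatedly join two randomly chosen subtrees until a single tree remains. The sketch below does this in plain Python and draws branch lengths from a lognormal distribution. It is only an illustration: the function names, the leaf naming scheme, the seed argument and the lognormal parameters (mu=0, sigma=1) are assumptions, and the procedure used by Tree.randtree may differ.

```python
import random

def random_tree(ntips, randomlen=True, name_prefix="s", seed=None):
    """Return a random bifurcating tree as nested ((child, length), (child, length)) pairs."""
    rng = random.Random(seed)

    def blen():
        # Lognormal branch length; the parameters are arbitrary for this sketch
        return rng.lognormvariate(0.0, 1.0) if randomlen else 1.0

    subtrees = [f"{name_prefix}{i}" for i in range(ntips)]
    while len(subtrees) > 1:
        # Remove two random subtrees and join them under a new internal node
        a = subtrees.pop(rng.randrange(len(subtrees)))
        b = subtrees.pop(rng.randrange(len(subtrees)))
        subtrees.append(((a, blen()), (b, blen())))
    return subtrees[0]

def leaf_names(node):
    """Collect leaf names from the nested structure returned by random_tree."""
    if isinstance(node, str):
        return [node]
    return [name for child, _ in node for name in leaf_names(child)]

tree = random_tree(35, seed=1)
print(len(leaf_names(tree)))
```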
Tree objects consist of external nodes (leaves), which are identified by strings (e.g. "Chimpanzee"), and internal nodes, which are identified by integers (e.g., 5). Branches between nodes may have a label (string) and/or a branch length (float) associated with them.
A textual representation of a Tree object can be obtained using "print(mytree)" (where mytree is the name of the Tree object). The resulting output is a child-list representation of the tree followed by an alphabetical list of leaves, along these lines:
>>> print(mytree)
|-------------------------------------------------|
| Node | Child | Distance | Label |
|-------------------------------------------------|
| 0 | lmo0024 | 0.0246752 | |
| 0 | 1 | 0.972917 | 1.00 |
| 0 | lin0023 | 0.0627209 | |
| 1 | SMU_1957 | 1.02145 | |
| 1 | 2 | 0.219277 | 0.83 |
| 2 | 17 | 0.234726 | 0.98 |
| 2 | 3 | 0.089033 | 0.81 |
| 17 | CPE0323 | 0.595102 | |
| 17 | CPE2630 | 0.889163 | |
| 3 | 4 | 0.246187 | 0.86 |
| 3 | 14 | 0.284347 | 1.00 |
| 4 | 5 | 0.124892 | 0.99 |
| 4 | 7 | 0.285426 | 0.85 |
| 14 | 16 | 0.862141 | 1.00 |
| 14 | 15 | 0.458832 | 1.00 |
| 5 | Bsu_COG3716 | 0.574712 | |
| 5 | 6 | 0.276761 | 1.00 |
| 7 | 8 | 0.0462592 | 0.55 |
| 7 | SMU_1879 | 0.22478 | |
| 7 | Lme1 | 0.29347 | |
| 16 | Sag | 0.300177 | |
| 16 | CPE1463 | 0.178835 | |
| 15 | lmo2000 | 0.10323 | |
| 15 | lin2108 | 0.0582161 | |
| 6 | lmo0781 | 0.0465803 | |
| 6 | lin0774 | 0.0163573 | |
| 8 | 9 | 0.0570041 | 0.82 |
| 8 | Lls_ | 0.318751 | |
| 9 | 10 | 0.0290423 | 0.70 |
| 9 | LJ0505 | 0.256245 | |
| 10 | Lme2 | 1.73392 | |
| 10 | 11 | 0.0534659 | 0.54 |
| 10 | Lpl | 0.183936 | |
| 10 | 13 | 0.107465 | 0.99 |
| 11 | EF0022 | 0.0951663 | |
| 11 | 12 | 0.122973 | 0.92 |
| 13 | CPE0823 | 0.176274 | |
| 13 | CAC | 0.113429 | |
| 12 | lin0145 | 0.00949095 | |
| 12 | lmo0098 | 0.0119641 | |
|-------------------------------------------------|
23 Leafs:
-----------
Bsu_COG3716
CAC
CPE0323
CPE0823
CPE1463
CPE2630
EF0022
LJ0505
Lls_
Lme1
Lme2
Lpl
SMU_1879
SMU_1957
Sag
lin0023
lin0145
lin0774
lin2108
lmo0024
lmo0098
lmo0781
lmo2000
Tree objects have a number of attributes that contain information regarding the tree.
One example of using a tree attribute is (assuming we have a Tree object named "tree"):
taxa = tree.leaves
List of useful Tree object attributes:
.leaves: Set of leaf names
.intnodes: Set of internal node IDs
.nodes: Set of all nodes (= leaves + internal nodes)
.root: ID for root node (usually 0, but may change if re-rooted)
Tree objects also have a number of methods that can be used to analyze and alter them.
One example of using a Tree object method is:
childnodes = tree.children(7)
which returns a set containing the node-IDs for the immediate descendants of node 7 (the nodes directly connected to node 7).
A full list of classes and methods in phylotreelib is at the end of this README.
phylotreelib has its own error class ("TreeError"), which can be used for raising and catching tree-related errors in your own program and dealing with errors intelligently.
Example usage:
try:
    tree.rootout(outgroup)
except phylotreelib.TreeError as err:
    print("This error occurred: {}".format(err))
Example usage 2:
if nodename not in tree.nodes:
    raise phylotreelib.TreeError("Tree contains no node named {}".format(nodename))
Help on module phylotreelib:
NAME
phylotreelib - Classes and methods for analyzing, manipulating, and building phylogenetic trees
CLASSES
    builtins.Exception(builtins.BaseException)
        TreeError
    builtins.object
        Branchstruct
        Distmatrix
        Globals
        Interner
        Topostruct
        Tree
        TreeSet
        TreeSummary
            BigTreeSummary
        Treefile
        TreefileBase
            Newicktreefile
            Nexustreefile
class BigTreeSummary(TreeSummary)
| BigTreeSummary(interner=None, store_trees=False)
|
| Class summarizing bipartitions, branch lengths, and topologies from many trees
|
| Method resolution order:
| BigTreeSummary
| TreeSummary
| builtins.object
|
| Methods defined here:
|
| __init__(self, interner=None, store_trees=False)
| TreeSummary constructor. Initializes relevant data structures
|
| add_tree(self, curtree, weight=1.0)
| Add tree to treesummary, update all summaries
|
| max_clade_cred_tree(self, labeldigits=3)
| Find and return maximum clade credibility tree
|
| update(self, other)
| Merge this object with other treesummary
|
| ----------------------------------------------------------------------
| Readonly properties defined here:
|
| toposummary
| Property method for lazy evaluation of topostruct.freq
|
| ----------------------------------------------------------------------
| Methods inherited from TreeSummary:
|
| __len__(self)
|
| add_branchid(self)
| Adds attribute .branchID to all bipartitions in .bipartsummary
| External bipartitions are labeled with the leafname.
| Internal bipartitions are labeled with consecutive numbers by decreasing frequency
|
| contree(self, cutoff=0.5, allcompat=False, labeldigits=3)
| Returns a consensus tree built from selected bipartitions
|
| log_clade_credibility(self, topology)
| Compute log clade credibility for topology (sum of log(freq) for all branches)
|
| ----------------------------------------------------------------------
| Readonly properties inherited from TreeSummary:
|
| bipartsummary
| Property method for lazy evaluation of freq, var, and sem for branches
|
| sorted_biplist
| Return list of bipartitions.
| First external (leaf) bipartitions sorted by leafname.
| Then internal bipartitions sorted by freq
|
| ----------------------------------------------------------------------
| Data descriptors inherited from TreeSummary:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
class Branchstruct(builtins.object)
| Branchstruct(length=0.0, label='')
|
| Class that emulates a struct. Keeps branch-related info
|
| Methods defined here:
|
| __init__(self, length=0.0, label='')
| Initialize self. See help(type(self)) for accurate signature.
|
| copy(self)
| Returns copy of Branchstruct object, with all attributes included
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
class Distmatrix(builtins.object)
| Class representing distance matrix for set of taxa. Knows how to compute trees
|
| Methods defined here:
|
| __init__(self)
| Initialize self. See help(type(self)) for accurate signature.
|
| __str__(self)
| Returns distance matrix as string
|
| avdist(self)
| Returns average dist in matrix (not including diagonal)
|
| clean_names(self, illegal=',:;()[]', rep='_')
| Rename items to avoid characters that are problematic in Newick tree strings:
| Replaces all occurrences of chars in 'illegal' by 'rep'
|
| getdist(self, name1, name2)
| Returns distance between named entries
|
| nj(self)
| Computes neighbor joining tree, returns Tree object
|
| rename(self, oldname, newname)
| Changes name of one item from oldname to newname
|
| setdist(self, name1, name2, dist)
| Sets distance between named entries
|
| upgma(self)
| Computes UPGMA tree, returns Tree object
|
| ----------------------------------------------------------------------
| Class methods defined here:
|
| from_distdict(distdict) from builtins.type
| Construct Distmatrix object from nested dictionary of dists: distdict[name1][name2] = dist
|
| from_distfile(distfilename) from builtins.type
| Construct Distmatrix object from file containing rows of: name1 name2 distance
|
| from_numpy_array(nparray, namelist) from builtins.type
| Construct Distmatrix object from numpy array and corresponding list of names
| Names in namelist must be in same order as indices in numpy 2D array
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
class Globals(builtins.object)
| Class containing globally used functions and labels.
|
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| biparts = {}
class Interner(builtins.object)
| Class used for interning various objects.
|
| Methods defined here:
|
| __init__(self)
| Initialize self. See help(type(self)) for accurate signature.
|
| intern_bipart(self, bipart)
|
| intern_leafset(self, leafset)
|
| intern_topology(self, topology)
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
class Newicktreefile(TreefileBase)
| Newicktreefile(filename=None, filecontent=None)
|
| Class representing Newick tree file. Iteration returns tree-objects
|
| Method resolution order:
| Newicktreefile
| TreefileBase
| builtins.object
|
| Methods defined here:
|
| __init__(self, filename=None, filecontent=None)
| Initialize self. See help(type(self)) for accurate signature.
|
| __iter__(self)
|
| __next__(self)
|
| ----------------------------------------------------------------------
| Methods inherited from TreefileBase:
|
| __enter__(self)
| Implements context manager behaviour for TreefileBase types.
| Usage example:
| with pt.Newicktreefile(filename) as tf:
| mytree = tf.readtree()
| mytree.rootminvar()
|
| __exit__(self, type, value, traceback)
| Implements context manager behaviour for TreefileBase types.
| Usage example:
| with pt.Newicktreefile(filename) as tf:
| mytree = tf.readtree()
| mytree.rootminvar()
|
| close(self)
| For explicit closing of Treefile before content exhausted
|
| get_treestring(self)
| Return next tree-string
|
| readtree(self)
| Reads one tree from file and returns as Tree object. Returns None when exhausted file
|
| readtrees(self, discardprop=0.0)
| Reads trees from file and returns as TreeSet object. Can discard fraction of trees
|
| ----------------------------------------------------------------------
| Data descriptors inherited from TreefileBase:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
class Nexustreefile(TreefileBase)
| Nexustreefile(filename=None, filecontent=None)
|
| Class representing Nexus tree file. Iteration returns tree object or None
|
| Method resolution order:
| Nexustreefile
| TreefileBase
| builtins.object
|
| Methods defined here:
|
| __init__(self, filename=None, filecontent=None)
| Read past NEXUS file header, parse translate block if present
|
| __iter__(self)
|
| __next__(self, noreturn=False)
|
| ----------------------------------------------------------------------
| Methods inherited from TreefileBase:
|
| __enter__(self)
| Implements context manager behaviour for TreefileBase types.
| Usage example:
| with pt.Newicktreefile(filename) as tf:
| mytree = tf.readtree()
| mytree.rootminvar()
|
| __exit__(self, type, value, traceback)
| Implements context manager behaviour for TreefileBase types.
| Usage example:
| with pt.Newicktreefile(filename) as tf:
| mytree = tf.readtree()
| mytree.rootminvar()
|
| close(self)
| For explicit closing of Treefile before content exhausted
|
| get_treestring(self)
| Return next tree-string
|
| readtree(self)
| Reads one tree from file and returns as Tree object. Returns None when exhausted file
|
| readtrees(self, discardprop=0.0)
| Reads trees from file and returns as TreeSet object. Can discard fraction of trees
|
| ----------------------------------------------------------------------
| Data descriptors inherited from TreefileBase:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
class Topostruct(builtins.object)
| Class that emulates a struct. Keeps topology-related info
|
| Data descriptors defined here:
|
| freq
|
| tree
|
| weight
class Tree(builtins.object)
| Class representing basic phylogenetic tree object.
|
| Methods defined here:
|
| __eq__(self, other, blenprecision=0.005)
| Implements equality testing for Tree objects
|
| __hash__(self)
| Implements hashing for Tree objects, so they can be used as keys in dicts
|
| __init__(self)
| Initialize self. See help(type(self)) for accurate signature.
|
| __iter__(self)
| Returns iterator object for Tree object. Yields subtrees with .basalbranch attribute
|
| __str__(self)
| Prints table of parent-child relationships including branch lengths and labels
|
| add_branch(self, bipart, branchstruct)
| Adds branch represented by bipartition to unresolved tree.
|
| add_leaf(self, parent, newleafname, branchstruct)
| Adds new leaf to existing intnode ´parent´
|
| average_ancdist(self, leaflist, return_median=False)
| Return average or median patristic distance from leaves to their MRCA
|
| average_pairdist(self, leaflist, return_median=False)
| Return average or median pairwise, patristic distance between leaves in leaflist
|
| bipdict(self, interner=None)
| Returns tree in the form of a "bipartition dictionary"
|
| build_dist_dict(self)
| Construct dictionary keeping track of all pairwise distances between nodes
|
| build_parent_dict(self)
| Constructs _parent_dict enabling faster lookups, when needed
|
| build_path_dict(self)
| Construct dictionary keeping track of all pairwise paths between nodes
|
| check_bip_compatibility(self, bipart)
| Checks the compatibility between bipartition and tree.
| Returns tuple of: is_present, is_compatible, insert_tuple
| where insert_tuple = None or (parentnode, childmovelist)
| is_present:
| True if bipartition is already present in tree. Implies "is_compatible = True"
| is_compatible:
| True if bipartition is compatible with tree. "is_present" can be True or False
| insert_tuple:
| If is_compatible: Tuple of (parentnode, childmovelist) parameters for insert_node
| If not is_compatible: None
|
| children(self, parent)
| Returns set containing parent's immediate descendants
|
| cladegrep(self, pattern, minsize=2)
| Finds clades (monophyletic groups) where all leaves contain specified pattern
|
| cluster_cut(self, cutoff)
| Divides tree into clusters by cutting across tree "cutoff" distance from root.
| Returns list containing sets of leafnames
|
| cluster_n(self, nclust)
| Divides tree into 'nclust' clusters based on distance from root.
|
| Returns tuple containing: list with sets of leafnames (one set per cluster)
| list of basenodes of clusters
|
| collapse_clade(self, leaflist, newname='clade')
| Replaces clade (leaves in leaflist) with single leaf.
| Branch length is set to average dist from basenode parent to leaves
|
| copy_treeobject(self, copylengths=True, copylabels=True)
| Returns copy of Tree object. Copies structure and branch lengths.
| Caches and any user-added attributes are not copied.
| Similar to effect of copy.deepcopy but customized and much faster
|
| deroot(self)
| If root is at bifurcation: remove root node, connect adjacent nodes
|
| diameter(self, return_leaves=False)
| Return diameter: longest leaf-leaf distance along tree.
| If return_leaves is True: Return tuple with (maxdist, Leaf1, Leaf2)
|
| figtree(self, printdist=True, printlabels=True, print_leaflabels=False, precision=6, colorlist=None, color='0000FF')
| Returns figtree format tree as a string
|
| find_central_leaf(self, leaflist)
| Finds central leaf for the provided list of leaves.
| Defined as having approximately equal distance to the two farthest leaves in leaflist
|
| find_common_leaf(self, leaflist)
| Finds common leaf for the provided list of leaves.
| Defined as having the smallest average distance to remaining leaves
|
| find_most_distant(self, node1, nodeset)
| Finds node in nodeset that is most distant from node1
|
| find_mrca(self, leaves)
| Finds Most Recent Common Ancestor for the provided set of leaves
|
| findbasenode(self, leafset)
| Finds node that is at the base of all leaves in leafset.
|
| get_branchstruct(self, node1, node2)
| Returns Branchstruct object from branch between node1 and node2
|
| getlabel(self, node1, node2)
| Gets label on branch connecting node1 and node2
|
| graft(self, other, node1, node2=None, blen1=0, blen2=0, graftlabel=None, graft_with_other_root=False)
| Graft other tree to self
|
| tree2 (other) intnodes will be renamed if names clash with those in tree1.
| node1: node in tree1 (self) below which tree2 (other) will be grafted. Cannot be root1
| node2: node in tree2 (other) below which tree2 will be attached (default is root of tree2)
| blen1: length of branch added to tree1 below graftpoint (lower of two newly created branches)
| blen2: length of branch above graft point and below tree2 (upper of two newly created branches)
| graftlabel: prepend value of "label" to leaf names on t2 (e.g: "graft_s1")
| graft_with_other_root: use root of other as graftpoint (i.e., do not add extra basal
| branch between other.root and self.graftpoint)
|
| has_same_root(self, other)
| Compares two trees. Returns True if topologies are same and rooted in same place
|
| height(self)
| Returns height of tree: Largest root-to-tip distance
|
| insert_node(self, parent, childnodes, branchstruct)
| Inserts an extra node between parent and children listed in childnodes list
| (so childnodes are now attached to newnode instead of parent).
| The branchstruct will be attached to the branch between parent and newnode.
| Branches to childnodes retain their original branchstructs.
| The node number of the new node is returned
|
| is_bifurcation(self, node)
| Checks if internal node is at bifurcation (has two children)
|
| is_compatible_with(self, bipart)
| Checks whether a given bipartition is compatible with the tree.
| Note: also returns True if bipartition is already in tree
|
| is_resolved(self)
| Checks whether tree is fully resolved (no polytomies)
|
| leaflist(self)
| Returns list of leaf names sorted alphabetically
|
| length(self)
| Returns tree length (sum of all branch lengths)
|
| match_nodes(self, other)
| Compares two identical trees with potentially different internal node IDs.
| Returns tuple containing following:
| Dictionary giving mapping from nodeid in self to nodeid in other (also leaves)
| unmatched_root1: "None" or id of unmatched root in self if root at bifurcation
| unmatched_root2: "None" or id of unmatched root in other if root at bifurcation
|
| Note: The last two are only different from None if the trees don't have the same
| exact rooting
|
| n_bipartitions(self)
| Returns the number of bipartitions (= number of internal branches) in tree
| Note: if root is at bifurcation, then those 2 branches = 1 bipartition
|
| nameprune(self, sep='_', keep_pattern=None)
| Prune leaves based on name redundancy:
| Find subtrees where all leaves have same start of name (up to first "_")
|
| nearest_n_leaves(self, leaf1, n_neighbors)
| Returns set of N leaves closest to leaf along tree (patristic distance)
|
| nearleafs(self, leaf1, maxdist)
| Returns set of leaves that are less than maxdist from leaf, measured along branches
|
| newick(self, printdist=True, printlabels=True, print_leaflabels=False, precision=6, labelfield='label', transdict=None)
| Returns Newick format tree string representation of tree object
|
| nexus(self, printdist=True, printlabels=True, print_leaflabels=False, precision=6, labelfield='label', translateblock=False)
| Returns nexus format tree as a string
|
| nodedepth(self, node)
| Returns depth of node: distance from furthest leaf-level to node
|
| nodedist(self, node1, node2=None)
| Returns distance between node1 and node2 along tree (patristic distance)
|
| nodedistlist(self, node1, nodelist)
| Returns list of distances from node1 to nodes in nodelist (same order as nodelist)
|
| nodepath(self, node1, node2)
| Returns path between node1 and node2 along tree.
|
| nodepath_fromdict(self, node1, node2)
| Returns path between node1 and node2 along tree, from preconstructed path_dict
|
| numberprune(self, nkeep, keeplist=None, keep_common_leaves=False, keep_most_distant=False, return_leaves=False, enforce_n=False)
| Prune tree so 'nkeep' leaves remain, approximately evenly spaced over tree.
|
| "keeplist" can be used to specify leaves that _must_ be retained.
| 'keep_common_leaves' requests preferential retainment of leaves with many neighbors
| (default is to keep leaves that are as equally spaced as possible)
| 'keep_most_distant' requests that the two most distant leaves in tree
| (which spread out the diameter) should be kept
| 'return_leaves': return selected leaves, but do not actually prune tree
| 'enforce_n' enforce exactly N leaves in pruned tree
| (normally leaves in includelist and most distant are additional to N)
|
| parent(self, node)
| Returns parent of node
|
| patristic_distdict(self)
| Return nested dictionary giving all pairwise, patristic distances:
| dict[node1][node2] = patristic distance
|
| possible_spr_prune_nodes(self)
| Utility function when using spr function: where is it possible to prune
|
| possible_spr_regraft_nodes(self, prune_node)
| Utility function when using spr function: where is it possible to regraft
| prune_node: the node below which pruning will take place (before regrafting)
|
| prune_maxlen(self, nkeep, return_leaves=False)
| Prune tree so the remaining nkeep leaves span the maximal percentage of branch length
|
| remote_children(self, parent)
| Returns set containing all leaves that are descendants of parent
|
| remote_nodes(self, parent)
| Returns set containing all nodes (intnodes and leaves) that are descendants of parent.
| This set includes parent itself
|
| remove_branch(self, node1, node2)
| Removes branch connecting node1 and node2 (thereby possibly creating polytomy)
| Length of removed branch is distributed among descendant branches.
| This means tree length is conserved.
| Descendant nodes will be farther apart from each other, but closer to outside nodes.
|
| remove_leaf(self, leaf)
| Removes named leaf from tree, cleans up so remaining tree structure is sane
|
| remove_leaves(self, leaflist)
| Removes leaves in list from tree, cleans up so remaining tree structure is sane
|
| rename_intnode(self, oldnum, newnum)
| Changes number of one internal node
|
| rename_leaf(self, oldname, newname, fixdups=False)
| Changes name of one leaf. Automatically fixes duplicates if requested
|
| reroot(self, node1, node2=None, polytomy=False, node1dist=0.0)
| Places new root on branch between node1 and node2, node1dist from node1
|
| resolve(self)
| Randomly resolves multifurcating tree by adding zero-length internal branches.
|
| rootmid(self)
| Performs midpoint rooting of tree
|
| rootminvar(self)
| Performs minimum variance rooting of tree
|
| rootout(self, outgroup, polytomy=False)
| Roots tree on outgroup
|
| set_branch_attribute(self, node1, node2, attrname, attrvalue)
| Set the value of any branch attribute.
| attrname: Name of attribute (e.g., "length")
| attrvalue: Value of attribute (e.g. 0.153)
|
| set_nodeid_labels(self)
| Sets labels to be the same as the child node ID
| Allows use of e.g. Figtree to show nodeIDs as nodelabels
|
| setlabel(self, node1, node2, label)
| Sets label on branch connecting node1 and node2
|
| setlength(self, node1, node2, length)
| Sets length of branch connecting node1 and node2
|
| shuffle_leaf_names(self)
| Shuffles the names of all leaves
|
| sorted_intnodes(self, deepfirst=True)
| Returns sorted intnode list for breadth-first traversal of tree
|
| spr(self, prune_node=None, regraft_node=None)
| Subtree Pruning and Regrafting.
|
| prune_node: basenode of subtree that will be pruned.
| regraft_node: node in remaining treestump below which subtree will be grafted
|
| If no parameters are specified (both are None): perform random SPR
| If only prune_node is specified: choose random regraft_node
|
| Must specify either both parameters, no parameters, or only prune_node
|
| subtree(self, basenode, return_basalbranch=False)
| Returns subtree rooted at basenode as Tree object
|
| topology(self)
| Returns set of sets of sets representation of topology ("naked bipdict")
|
| transdict(self)
| Returns dictionary of {name:number_as_string} for use in translateblocks
|
| translateblock(self, transdict)
|
| transname(self, namefile)
| Translate all leaf names using oldname/newname pairs in namefile
|
| treedist(self, other, normalise=True, verbose=False)
| Deprecated: Use treedist_RF instead.
| Compute symmetric tree distance (Robinson Foulds) between self and other tree.
| Normalised measure returned by default
|
| treedist_RF(self, other, normalise=False, rooted=False)
| Compute symmetric tree distance (Robinson Foulds) between self and other tree.
| normalise: divide RF distance by the total number of bipartitions in the two trees
| rooted: take position of root into account
|
| treedist_pathdiff(self, other)
| Compute path difference tree-distance between self and other:
| Euclidean distance between nodepath-dist matrices considered as vectors.
| Measure described in M.A. Steel, D. Penny, Syst. Biol. 42 (1993) 126–141
|
| treesim(self, other, verbose=False)
| Compute normalised symmetric similarity between self and other tree
|
| ----------------------------------------------------------------------
| Class methods defined here:
|
| from_biplist(biplist) from builtins.type
| Constructor: Tree object from bipartition list
|
| from_branchinfo(parentlist, childlist, lenlist=None, lablist=None) from builtins.type
| Constructor: Tree object from information about all branches in tree
|
| Information about one branch is conceptually given as:
| parentnodeID, childnodeID, [length], [label]
|
| The function takes as input 2 to 4 separate lists containing:
| IDs of parents (internal nodes, so integer values)
| ID of children (internal or leaf nodes, so integer or string)
| Length of branches (optional)
| Label of branches (optional)
|
| The four lists are assumed to have same length and be in same order (so index n in
| each list corresponds to same branch).
|
| Note: most IDs appear multiple times in lists
| Note 2: can be used as workaround so user can specify IDs for internal nodes
|
| from_leaves(leaflist) from builtins.type
| Constructor: star-tree object from list of leaves
|
| from_string(orig_treestring, transdict=None) from builtins.type
| Constructor: Tree object from tree-string in Newick format
|
| from_topology(topology) from builtins.type
| Constructor: Tree object from topology
|
| randtree(leaflist=None, ntips=None, randomlen=False, name_prefix='s') from builtins.type
| Constructor: tree with random topology from list of leaf names OR number of tips
|
| ----------------------------------------------------------------------
| Readonly properties defined here:
|
| parent_dict
| Lazy evaluation of _parent_dict when needed
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
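The `treedist_RF` entry above describes the Robinson-Foulds distance as a symmetric distance computed from the bipartitions of two trees. The block below is a minimal sketch of that idea, written independently of phylotreelib and not taken from its source. Trees are assumed to be nested tuples of leaf names, and the helper names `leaf_set`, `bipartitions` and `rf_distance` are hypothetical. Only non-trivial (internal) bipartitions are counted here, which may differ from how the library normalises.

```python
def leaf_set(tree):
    """Return the frozenset of leaf names below a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Return the set of non-trivial bipartitions induced by internal branches.

    Each bipartition is a frozenset holding both sides of the split, so the
    result does not depend on where the nested tuple happens to be rooted.
    """
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        for child in node:
            side = leaf_set(child)
            other = all_leaves - side
            # Skip trivial splits that cut off a single leaf
            if len(side) > 1 and len(other) > 1:
                splits.add(frozenset([side, other]))
            visit(child)

    visit(tree)
    return splits


def rf_distance(tree1, tree2, normalise=False):
    """Count bipartitions present in exactly one of the two trees."""
    b1, b2 = bipartitions(tree1), bipartitions(tree2)
    dist = len(b1 ^ b2)  # symmetric difference
    if normalise:
        total = len(b1) + len(b2)
        return dist / total if total else 0.0
    return dist


t1 = ("Gorilla", ("Human", ("Chimpanzee", "Bonobo")))
t2 = ("Gorilla", ("Chimpanzee", ("Human", "Bonobo")))
print(rf_distance(t1, t2), rf_distance(t1, t2, normalise=True))
```

With phylotreelib itself, the corresponding call would be `tree1.treedist_RF(tree2)` on two Tree objects, using the signature listed above.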
class TreeError(builtins.Exception)
| Method resolution order:
| TreeError
| builtins.Exception
| builtins.BaseException
| builtins.object
|
| Data descriptors defined here:
|
| __weakref__
| list of weak references to the object (if defined)
|
| ----------------------------------------------------------------------
| Methods inherited from builtins.Exception:
|
| __init__(self, /, *args, **kwargs)
| Initialize self. See help(type(self)) for accurate signature.
|
| ----------------------------------------------------------------------
| Static methods inherited from builtins.Exception:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
|
| ----------------------------------------------------------------------
| Methods inherited from builtins.BaseException:
|
| __delattr__(self, name, /)
| Implement delattr(self, name).
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| __reduce__(...)
| Helper for pickle.
|
| __repr__(self, /)
| Return repr(self).
|
| __setattr__(self, name, value, /)
| Implement setattr(self, name, value).
|
| __setstate__(...)
|
| __str__(self, /)
| Return str(self).
|
| with_traceback(...)
| Exception.with_traceback(tb) --
| set self.__traceback__ to tb and return self.
|
| ----------------------------------------------------------------------
| Data descriptors inherited from builtins.BaseException:
|
| __cause__
| exception cause
|
| __context__
| exception context
|
| __dict__
|
| __suppress_context__
|
| __traceback__
|
| args
class TreeSet(builtins.object)
| Class for storing and manipulating a number of trees, which all have the same leaves
|
| Methods defined here:
|
| __getitem__(self, index)
| Implements indexing of treeset.
|
| Simple index returns single tree.
| Slice returns TreeSet object with selected subset of trees
|
| __init__(self)
| Initialize self. See help(type(self)) for accurate signature.
|
| __iter__(self)
| Returns fresh iterator object allowing iteration over Treeset (which is itself an iterable)
|
| __len__(self)
|
| addtree(self, tree)
| Adds Tree object to Treeset object
|
| addtreeset(self, treeset)
| Adds all trees in TreeSet object to this TreeSet object
|
| newick(self, printdist=True, printlabels=True)
| Returns newick format tree as a string
|
| nexus(self, printdist=True, printlabels=True, translateblock=True)
| Returns nexus format tree as a string
|
| rootmid(self)
| Performs midpoint rooting on all trees in TreeSet
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| TreeSetIterator = <class 'phylotreelib.TreeSet.TreeSetIterator'>
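The `__getitem__` docstring above states that a simple index returns a single tree, while a slice returns a new TreeSet. The block below is a minimal sketch of how that convention is commonly implemented in Python. `MiniTreeSet` is a hypothetical stand-in that stores plain strings, and is not phylotreelib code.

```python
class MiniTreeSet:
    """Hypothetical container illustrating index-vs-slice behaviour."""

    def __init__(self):
        self.treelist = []

    def addtree(self, tree):
        self.treelist.append(tree)

    def __len__(self):
        return len(self.treelist)

    def __iter__(self):
        # A fresh iterator on every call, so the set can be looped over repeatedly
        return iter(self.treelist)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slice: wrap the selected trees in a new container of the same type
            subset = MiniTreeSet()
            for tree in self.treelist[index]:
                subset.addtree(tree)
            return subset
        # Simple index: return the single stored item
        return self.treelist[index]


ts = MiniTreeSet()
for i in range(5):
    ts.addtree(f"tree{i}")
print(ts[1], list(ts[1:3]))
```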
class TreeSummary(builtins.object)
| TreeSummary(interner=None)
|
| Class summarizing bipartitions and branch lengths (but not topologies) from many trees
|
| Methods defined here:
|
| __init__(self, interner=None)
| TreeSummary constructor. Initializes relevant data structures
|
| __len__(self)
|
| add_branchid(self)
| Adds attribute .branchID to all bipartitions in .bipartsummary
| External bipartitions are labeled with the leafname.
| Internal bipartitions are labeled with consecutive numbers by decreasing frequency
|
| add_tree(self, curtree, weight=1.0)
| Add tree object to treesummary, update all relevant bipartition summaries
|
| contree(self, cutoff=0.5, allcompat=False, labeldigits=3)
| Returns a consensus tree built from selected bipartitions
|
| log_clade_credibility(self, topology)
| Compute log clade credibility for topology (sum of log(freq) for all branches)
|
| max_clade_cred_tree(self, filelist, skiplist=None, labeldigits=3)
| Find and return maximum clade credibility tree.
| Note: this version based on external treefile (and bipartsummary).
| skiplist optionally gives the number of trees to skip in each file (burn-in)
|
| update(self, other)
| Merge this object with external treesummary
|
| ----------------------------------------------------------------------
| Readonly properties defined here:
|
| bipartsummary
| Property method for lazy evaluation of freq, var, and sem for branches
|
| sorted_biplist
| Return list of bipartitions.
| First external (leaf) bipartitions sorted by leafname.
| Then internal bipartitions sorted by freq
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
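The TreeSummary entries above describe bipartition frequencies, a consensus tree built from bipartitions above a cutoff, and a log clade credibility defined as the sum of log(freq) over all branches. The block below is a minimal sketch of those calculations, assuming each topology is simply a set of hashable bipartition labels. The function names and the string labels are hypothetical, and whether the library's cutoff comparison is strict is not stated in the listing above.

```python
import math
from collections import Counter


def bipart_freqs(topologies, weights=None):
    """Return {bipartition: weighted frequency} over a list of topologies."""
    if weights is None:
        weights = [1.0] * len(topologies)
    counts = Counter()
    for topo, weight in zip(topologies, weights):
        for bip in topo:
            counts[bip] += weight
    total = sum(weights)
    return {bip: count / total for bip, count in counts.items()}


def log_clade_credibility(topology, freqs):
    """Sum of log(frequency) over all bipartitions in the topology."""
    return sum(math.log(freqs[bip]) for bip in topology)


def majority_bipartitions(freqs, cutoff=0.5):
    """Bipartitions seen in more than `cutoff` of the trees (strict, by assumption)."""
    return {bip for bip, freq in freqs.items() if freq > cutoff}


# Four sampled topologies on leaves A-E, each given by its two internal splits
topologies = [
    {"AB|CDE", "ABC|DE"},
    {"AB|CDE", "ABC|DE"},
    {"AB|CDE", "ABD|CE"},
    {"AC|BDE", "ABC|DE"},
]
freqs = bipart_freqs(topologies)
best = max(topologies, key=lambda topo: log_clade_credibility(topo, freqs))
print(sorted(best), sorted(majority_bipartitions(freqs)))
```

Picking the sampled topology with the highest score, as in the last lines, mirrors the idea behind a maximum clade credibility tree.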
class Treefile(builtins.object)
| Treefile(filename)
|
| Factory for making Newick or Nexus treefile objects. Autodetects fileformat
|
| Static methods defined here:
|
| __new__(klass, filename)
| Create and return a new object. See help(type) for accurate signature.
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
class TreefileBase(builtins.object)
| TreefileBase(filename=None, filecontent=None)
|
| Abstract base-class for representing tree file objects.
|
| Methods defined here:
|
| __enter__(self)
| Implements context manager behaviour for TreefileBase types.
| Usage example:
| with pt.Newicktreefile(filename) as tf:
| mytree = tf.readtree()
| mytree.rootminvar()
|
| __exit__(self, type, value, traceback)
| Implements context manager behaviour for TreefileBase types.
| Usage example:
| with pt.Newicktreefile(filename) as tf:
| mytree = tf.readtree()
| mytree.rootminvar()
|
| __init__(self, filename=None, filecontent=None)
| Initialize self. See help(type(self)) for accurate signature.
|
| close(self)
| For explicit closing of Treefile before content exhausted
|
| get_treestring(self)
| Return next tree-string
|
| readtree(self)
| Reads one tree from file and returns as Tree object. Returns None when file is exhausted
|
| readtrees(self, discardprop=0.0)
| Reads trees from file and returns as TreeSet object. Can discard fraction of trees
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
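According to the listing above, treefile objects act as context managers and `readtree()` returns None once the file is exhausted. The block below is a minimal sketch of a read loop built on those two properties. `MiniTreefile` is a hypothetical stand-in that returns Newick strings from in-memory text rather than Tree objects, so the example runs without phylotreelib.

```python
import io


class MiniTreefile:
    """Hypothetical stand-in: one Newick string per call, None when exhausted."""

    def __init__(self, filecontent):
        self.handle = io.StringIO(filecontent)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.handle.close()
        return False  # do not suppress exceptions raised inside the with-block

    def readtree(self):
        chars = []
        while True:
            ch = self.handle.read(1)
            if ch == "":
                return None  # end of input: no further complete tree
            chars.append(ch)
            if ch == ";":
                return "".join(chars).strip()


def read_all(treefile):
    """Collect trees until readtree() signals exhaustion with None."""
    trees = []
    while (tree := treefile.readtree()) is not None:
        trees.append(tree)
    return trees


with MiniTreefile("(A,B);\n(A,(B,C));\n") as tf:
    print(read_all(tf))
```

The same loop shape should carry over to `pt.Newicktreefile` or `pt.Nexustreefile`, where `readtrees()` is the documented shortcut for collecting everything into a TreeSet.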
FUNCTIONS
main()
# # Placeholder: Insert test code here and run module in standalone mode
remove_comments(text, leftdelim, rightdelim=None)
Takes input string and strips away commented text, delimited by 'leftdelim' and 'rightdelim'.
Also deals with nested comments.
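As a rough illustration of what stripping nested comments involves, the block below tracks nesting depth while scanning the text. It is a sketch, not the library's `remove_comments` implementation: the name `strip_nested_comments` is hypothetical, both delimiters are required, and the `rightdelim=None` case from the signature above is not handled.

```python
def strip_nested_comments(text, leftdelim="[", rightdelim="]"):
    """Return text with delimited comments removed, including nested ones."""
    depth = 0   # how many comments are currently open
    kept = []
    i = 0
    while i < len(text):
        if text.startswith(leftdelim, i):
            depth += 1
            i += len(leftdelim)
        elif depth > 0 and text.startswith(rightdelim, i):
            depth -= 1
            i += len(rightdelim)
        else:
            if depth == 0:
                kept.append(text[i])  # keep only characters outside comments
            i += 1
    return "".join(kept)


print(strip_nested_comments("(A[outer [inner] still outer],B);"))
```

The default delimiters are the square brackets used for comments in NEXUS files. An unmatched right delimiter outside any comment is left in place in this sketch.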