Library for the Development and Use of Phylogenetic Network Methods
The GeneTrees module provides a container for managing collections of gene trees with support for cluster/split support calculations, consensus tree building, Robinson-Foulds distance computation, and ASTRAL integration.
Error class for all errors relating to gene trees.
Default method for sorting taxa labels into groups. Expects labels whose first two characters are numeric and whose third character is alphabetic.
Alternative naming rule that splits on underscore and returns the first part.
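The two naming rules can be pictured with a minimal sketch. The function names below are hypothetical stand-ins, not the module's own. For the default rule the text above only states the expected label format, so returning the upper-cased third character as the group key is an assumption.

```python
def sketch_default_rule(label: str) -> str:
    """Hypothetical stand-in for the default rule.

    Assumption: the group key is the third character, upper-cased.
    The documented part is only the format check (two digits, then a letter).
    """
    if len(label) < 3 or not label[:2].isdigit():
        raise ValueError(f"first two characters of {label!r} are not numeric")
    if not label[2].isalpha():
        raise ValueError(f"third character of {label!r} is not alphabetic")
    return label[2].upper()


def sketch_underscore_rule(label: str) -> str:
    """Split on underscore and return the first part, as described above."""
    return label.split("_")[0]


if __name__ == "__main__":
    print(sketch_default_rule("01aXYZ"))
    print(sketch_underscore_rule("human_gene1"))
```

Any callable with the same shape (one label in, one group key out) could presumably be passed as `naming_rule`.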
A container for a set of networks that are binary and represent gene trees.
Initialize a GeneTrees collection.
| Parameter | Type | Description |
|---|---|---|
| gene_tree_list | list[Network] | Initial list of gene trees to add |
| naming_rule | Callable | Function for mapping taxa names to groups |
Add a gene tree to the collection.
Create a subgenome mapping from the stored gene trees using the naming rule.
Aggregate support for all rooted clusters across the gene tree set.
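As an illustration of what aggregating rooted-cluster support means, here is a self-contained sketch that does not use the library's `Network` class. Trees are plain nested tuples of leaf labels, `clusters` and `cluster_support_sketch` are hypothetical names, and dropping the trivial root cluster is an assumption.

```python
from collections import Counter


def clusters(tree):
    """Return the rooted clusters (frozensets of leaf labels) of a nested-tuple tree."""
    found = set()

    def leaves_below(node):
        if isinstance(node, str):          # a leaf is just its label
            return frozenset([node])
        leafset = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leafset)                 # one cluster per internal node
        return leafset

    all_leaves = leaves_below(tree)
    found.discard(all_leaves)              # assumption: the root cluster is trivial
    return found


def cluster_support_sketch(trees, normalize=True):
    """Count how many trees contain each cluster; optionally divide by the tree count."""
    counts = Counter()
    for tree in trees:
        counts.update(clusters(tree))
    if normalize:
        return {cluster: n / len(trees) for cluster, n in counts.items()}
    return dict(counts)
```

With three gene trees of which two contain the cluster {A, B}, the normalized support of that cluster is 2/3.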
Aggregate support for unrooted splits (bipartitions), canonicalized to the smaller side of the split.
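Canonicalizing to the smaller side ensures that a cluster and its complement are counted as the same split. A minimal sketch follows. The function name is hypothetical, and the tie-break for equal-sized sides (the side whose sorted labels come first) is an assumption, since the text above does not say how ties are resolved.

```python
def canonical_split(side, all_taxa):
    """Return the canonical representative of the bipartition side | rest."""
    side = frozenset(side)
    other = frozenset(all_taxa) - side
    if len(side) != len(other):
        return min(side, other, key=len)    # smaller side wins
    # Assumption: break ties by comparing the sorted label lists.
    return min(side, other, key=sorted)


if __name__ == "__main__":
    taxa = {"A", "B", "C", "D", "E"}
    # Both descriptions of the same split give the same key.
    print(canonical_split({"A", "B", "C"}, taxa) == canonical_split({"D", "E"}, taxa))
```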
Compute support of each rooted cluster present in a reference tree.
Annotate the reference tree's internal edges with support values stored in the edge weight field.
Return clusters with support >= threshold.
Construct a majority-rule consensus tree using greedy compatibility.
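The greedy compatibility step can be sketched as follows, under the assumption that it works on a cluster-to-support dictionary like the one returned by `cluster_support`. The function names are hypothetical, and the sketch stops at selecting clusters rather than building a tree object. Two rooted clusters are compatible when they are disjoint or one contains the other.

```python
def compatible(a, b):
    """Two rooted clusters can coexist in one tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def greedy_consensus_clusters(support, threshold=0.5):
    """Pick clusters in order of decreasing support, skipping any that conflict."""
    accepted = []
    # Sort by support (highest first); sorted labels make the order deterministic.
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for cluster, freq in ranked:
        if freq < threshold:
            break                           # everything after this is below threshold
        if all(compatible(cluster, other) for other in accepted):
            accepted.append(cluster)
    return accepted
```

With a strict majority (support above 0.5) all surviving clusters are automatically compatible. The explicit check matters when the threshold admits clusters at exactly 0.5 or lower.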
Compute average Robinson-Foulds distance between each gene tree and a reference tree.
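For intuition, the averaged distance can be sketched on nested-tuple trees. This version compares rooted clusters and normalizes by the total number of clusters in both trees. Both choices are assumptions (the library may work on unrooted splits or normalize differently), and all names are hypothetical.

```python
def clusters(tree):
    """Non-trivial rooted clusters of a nested-tuple tree, as frozensets of labels."""
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        leafset = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leafset)
        return leafset

    found.discard(leaves_below(tree))       # drop the root cluster
    return found


def rf_distance_sketch(tree_a, tree_b, normalize=False):
    """Size of the symmetric difference of the two cluster sets."""
    ca, cb = clusters(tree_a), clusters(tree_b)
    diff = len(ca ^ cb)
    if not normalize:
        return diff
    total = len(ca) + len(cb)
    return diff / total if total else 0.0


def average_rf_sketch(gene_trees, reference, normalize=False):
    """Mean distance from each gene tree to the reference."""
    distances = [rf_distance_sketch(t, reference, normalize) for t in gene_trees]
    return sum(distances) / len(distances)
```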
Compute split-based concordance factor per internal edge of the reference.
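A split-based concordance factor can be read as the fraction of gene trees that contain the split induced by a reference edge. The sketch below assumes every gene tree has the same taxon set as the reference and uses nested tuples in place of `Network` objects. The names are hypothetical.

```python
def splits(tree):
    """Non-trivial unrooted splits of a nested-tuple tree, keyed by canonical side."""
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        leafset = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leafset)
        return leafset

    taxa = leaves_below(tree)
    result = set()
    for cluster in found:
        other = taxa - cluster
        if len(cluster) < 2 or len(other) < 2:
            continue                        # trivial split, carries no information
        # Canonical side: smaller one, ties broken by sorted labels (assumption).
        result.add(min(cluster, other, key=lambda s: (len(s), sorted(s))))
    return result


def concordance_sketch(gene_trees, reference):
    """Map each reference split to the fraction of gene trees that contain it."""
    gene_splits = [splits(tree) for tree in gene_trees]
    return {
        split: sum(split in gs for gs in gene_splits) / len(gene_trees)
        for split in splits(reference)
    }
```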
Infer a species tree using ASTRAL from the stored gene trees.
| Parameter | Type | Description |
|---|---|---|
| astral_jar_path | str | Path to the ASTRAL JAR file |
| mapping_rule | Callable | Function to map gene names to species |
| extra_args | list[str] | Additional ASTRAL command-line arguments |
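How these parameters might combine into an ASTRAL invocation is sketched below. This is not the library's implementation: the helper name and file paths are hypothetical, and the sketch only assembles the command using ASTRAL's `-i` (input gene trees) and `-o` (output tree) options without running it.

```python
def build_astral_command(jar_path, input_path, output_path, extra_args=None):
    """Assemble a `java -jar` command line for ASTRAL as a list of arguments."""
    command = [
        "java", "-jar", str(jar_path),
        "-i", str(input_path),     # file with one Newick gene tree per line
        "-o", str(output_path),    # where ASTRAL writes the species tree
    ]
    command.extend(extra_args or [])   # user-supplied flags go last
    return command


if __name__ == "__main__":
    # Hypothetical paths; pass the list to subprocess.run(cmd, check=True) to execute.
    cmd = build_astral_command("/path/to/astral.jar", "genes.tre", "species.tre")
    print(" ".join(cmd))
```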
Reconcile each gene tree against a species tree using LCA mapping and report total duplications and losses.
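LCA reconciliation maps each gene-tree node to the lowest species-tree node that contains all species below it. A node is a duplication when it maps to the same species node as one of its children, and losses are counted from the species-tree edges skipped between a node and its children. The sketch below applies the standard counting for binary trees to nested tuples. It is an illustration under those assumptions, not the library's code, and the names and the flat return shape are hypothetical.

```python
def species_depths(tree):
    """Map each species-tree node (identified by its leaf set) to its depth."""
    depth = {}

    def walk(node, d):
        if isinstance(node, str):
            leafset = frozenset([node])
        else:
            leafset = frozenset().union(*(walk(child, d + 1) for child in node))
        depth[leafset] = d
        return leafset

    walk(tree, 0)
    return depth


def lca(depth, species):
    """Smallest species-tree cluster that contains every species in the set."""
    return min((node for node in depth if species <= node), key=len)


def reconcile_sketch(gene_tree, species_tree, species_of=lambda g: g.split("_")[0]):
    """Count duplications and losses for one gene tree via LCA mapping."""
    depth = species_depths(species_tree)
    totals = {"duplications": 0, "losses": 0}

    def visit(node):
        if isinstance(node, str):
            s = frozenset([species_of(node)])
            return s, lca(depth, s)
        children = [visit(child) for child in node]
        s = frozenset().union(*(cs for cs, _ in children))
        here = lca(depth, s)
        is_dup = any(mapped == here for _, mapped in children)
        totals["duplications"] += is_dup
        for _, mapped in children:
            gap = depth[mapped] - depth[here]       # species edges between the two
            totals["losses"] += gap if is_dup else gap - 1
        return s, here

    visit(gene_tree)
    return totals
```

For the species tree ((A,B),C), a gene tree ((A_1,C_1),B_1) needs one duplication at the root and three losses, which is what the sketch reports.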
from PhyNetPy.GeneTrees import GeneTrees, external_naming
from PhyNetPy.NetworkParser import NetworkParser
# Load gene trees
parser = NetworkParser("gene_trees.nex")
trees = parser.get_all_networks()
# Create GeneTrees collection
gt = GeneTrees(trees, naming_rule=external_naming)
# Compute cluster support
support = gt.cluster_support(normalize=True)
for cluster, freq in support.items():
    if freq >= 0.5:
        print(f"{cluster}: {freq:.2%}")
# Build majority-rule consensus tree
consensus = gt.build_majority_rule_consensus_tree(threshold=0.5)
print(consensus.newick())
# Compute Robinson-Foulds distance to a reference
ref_tree = parser.get_network(0)
avg_rf = gt.rf_distance(ref_tree, normalize=True)
print(f"Average RF distance: {avg_rf:.4f}")
# Annotate reference tree with support values
gt.annotate_reference_support(ref_tree)
for edge in ref_tree.E():
    print(f"{edge}: support = {edge.get_weight():.2%}")
# Infer species tree with ASTRAL
species_tree = gt.astral("/path/to/astral.jar")
# Duplication-loss analysis
dl_summary = gt.duplication_loss_summary(species_tree)
print(f"Total duplications: {dl_summary['totals']['duplications']}")
print(f"Total losses: {dl_summary['totals']['losses']}")