PhyNetPy Documentation

Library for the Development and Use of Phylogenetic Network Methods

GeneTrees Module v2.0.0

The GeneTrees module provides a container for managing collections of gene trees with support for cluster/split support calculations, consensus tree building, Robinson-Foulds distance computation, and ASTRAL integration.

Author: Mark Kessler
Last Edit: 9/16/25
Source: GeneTrees.py

Exceptions

exception GeneTreeError(Exception)

Error class for all errors relating to gene trees.

Helper Functions

def phynetpy_naming(taxa_name: str) -> str

Default naming rule for sorting taxa labels into groups. Expects labels whose first two characters are numeric and whose third character is alphabetic.

Returns: str - Group key (uppercase letter)

def external_naming(taxa_name: str) -> str

Alternative naming rule that splits on underscore and returns the first part.
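
The two naming rules can be illustrated with standalone stand-ins. This is a minimal sketch, not the library's code: the `sketch_` function names are hypothetical, the choice of the third character as the group key is an assumption drawn from the description above, and `ValueError` stands in for `GeneTreeError`.

```python
def sketch_phynetpy_naming(taxa_name: str) -> str:
    """Group key for labels such as '01a': two digits, then a letter."""
    if (len(taxa_name) < 3
            or not taxa_name[:2].isdigit()
            or not taxa_name[2].isalpha()):
        # ValueError is a stand-in; the library defines GeneTreeError.
        raise ValueError(f"unexpected taxon label: {taxa_name!r}")
    # Assumption: the group key is the third character, uppercased.
    return taxa_name[2].upper()


def sketch_external_naming(taxa_name: str) -> str:
    """Return everything before the first underscore."""
    return taxa_name.split("_")[0]


print(sketch_phynetpy_naming("01a"))
print(sketch_external_naming("human_gene7"))
```

Any callable with the same `str -> str` shape could be passed as a naming rule, which is presumably why the constructor accepts a `Callable`.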

GeneTrees Class

class GeneTrees

A container for a set of networks that are binary and represent gene trees.

Constructor

__init__(self, gene_tree_list: list[Network] = None, naming_rule: Callable = phynetpy_naming)

Initialize a GeneTrees collection.

| Parameter | Type | Description |
|---|---|---|
| gene_tree_list | list[Network] | Initial list of gene trees to add |
| naming_rule | Callable | Function for mapping taxa names to groups |

Tree Management

add(self, tree: Network) -> None

Add a gene tree to the collection.

mp_allop_map(self) -> Dict[str, List[str]]

Create a subgenome mapping from the stored gene trees using the naming rule.
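
As a rough illustration of what such a mapping might look like, the sketch below groups leaf labels by the key a naming rule returns. It assumes the result maps each group key to the labels in that group; the function name and the sample labels are hypothetical, and the real method reads labels from the stored trees rather than from a list.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def sketch_subgenome_map(leaf_labels: Iterable[str],
                         naming_rule: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group leaf labels under the key produced by naming_rule."""
    groups = defaultdict(set)
    for label in leaf_labels:
        groups[naming_rule(label)].add(label)
    # Sort keys and members so the output is deterministic.
    return {key: sorted(members) for key, members in sorted(groups.items())}


labels = ["01a", "02a", "01b", "03b", "01a"]  # duplicate on purpose
mapping = sketch_subgenome_map(labels, lambda name: name[2].upper())
print(mapping)
```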

Support Calculations

cluster_support(self, include_trivial: bool = False, normalize: bool = True) -> Dict[FrozenSet[str], float]

Aggregate support for all rooted clusters across the gene tree set.

Returns: Dict[FrozenSet[str], float] - Map from each cluster to its frequency (when normalize is True) or its raw count

split_support(self, normalize: bool = True) -> Dict[FrozenSet[str], float]

Aggregate support for unrooted splits (bipartitions), canonicalized to the smaller side of the split.
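
The two support calculations can be sketched without the library by writing trees as nested tuples of leaf names. Everything here is an assumption made for illustration: the tuple encoding, the helper names, and the tie-break used when both sides of a split have the same size (the side whose sorted labels come first).

```python
from collections import Counter


def all_clusters(tree):
    """Return (leaf set, set of clusters) for a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    return walk(tree), found


def sketch_cluster_support(trees, include_trivial=False, normalize=True):
    """Count in how many trees each rooted cluster appears."""
    counts = Counter()
    for tree in trees:
        leaves, clusters = all_clusters(tree)
        for cluster in clusters:
            # Trivial clusters: single leaves and the full leaf set.
            trivial = len(cluster) == 1 or cluster == leaves
            if include_trivial or not trivial:
                counts[cluster] += 1
    if normalize and trees:
        return {c: n / len(trees) for c, n in counts.items()}
    return {c: float(n) for c, n in counts.items()}


def canonical_split(cluster, leaves):
    """Represent the split cluster | rest by its smaller side."""
    other = leaves - cluster
    if len(cluster) != len(other):
        return min(cluster, other, key=len)
    # Assumed tie-break: the side whose sorted labels come first.
    return min(cluster, other, key=sorted)


trees = [(("A", "B"), ("C", "D")),
         (("A", "B"), ("C", "D")),
         (("A", "C"), ("B", "D"))]
for cluster, freq in sketch_cluster_support(trees).items():
    print(sorted(cluster), round(freq, 3))
```

Canonicalizing to one side matters because a rooted cluster and its complement describe the same unrooted bipartition; without it the same split could be counted under two different keys.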

support_on_reference(self, ref_tree: Network, include_trivial: bool = False, normalize: bool = True) -> Dict[FrozenSet[str], float]

Compute support of each rooted cluster present in a reference tree.

annotate_reference_support(self, ref_tree: Network, include_trivial: bool = False, normalize: bool = True) -> None

Annotate the reference tree's internal edges with support values stored in the edge weight field.

Consensus and Distance

consensus_clusters(self, threshold: float = 0.5, include_trivial: bool = False) -> List[Set[str]]

Return clusters with support >= threshold.

build_majority_rule_consensus_tree(self, threshold: float = 0.5) -> Network

Construct a majority-rule consensus tree using greedy compatibility.
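
One common way to read "greedy compatibility" is: rank clusters by support, then accept each cluster only if it is disjoint from or nested with every cluster already accepted. The sketch below follows that reading on a plain support dictionary. It is an assumption about the approach, not the library's implementation, and the function names are hypothetical.

```python
def compatible(a, b):
    """Two clusters fit in one tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def sketch_greedy_consensus(support, threshold=0.5):
    """Pick clusters with support >= threshold, highest support first."""
    accepted = []
    # Ties are broken by sorted labels so the result is deterministic.
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for cluster, freq in ranked:
        if freq < threshold:
            break  # everything after this point has lower support
        if all(compatible(cluster, other) for other in accepted):
            accepted.append(cluster)
    return accepted


support = {
    frozenset("AB"): 0.6,
    frozenset("CD"): 0.6,
    frozenset("AC"): 0.5,   # conflicts with AB and CD
    frozenset("ABC"): 0.5,  # conflicts with CD
    frozenset("BD"): 0.2,   # below threshold
}
for cluster in sketch_greedy_consensus(support):
    print(sorted(cluster))
```

Clusters that occur in strictly more than half of the trees are always pairwise compatible, so the compatibility check only matters for clusters at exactly 0.5 or when a lower threshold is chosen.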

rf_distance(self, ref_tree: Network, normalize: bool = False) -> float

Compute the Robinson-Foulds distance from each gene tree to a reference tree and return the average.
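
For intuition, the sketch below computes the distance on trees that have already been reduced to their sets of non-trivial clusters. Two points are assumptions: the rooted-cluster form of the distance is used (the library may compare unrooted splits instead), and normalization divides by the total number of clusters in the two trees. The function name is hypothetical.

```python
def sketch_average_rf(ref_clusters, gene_tree_clusters, normalize=False):
    """Average symmetric-difference distance to a reference tree."""
    if not gene_tree_clusters:
        raise ValueError("need at least one gene tree")
    total = 0.0
    for clusters in gene_tree_clusters:
        # Clusters found in exactly one of the two trees.
        distance = len(ref_clusters ^ clusters)
        if normalize:
            # Assumed normalization: divide by the largest possible value.
            denom = len(ref_clusters) + len(clusters)
            distance = distance / denom if denom else 0.0
        total += distance
    return total / len(gene_tree_clusters)


AB, CD, AC, BD = (frozenset(s) for s in ("AB", "CD", "AC", "BD"))
ref = {AB, CD}
genes = [{AB, CD}, {AC, BD}]  # one identical tree, one fully conflicting
print(sketch_average_rf(ref, genes))
print(sketch_average_rf(ref, genes, normalize=True))
```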

gene_concordance_factors(self, ref_tree: Network) -> Dict[Tuple[str, str], float]

Compute a split-based concordance factor for each internal edge of the reference tree.
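
A simple interpretation is that the factor for an edge is the fraction of gene trees that contain the split induced by that edge. The sketch below uses that interpretation and assumes every gene tree has the same leaf set as the reference. The edge keys (`"n1"`, `"n2"`, `"n3"`) are made-up node names, and each split is given by its smaller side.

```python
def sketch_concordance(ref_edge_splits, gene_tree_splits):
    """Fraction of gene trees containing each reference edge's split.

    ref_edge_splits: {(parent_name, child_name): split}
    gene_tree_splits: one set of splits per gene tree
    """
    if not gene_tree_splits:
        raise ValueError("need at least one gene tree")
    n = len(gene_tree_splits)
    return {
        edge: sum(split in splits for splits in gene_tree_splits) / n
        for edge, split in ref_edge_splits.items()
    }


AB, DE, CD, AC = (frozenset(s) for s in ("AB", "DE", "CD", "AC"))
# Reference ((A,B),C,(D,E)) has two internal edges.
ref_edges = {("n1", "n2"): AB, ("n1", "n3"): DE}
gene_trees = [{AB, DE}, {AB, CD}, {AC, DE}, {AB, DE}]
for edge, factor in sketch_concordance(ref_edges, gene_trees).items():
    print(edge, factor)
```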

Species Tree Inference

astral(self, astral_jar_path: str, mapping_rule: Callable = external_naming, extra_args: List[str] = None) -> Network

Infer a species tree using ASTRAL from the stored gene trees.

| Parameter | Type | Description |
|---|---|---|
| astral_jar_path | str | Path to the ASTRAL JAR file |
| mapping_rule | Callable | Function to map gene names to species |
| extra_args | List[str] | Additional ASTRAL command-line arguments |
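
To show how these parameters could translate into an ASTRAL invocation, the sketch below builds a command line and the lines of a species mapping file without running anything. It relies on ASTRAL's `-i` (input trees), `-o` (output) and `-a` (name mapping) options and its `species:gene1,gene2` mapping format. How the library actually assembles its call is not documented here, so the function names and file names are hypothetical.

```python
def sketch_astral_command(jar_path, gene_tree_file, output_file,
                          mapping_file=None, extra_args=None):
    """Build the argument list for a java -jar ASTRAL call."""
    cmd = ["java", "-jar", jar_path, "-i", gene_tree_file, "-o", output_file]
    if mapping_file:
        cmd += ["-a", mapping_file]
    cmd += list(extra_args or [])
    return cmd


def sketch_mapping_lines(leaf_labels, mapping_rule):
    """One 'species:gene1,gene2' line per species."""
    groups = {}
    for label in leaf_labels:
        groups.setdefault(mapping_rule(label), []).append(label)
    return [f"{species}:{','.join(members)}"
            for species, members in sorted(groups.items())]


def first_part(name):
    """Stand-in mapping rule: text before the first underscore."""
    return name.split("_")[0]


print(sketch_mapping_lines(["human_1", "human_2", "chimp_1"], first_part))
print(sketch_astral_command("/path/to/astral.jar", "genes.tre", "species.tre",
                            mapping_file="map.txt", extra_args=["-t", "2"]))
```

A list built this way could be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with paths that contain spaces.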

Duplication-Loss Analysis

duplication_loss_summary(self, species_tree: Network, naming_rule: Callable = external_naming) -> Dict[str, Any]

Reconcile each gene tree against a species tree using LCA mapping and report total duplications and losses.

Returns: Dict - Contains "totals" and "per_tree" breakdowns

Usage Examples

from PhyNetPy.GeneTrees import GeneTrees, external_naming
from PhyNetPy.NetworkParser import NetworkParser

# Load gene trees
parser = NetworkParser("gene_trees.nex")
trees = parser.get_all_networks()

# Create GeneTrees collection
gt = GeneTrees(trees, naming_rule=external_naming)

# Compute cluster support
support = gt.cluster_support(normalize=True)
for cluster, freq in support.items():
    if freq >= 0.5:
        print(f"{cluster}: {freq:.2%}")

# Build majority-rule consensus tree
consensus = gt.build_majority_rule_consensus_tree(threshold=0.5)
print(consensus.newick())

# Compute Robinson-Foulds distance to a reference
ref_tree = parser.get_network(0)
avg_rf = gt.rf_distance(ref_tree, normalize=True)
print(f"Average RF distance: {avg_rf:.4f}")

# Annotate reference tree with support values
gt.annotate_reference_support(ref_tree)
for edge in ref_tree.E():
    print(f"{edge}: support = {edge.get_weight():.2%}")

# Infer species tree with ASTRAL
species_tree = gt.astral("/path/to/astral.jar")

# Duplication-loss analysis
dl_summary = gt.duplication_loss_summary(species_tree)
print(f"Total duplications: {dl_summary['totals']['duplications']}")
print(f"Total losses: {dl_summary['totals']['losses']}")
