Library for the Development and Use of Phylogenetic Network Methods
Information-criterion-based reticulation-count selection (AIC, BIC, AICc) and reticulation sweep helpers.
One row of the reticulation sweep: results for a single ``k``.

| Attribute | Description |
|---|---|
| k | Reticulation count this row was produced with. |
| best_log_lik | Best log-likelihood observed across seeds at this ``k`` (higher is better for MPL). |
| all_log_liks | Per-seed log-likelihoods, in the order the seeds were evaluated. Useful for diagnosing multimodality. |
| n_params | Effective parameter count used for AIC/BIC. |
| aic | ``2 * n_params - 2 * best_log_lik``. |
| bic | ``n_params * ln(data_size) - 2 * best_log_lik``. |
| elapsed_s | Wall-clock time spent on this ``k`` (all seeds). |
| delta_log_lik | Marginal gain over the previous ``k`` row; set by :func:`reticulation_sweep` after all rows are built. ``None`` for the first row. |
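The AIC and BIC formulas above can be checked by hand with a few lines of Python. This is a minimal sketch, not the library's code: the helper names are hypothetical, the parameter count is assumed to be ``base_params + k * params_per_reticulation``, and the AICc shown is the standard small-sample correction, which may differ from whatever form the library applies.

```python
import math


def n_params(k, base_params=0, params_per_reticulation=3):
    """Assumed parameter count: backbone plus a fixed cost per reticulation."""
    return base_params + k * params_per_reticulation


def aic(p, log_lik):
    """AIC as stated in the row attributes: 2p - 2 logL."""
    return 2 * p - 2 * log_lik


def bic(p, log_lik, data_size):
    """BIC as stated in the row attributes: p ln(n) - 2 logL."""
    return p * math.log(data_size) - 2 * log_lik


def aicc(p, log_lik, data_size):
    """Standard small-sample AICc (assumption: not confirmed by the source)."""
    denom = data_size - p - 1
    if denom <= 0:
        # The correction is undefined here; returning inf is a choice made
        # for this sketch so that such models are never selected.
        return math.inf
    return aic(p, log_lik) + (2 * p * (p + 1)) / denom


p = n_params(k=2)  # 6 parameters with the defaults above
print(aic(p, -100.0))
print(bic(p, -100.0, data_size=100))
print(aicc(p, -100.0, data_size=100))
```

Lower values are better for all three scores, and BIC penalises each extra parameter more heavily than AIC once ``data_size`` exceeds about 7 (``ln(n) > 2``).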
Container for reticulation-sweep rows with selection/plotting helpers. Built by :func:`reticulation_sweep`. Exposes :meth:`best_by` for criterion-based ``k`` recommendation, :meth:`print_summary` for a console table, :meth:`save_csv` for a machine-readable dump, and :meth:`plot` for a visual report.

| Attribute | Description |
|---|---|
| rows | Ordered list of :class:`SweepRow`, one per ``k``. |
| data_size | ``n`` used in the BIC formula. |
| params_per_reticulation | Parameters added by each reticulation. |
| base_params | Backbone (k=0) parameter count. |
| log_lik_label | Human-readable name for the y-axis / summary column (e.g. ``"log-pseudo-likelihood"``). |
Return the recommended ``k`` under the given criterion.
| Parameter | Type | Description |
|---|---|---|
| criterion | | Selection criterion: one of ``"logL"``, ``"aic"``, ``"bic"``, or ``"elbow"``, as described in the list that follows. |

* ``"logL"``: ``argmax_k logL(k)``. Ignores parsimony, so it picks the largest ``k`` whenever the likelihood keeps improving with ``k``.
* ``"aic"``: ``argmin_k AIC(k)``.
* ``"bic"``: ``argmin_k BIC(k)``.
* ``"elbow"``: smallest ``k`` at which the next-step gain in log-likelihood falls below ``elbow_tol_frac`` of the maximum gain across the sweep. This matches the classic "knee plot" heuristic.
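The four criteria can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the library's :meth:`best_by`: ``Row`` and ``recommend_k`` are hypothetical names, the default ``elbow_tol_frac=0.1`` is made up for the example, and the fallbacks (first ``k`` when no gain is positive, last ``k`` when the curve never flattens) are choices of this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical stand-in for one sweep row."""
    k: int
    log_lik: float
    n_params: int


def recommend_k(rows, criterion, data_size, elbow_tol_frac=0.1):
    """Return the k selected by the named criterion (illustrative only)."""
    if not rows:
        raise ValueError("sweep is empty")
    if criterion == "logL":
        return max(rows, key=lambda r: r.log_lik).k
    if criterion == "aic":
        return min(rows, key=lambda r: 2 * r.n_params - 2 * r.log_lik).k
    if criterion == "bic":
        return min(
            rows,
            key=lambda r: r.n_params * math.log(data_size) - 2 * r.log_lik,
        ).k
    if criterion == "elbow":
        # gains[i] is the improvement obtained by moving from rows[i] to rows[i + 1].
        gains = [b.log_lik - a.log_lik for a, b in zip(rows, rows[1:])]
        max_gain = max(gains, default=0.0)
        if max_gain <= 0:
            return rows[0].k  # sketch fallback: nothing ever improved
        for row, gain in zip(rows, gains):
            if gain < elbow_tol_frac * max_gain:
                return row.k
        return rows[-1].k  # sketch fallback: curve never flattened
    raise ValueError(f"unrecognised criterion: {criterion!r}")


rows = [Row(0, -120.0, 0), Row(1, -100.0, 3), Row(2, -99.0, 6), Row(3, -98.8, 9)]
for name in ("logL", "aic", "bic", "elbow"):
    print(name, recommend_k(rows, name, data_size=50))
```

On this toy data ``"logL"`` selects ``k = 3`` while the three parsimony-aware criteria agree on ``k = 1``, since the gain from a second reticulation (1.0 log-likelihood unit) is small next to the first (20.0).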
Raises ``ValueError`` if the sweep is empty or ``criterion`` is unrecognised.

Pretty-print the sweep table, deltas, and recommendations.
| Parameter | Type | Description |
|---|---|---|
| file | | Optional text stream to print to. Defaults to ``sys.stdout``. |
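The ``file`` parameter follows the same pattern as the built-in ``print``: any text stream works, which makes the table easy to capture in tests or logs. The sketch below uses a hypothetical ``print_sweep_table`` helper with a made-up column layout; it does not reproduce the library's actual output format.

```python
import io
import sys


def print_sweep_table(rows, file=None):
    """Print (k, log-likelihood) pairs plus the gain over the previous row."""
    out = file if file is not None else sys.stdout
    print(f"{'k':>3} {'logL':>10} {'delta':>8}", file=out)
    prev = None
    for k, log_lik in rows:
        # The first row has no predecessor, so its delta cell stays blank.
        delta = "" if prev is None else f"{log_lik - prev:+.2f}"
        print(f"{k:>3} {log_lik:>10.2f} {delta:>8}", file=out)
        prev = log_lik


# Capture the table in memory instead of writing to the console.
buf = io.StringIO()
print_sweep_table([(0, -120.0), (1, -100.0)], file=buf)
print(buf.getvalue(), end="")
```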
Write the sweep rows to ``path`` as CSV, one row per ``k``. Parent directories are created on demand. Per-seed scores are joined into a single ``;``-separated field so the CSV stays flat.
| Parameter | Type | Description |
|---|---|---|
| path | | Output CSV path. |
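The flat layout described above (one row per ``k``, per-seed scores joined with ``;``, parent directories created on demand) can be sketched with the standard library alone. The function name and the column names here are hypothetical and only illustrate the idea; the library's real CSV has more columns.

```python
import csv
import tempfile
from pathlib import Path


def write_sweep_csv(rows, path):
    """Write (k, per-seed scores) pairs to a flat CSV file."""
    path = Path(path)
    # Create missing parent directories instead of failing.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["k", "best_log_lik", "all_log_liks"])
        for k, seed_scores in rows:
            # Join per-seed scores into one field so each k stays on one line.
            joined = ";".join(str(s) for s in seed_scores)
            writer.writerow([k, max(seed_scores), joined])


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "reports" / "sweep.csv"  # "reports" does not exist yet
    write_sweep_csv([(0, [-120.0, -121.5]), (1, [-100.0, -99.25])], out)
    text = out.read_text()
print(text)
```

Using ``;`` inside the field avoids clashing with the ``,`` column delimiter, so the file can be read back with any CSV reader and the seed scores recovered with a single ``split(";")``.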
Render the log-likelihood / AIC / BIC curves versus ``k``. Writes a PNG to ``path`` (if provided) and/or opens an interactive window (``show=True``). Each recommended ``k`` is marked with a vertical dashed line colour-matched to its criterion.
| Parameter | Type | Description |
|---|---|---|
| path | | Optional PNG output path. |
| show | | When True, also display an interactive window. |
| title | | Optional figure title. |
Run ``search_fn`` over each ``k in k_values`` and summarize.
| Parameter | Type | Description |
|---|---|---|
| search_fn | | Callable ``f(k, seed) -> float`` that performs one search with ``max_reticulations == k`` and returns the best log-likelihood found. The caller is responsible for constructing a fresh search object / starting network for each invocation if that is appropriate for the method. |
| k_values | | Reticulation counts to sweep over (e.g. ``range(0, 4)``). |
| seeds | | RNG seeds to run at each ``k``. If multiple seeds are provided, the best (highest) log-likelihood across seeds is taken as the representative score for that ``k``. |
| data_size | | ``n`` used in BIC (``p ln(n) - 2 logL``). For MPL this is typically ``len(gene_trees.trees)``. |
| params_per_reticulation | | Number of free parameters each additional reticulation contributes. For MPL, 3 (one gamma + two new branch lengths) is a sensible default; 1 (gamma only) is the most conservative choice. |
| base_params | | Parameters attributable to the backbone tree, commonly ``2 * n_taxa - 3`` for an unrooted binary tree. Pass ``0`` to ignore them: only differences across ``k`` matter for the AIC/BIC comparison. |
| log_lik_label | | Y-axis label, e.g. ``"log-pseudo-likelihood"``. |
| progress | | Print a one-line progress update per search. |
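The overall flow described by these parameters (run every seed at every ``k``, keep the best score, score it with AIC/BIC, then fill in the deltas) can be sketched as follows. ``toy_sweep`` and ``toy_search`` are hypothetical: the loop is a simplified reading of the parameter descriptions, and the search function is a deterministic formula standing in for a real network search.

```python
import math
import time


def toy_sweep(search_fn, k_values, seeds, data_size,
              params_per_reticulation=3, base_params=0):
    """Run search_fn(k, seed) for every k and seed; return one dict per k."""
    rows = []
    for k in k_values:
        start = time.perf_counter()
        scores = [search_fn(k, seed) for seed in seeds]
        best = max(scores)  # highest log-likelihood represents this k
        p = base_params + k * params_per_reticulation
        rows.append({
            "k": k,
            "best_log_lik": best,
            "all_log_liks": scores,
            "n_params": p,
            "aic": 2 * p - 2 * best,
            "bic": p * math.log(data_size) - 2 * best,
            "elapsed_s": time.perf_counter() - start,
            "delta_log_lik": None,
        })
    # Deltas are filled in only after all rows exist; the first row keeps None.
    for prev, cur in zip(rows, rows[1:]):
        cur["delta_log_lik"] = cur["best_log_lik"] - prev["best_log_lik"]
    return rows


def toy_search(k, seed):
    """Stand-in for a real search: improves with k, varies slightly by seed."""
    return -100.0 / (k + 1) - 0.5 * seed


sweep_rows = toy_sweep(toy_search, k_values=range(0, 3), seeds=[0, 1], data_size=20)
for row in sweep_rows:
    print(row["k"], row["best_log_lik"], row["aic"], row["delta_log_lik"])
```

Because ``search_fn`` receives both ``k`` and ``seed``, a real callable can build a fresh search object per call and seed its RNG explicitly, which keeps the runs independent and reproducible.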