Library for the Development and Use of Phylogenetic Network Methods
Time-reversible nucleotide substitution models (GTR, JC, K80, F81, HKY, K81, SYM, TN93).
Exception raised when a substitution model is ill-formed, either because its inputs violate the model's requirements or because a computation fails.
General superclass for time-reversible substitution models. Implements eigenvalue decomposition for computing e^(Q*t). Special-case subclasses attempt to improve on the time complexity of the matrix exponential operation. This class itself is the Generalized Time Reversible (GTR) model.
Create a GTR substitution model object with the required parameters.
| Parameter | Type | Description |
|---|---|---|
| base_freqs | list[float] | An array of floats of 'states' length. Must sum to 1. |
| transitions | list[float] | An array of floats that is ('states'^2 - 'states') / 2 long. |
| states | int, optional | Number of possible data states. Defaults to 4 (For DNA, {A, C, G, T}). |
SubstitutionModelError: If the base frequency or transition arrays are malformed.
Get the Q matrix.
Change any of the base frequencies/states/transitions parameters, and recompute the Q matrix accordingly.
| Parameter | Type | Description |
|---|---|---|
| params | dict[str, Any] | A mapping from GTR parameter names to their values. For the GTR superclass, names must be limited to ["states", "base frequencies", "transitions"]. The value for "states" is an int; the values for "base frequencies" and "transitions" are list[float]. |
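A minimal sketch of how such a parameter mapping could be validated before the Q matrix is recomputed. Only the three key names and their value types come from the table above; the helper `validate_gtr_params` and the stand-in exception class are hypothetical and are not the library's implementation.

```python
class SubstitutionModelError(Exception):
    """Stand-in for the library's exception so this sketch runs on its own."""


def _is_number_list(value):
    # A list whose entries are all ints or floats.
    return isinstance(value, list) and all(isinstance(x, (int, float)) for x in value)


# Allowed keys (from the table above) mapped to a check on the value.
ALLOWED = {
    "states": lambda v: isinstance(v, int) and v > 0,
    "base frequencies": _is_number_list,
    "transitions": _is_number_list,
}


def validate_gtr_params(params):
    """Hypothetical helper: reject unknown keys and badly typed values."""
    for name, value in params.items():
        if name not in ALLOWED:
            raise SubstitutionModelError(f"unknown parameter name: {name!r}")
        if not ALLOWED[name](value):
            raise SubstitutionModelError(f"bad value for {name!r}: {value!r}")
    return params


checked = validate_gtr_params({"base frequencies": [0.1, 0.2, 0.3, 0.4]})
```

Validating the whole mapping before touching any state keeps the model from ending up half-updated when one entry is invalid.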
SubstitutionModelError: If parameters are malformed/invalid.
Gets the base frequency and transition arrays.
Get the number of states for this substitution model.
Populate the normalized Q matrix with the correct values. Based on (1)
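As an illustration, the sketch below builds a normalized GTR rate matrix in plain Python. It rests on two assumptions that the text does not confirm: the transitions array lists state pairs in upper-triangular order (for DNA: AC, AG, AT, CG, CT, GT), and "normalized" means scaled so that the expected substitution rate at equilibrium is 1. The function name `build_q` is hypothetical.

```python
def build_q(base_freqs, transitions):
    """Return a normalized GTR rate matrix as a list of rows (sketch)."""
    n = len(base_freqs)
    if abs(sum(base_freqs) - 1.0) > 1e-9:
        raise ValueError("base frequencies must sum to 1")
    if len(transitions) != (n * n - n) // 2:
        raise ValueError("need (n^2 - n) / 2 transition values")

    q = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Off-diagonal rate: pair value times the frequency of the target state.
            q[i][j] = transitions[k] * base_freqs[j]
            q[j][i] = transitions[k] * base_freqs[i]
            k += 1
    for i in range(n):
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero

    # Assumed normalization: expected rate at equilibrium equals 1.
    rate = -sum(base_freqs[i] * q[i][i] for i in range(n))
    return [[x / rate for x in row] for row in q]


freqs = [0.1, 0.2, 0.3, 0.4]
Q = build_q(freqs, [1.0, 2.0, 1.0, 1.0, 2.0, 1.0])
```

With this construction, `freqs[i] * Q[i][j]` equals `freqs[j] * Q[j][i]` for every pair of states, which is the time-reversibility condition.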
Compute the matrix exponential e^(Q*t) and store the result. If the solution has already been computed and the Q matrix has not changed, simply return the stored value.
| Parameter | Type | Description |
|---|---|---|
| t | float | Generally going to be a positive number for phylogenetic applications. Represents time, in coalescent units or any other unit. |
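The compute-and-cache behavior can be sketched as follows. The documented class uses eigenvalue decomposition; to stay dependency-free, this sketch swaps in scaling and squaring with a truncated Taylor series instead. The class `CachedExpt`, the function `expm`, and the choice to key the cache on `t` are all assumptions made for illustration.

```python
import math


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(q, t, terms=20):
    """Approximate e^(Q*t) by scaling and squaring with a Taylor series."""
    n = len(q)
    # Halve Q*t until its largest absolute row sum is at most 0.5.
    norm = max(sum(abs(x * t) for x in row) for row in q)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scale = t / (2 ** squarings)
    m = [[x * scale for x in row] for row in q]

    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes m^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling
    return result


class CachedExpt:
    """Hypothetical helper: reuse e^(Q*t) until Q changes."""

    def __init__(self, q):
        self._q = q
        self._cache = {}

    def set_q(self, q):
        self._q = q
        self._cache.clear()  # stored results no longer match the new Q

    def expt(self, t):
        if t not in self._cache:
            self._cache[t] = expm(self._q, t)
        return self._cache[t]


# Check against the Jukes-Cantor closed form for a normalized Q.
jc_q = [[-1.0 if i == j else 1.0 / 3.0 for j in range(4)] for i in range(4)]
model = CachedExpt(jc_q)
p = model.expt(0.5)
expected_diag = 0.25 + 0.75 * math.exp(-4.0 * 0.5 / 3.0)
```

A second call to `model.expt(0.5)` returns the stored matrix without recomputation; calling `set_q` discards it.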
For DNA only (4 states, 6 transitions). Kimura 2-parameter model from (2), also known as K80. Parameterized by alpha and beta, the transversion and transition parameters, respectively. All base frequencies are assumed to be equal at 0.25. The transition array is [alpha, beta, alpha, alpha, beta, alpha].
Initialize K80 model.
| Parameter | Type | Description |
|---|---|---|
| alpha | float | Transversion parameter. |
| beta | float | Transition parameter. |
SubstitutionModelError: If alpha and beta do not sum to 1.
Change the alpha and/or beta parameters, and recompute the Q matrix accordingly.
| Parameter | Type | Description |
|---|---|---|
| params | dict[str, float] | A mapping from GTR parameter names to their values. For the K80 class, names must be limited to ["alpha", "beta"]. |
SubstitutionModelError: If parameters are malformed/invalid.
Compute the matrix exponential e^(Q*t) and store the result. If the solution has been computed already but the Q matrix has not changed, simply return the value. For K2P, a closed form solution for e^(Q*t) exists and we do not need to perform any exponentiation.
| Parameter | Type | Description |
|---|---|---|
| t | float | Generally going to be a positive number for phylogenetic applications. Represents time, in coalescent units or any other unit. |
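For reference, the standard K80 closed form can be written out as below. This is a sketch, not the library's code: it assumes the two rates are the raw off-diagonal entries of Q (no normalization step) and that states are ordered A, C, G, T. In this document's naming, `ts_rate` corresponds to beta and `tv_rate` to alpha; both function names are hypothetical.

```python
import math


def k80_probabilities(ts_rate, tv_rate, t):
    """Return (no change, transition, transversion) probabilities after time t."""
    e_tv = math.exp(-4.0 * tv_rate * t)
    e_mix = math.exp(-2.0 * (ts_rate + tv_rate) * t)
    same = 0.25 + 0.25 * e_tv + 0.5 * e_mix
    ts = 0.25 + 0.25 * e_tv - 0.5 * e_mix
    tv = 0.25 - 0.25 * e_tv
    return same, ts, tv


def k80_matrix(ts_rate, tv_rate, t):
    """Lay the three probabilities out as a 4x4 matrix (A, C, G, T)."""
    same, ts, tv = k80_probabilities(ts_rate, tv_rate, t)
    # A<->G and C<->T are transitions; every other change is a transversion.
    transition_partner = {0: 2, 1: 3, 2: 0, 3: 1}
    return [[same if i == j else ts if transition_partner[i] == j else tv
             for j in range(4)] for i in range(4)]


P = k80_matrix(ts_rate=0.7, tv_rate=0.3, t=0.4)
```

Each row contains one `same` entry, one `ts` entry, and two `tv` entries, so the rows sum to 1 for any `t`.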
For DNA only (4 states, 6 transitions). Formulated by Felsenstein in 1981, this substitution model assumes that all base frequencies are free, but all transition probabilities are equal. A closed form for the matrix (Q) exponential exists.
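The closed form mentioned above can be sketched as follows. The formula is the standard F81 result; the function name `f81_matrix` and the optional normalization (scaling Q so that one substitution is expected per unit time) are assumptions, since the text does not say how the library scales Q.

```python
import math


def f81_matrix(base_freqs, t, normalize=True):
    """Return e^(Q*t) for F81 without any matrix exponentiation (sketch)."""
    if len(base_freqs) != 4 or abs(sum(base_freqs) - 1.0) > 1e-9:
        raise ValueError("need exactly 4 base frequencies that sum to 1")
    # Assumed scaling: mu makes the expected rate at equilibrium equal to 1.
    mu = 1.0 / (1.0 - sum(p * p for p in base_freqs)) if normalize else 1.0
    decay = math.exp(-mu * t)
    # P[i][j] = decay * [i == j] + pi_j * (1 - decay)
    return [[decay * (i == j) + base_freqs[j] * (1.0 - decay) for j in range(4)]
            for i in range(4)]


f81_freqs = [0.1, 0.2, 0.3, 0.4]
P_f81 = f81_matrix(f81_freqs, 0.5)
```

As `t` grows, every row converges to the base frequencies, which is the equilibrium distribution of the model.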
Initialize the F81 model with a list of base frequencies of length 4. Transition probabilities will all be the same.
| Parameter | Type | Description |
|---|---|---|
| bases | list[float] | a list of 4 base frequency values. |
SubstitutionModelError: If the base frequencies given do not sum to 1 or if the list does not have exactly 4 elements.
Change the base frequency parameter, and recompute the Q matrix accordingly.
| Parameter | Type | Description |
|---|---|---|
| params | dict[str, list[float]] | A mapping from GTR parameter names to their values. For the F81 class, names must be limited to ["base frequencies"]. |
SubstitutionModelError: If the base frequencies given do not sum to 1 or the list is over 4 elements long.
For DNA only (4 states, 6 transitions). The Jukes-Cantor model is the simplest of all time-reversible models, in which all parameters (transitions, base frequencies) are assumed to be equal. A closed form for the matrix exponential, e^(Q*t), exists.
No arguments need to be provided, as the JC Q matrix is fixed.
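Because Q is fixed, the transition probabilities depend on `t` alone. A minimal sketch of the standard Jukes-Cantor closed form, assuming Q is scaled so that one substitution is expected per unit time (the function name `jc_probabilities` is hypothetical):

```python
import math


def jc_probabilities(t):
    """Return (probability of no change, probability of each specific change)."""
    decay = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay
    diff = 0.25 - 0.25 * decay
    return same, diff


same, diff = jc_probabilities(0.5)
```

Each row of e^(Q*t) holds `same` on the diagonal and `diff` in the other three positions.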
For DNA only (4 states, 6 transitions). Developed by Hasegawa et al. Transversion parameters are assumed to be equal and the transition parameters are assumed to be equal. Base frequency parameters are free.
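The equality constraint on the transversion and transition parameters can be checked with a small pattern matcher. This is an illustrative sketch: `follows_pattern` is a hypothetical helper, and the pattern string "abaaba" assumes the transition array is ordered AC, AG, AT, CG, CT, GT.

```python
def follows_pattern(values, pattern, tol=1e-12):
    """True if values sharing a pattern letter are equal (within tol).

    Values under different letters are allowed to coincide.
    """
    if len(values) != len(pattern):
        return False
    seen = {}
    for value, label in zip(values, pattern):
        if label in seen and abs(seen[label] - value) > tol:
            return False
        seen.setdefault(label, value)
    return True


HKY_PATTERN = "abaaba"  # a = transversions, b = transitions (assumed ordering)

ok = follows_pattern([1.0, 2.0, 1.0, 1.0, 2.0, 1.0], HKY_PATTERN)
bad = follows_pattern([1.0, 2.0, 1.0, 1.0, 3.0, 1.0], HKY_PATTERN)
```

Here `ok` is true and `bad` is false, because the second array gives the two transition slots different values.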
Initialize the HKY model with 4 base frequencies that sum to 1, and a transition array of length 6 with the equivalency pattern [a, b, a, a, b, a].
| Parameter | Type | Description |
|---|---|---|
| base_freqs | list[float] | Array of 4 values that sum to 1. |
| transitions | list[float] | Array of length 6 with the equivalency pattern [a, b, a, a, b, a]. |
SubstitutionModelError: If inputs are malformed in any way.
Change any of the base frequencies/transitions parameters, and recompute the Q matrix accordingly.
| Parameter | Type | Description |
|---|---|---|
| params | dict[str, list[float]] | A mapping from GTR parameter names to their values. For the HKY class, names must be limited to ["base frequencies", "transitions"]. |
SubstitutionModelError: If parameters are malformed/invalid.
For DNA only (4 states, 6 transitions). Developed by Kimura in 1981. Base frequencies are assumed to be equal, and transition probabilities are assumed to follow the pattern [a, b, c, c, b, a].
Initialize with a list of 6 transition probabilities that follow the pattern [a, b, c, c, b, a]. All base frequencies are assumed to be equal.
| Parameter | Type | Description |
|---|---|---|
| transitions | list[float] | A list of floats, 6 long. |
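Since the pattern [a, b, c, c, b, a] reads the same forwards and backwards, a validity check reduces to a palindrome test. The sketch below uses hypothetical helper names and exact float comparison for brevity; a real check would likely want a tolerance.

```python
def k81_transitions(a, b, c):
    """Expand three free values into the [a, b, c, c, b, a] array."""
    return [a, b, c, c, b, a]


def is_k81(transitions):
    """True if the array has length 6 and equals its own reverse."""
    values = list(transitions)
    return len(values) == 6 and values == values[::-1]


valid = is_k81(k81_transitions(1.0, 2.0, 3.0))
invalid = is_k81([1.0, 2.0, 3.0, 4.0, 2.0, 1.0])
```

Setting `a == c` yields the K80 array [alpha, beta, alpha, alpha, beta, alpha], so K80 is the two-parameter special case of this pattern.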
SubstitutionModelError: If the transition probabilities are not of correct pattern.
Change the transitions parameters, and recompute the Q matrix accordingly.
| Parameter | Type | Description |
|---|---|---|
| params | dict[str, list[float]] | A mapping from gtr parameter names to their values. For the K81 class, names must be limited to ["transitions"]. |
SubstitutionModelError: If the parameters are malformed/invalid.
For DNA only (4 states, 6 transitions). Developed by Zharkikh in 1994, this model assumes that all base frequencies are equal, and all transition probabilities are free.
Initialize with a list of 6 free transition probabilities. Base frequencies are all equal.
| Parameter | Type | Description |
|---|---|---|
| transitions | list[float] | A list of 6 transition rates. |
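With equal base frequencies, the resulting rate matrix is symmetric. The sketch below makes that concrete; `sym_q` is a hypothetical name, the pair ordering AC, AG, AT, CG, CT, GT is an assumption, and no normalization is applied.

```python
def sym_q(transitions):
    """Build an unnormalized SYM rate matrix from 6 pair values (sketch)."""
    if len(transitions) != 6:
        raise ValueError("SYM needs exactly 6 transition values")
    q = [[0.0] * 4 for _ in range(4)]
    pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    for (i, j), rate in zip(pairs, transitions):
        # Equal base frequencies (0.25) give the same rate in both directions.
        q[i][j] = q[j][i] = rate * 0.25
    for i in range(4):
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q


Q_sym = sym_q([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```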
SubstitutionModelError: If the transitions array is not of length 6.
Change the transitions parameter, and recompute the Q matrix accordingly.
| Parameter | Type | Description |
|---|---|---|
| params | dict[str, list[float]] | A mapping from gtr parameter names to their values. For the SYM class, names must be limited to ["transitions"]. |
SubstitutionModelError: If the transitions array is not of length 6.
For DNA only (4 states, 6 transitions). Developed by Tamura and Nei in 1993. Similar to HKY, but two different transition parameters are used instead of one: array indices 0, 2, 3, and 5 share one value, while indices 1 and 4 may differ. Base frequency parameters are free.
Initialize with a list of 4 free base frequencies, and 6 transitions that follow the pattern [a, b, a, a, c, a].
| Parameter | Type | Description |
|---|---|---|
| base_freqs | list[float] | A list of 4 base frequencies |
| transitions | list[float] | A list of 6 transitions that follow the above pattern. |
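A short sketch of how three free values expand into the [a, b, a, a, c, a] array. Both helper names are hypothetical. Under the assumed ordering AC, AG, AT, CG, CT, GT, index 1 is the purine transition (A/G) and index 4 is the pyrimidine transition (C/T).

```python
def tn93_transitions(tv, ts_purine, ts_pyrimidine):
    """Expand three free values into the [a, b, a, a, c, a] array."""
    return [tv, ts_purine, tv, tv, ts_pyrimidine, tv]


def reduces_to_hky(transitions):
    """True when both transition slots agree, which is the HKY pattern."""
    return transitions[1] == transitions[4]


tn93 = tn93_transitions(1.0, 2.0, 3.0)
hky_like = tn93_transitions(1.0, 2.0, 2.0)
```

Here `reduces_to_hky(hky_like)` is true while `reduces_to_hky(tn93)` is false, which shows HKY as the special case with a single transition value.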
SubstitutionModelError: If the transitions or base frequency lists are malformed.
Change any of the base frequencies/transitions parameters, and recompute the Q matrix accordingly.
| Parameter | Type | Description |
|---|---|---|
| params | dict[str, list[float]] | A mapping from GTR parameter names to their values. For the TN93 class, names must be limited to ["base frequencies", "transitions"]. |
SubstitutionModelError: If the new parameters are invalid.