Library for the Development and Use of Phylogenetic Network Methods
Data matrix storage and reduction for sequence alignments with unique site pattern compression.
This exception is raised when an error occurs either while parsing data into the matrix object or during any operation on it.
Class that stores MSA data and reduces it to only the unique sites. The only reduction mechanism so far applies only to DNA; all other data types are simply stored in a 2D NumPy array. Accepts any data defined by the Alphabet class in Alphabet.py.
Takes a single MSA object, along with an Alphabet object of type DNA, RNA, PROTEIN, CODON, or USER. The default is DNA.
| Parameter | Type | Description |
|---|---|---|
| alignment | MSA | Multiple Sequence Alignment (MSA) object. |
| alphabet | Alphabet, optional | An alphabet for mapping characters to numerics. Defaults to Alphabet(DNA). |
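
To illustrate what "mapping characters to numerics" can look like, here is a minimal standalone sketch. It does not use the library's Alphabet class: the `DNA_CODES` table and the `encode_alignment` helper are hypothetical, and the actual state codes used by Alphabet(DNA) may differ.

```python
import numpy as np

# Hypothetical DNA state codes, for illustration only.
# The real Alphabet class may assign different values.
DNA_CODES = {"A": 0, "C": 1, "G": 2, "T": 3, "-": 4}

def encode_alignment(sequences):
    """Map equal-length sequences to a (taxa x sites) integer array."""
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    rows = [[DNA_CODES[ch] for ch in seq.upper()] for seq in sequences]
    return np.array(rows, dtype=np.int8)

# One row per taxon, one column per site.
encoded = encode_alignment(["ACGT", "ACGA", "A-GT"])
```

Each row of `encoded` corresponds to one taxon and each column to one alignment site, which is the layout the rest of this class's accessors assume.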
Stores and simplifies the MSA data.
Reduces the matrix of data by removing non-unique site patterns, and records the location and count of the unique site patterns.
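
The reduction described above can be sketched as follows. This is an illustrative standalone version, not the library's implementation: the function name `reduce_columns` and its return values are assumptions, and it keeps unique columns in order of first appearance, which the original may or may not do.

```python
import numpy as np

def reduce_columns(data):
    """Collapse duplicate columns of a (taxa x sites) array.

    Returns the unique columns (in first-appearance order), the index of
    the unique pattern for every original column, and per-pattern counts.
    """
    first_seen = {}   # column pattern -> index among the unique columns
    locations = []    # locations[j] = unique-pattern index of original column j
    counts = []       # counts[k] = number of original columns with pattern k
    for j in range(data.shape[1]):
        pattern = tuple(data[:, j].tolist())
        if pattern not in first_seen:
            first_seen[pattern] = len(counts)
            counts.append(0)
        k = first_seen[pattern]
        counts[k] += 1
        locations.append(k)
    # Dict keys preserve insertion order, so patterns come out as first seen.
    unique = np.array(list(first_seen), dtype=data.dtype).T
    return unique, locations, counts

data = np.array([[0, 1, 0, 3, 1],
                 [0, 1, 0, 2, 1],
                 [2, 1, 2, 3, 1]])
unique, locations, counts = reduce_columns(data)
```

Here the five original columns collapse to three patterns; the counts are what allow a per-pattern computation to be weighted back up to the full alignment.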
Returns the data point at row i, column j.
| Parameter | Type | Description |
|---|---|---|
| i | int | row index |
| j | int | column index |
Gets the character at row i, column j in the character matrix associated with the data.
| Parameter | Type | Description |
|---|---|---|
| i | int | row index |
| j | int | column index |
Retrieves the row index of the taxon named 'label'.
| Parameter | Type | Description |
|---|---|---|
| label | str | name of a taxon. |
Gets the array of characters for a given taxon.
| Parameter | Type | Description |
|---|---|---|
| label | str | the name of a taxon. |
Gets the numerical data for a given taxon with the name 'label'.
| Parameter | Type | Description |
|---|---|---|
| label | str | name of a taxon |
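
The label-based accessors amount to a lookup from taxon name to row index, followed by a row selection. A minimal sketch under assumed names (`taxa`, `row_of`, `row_index`, and `numeric_row` are all hypothetical and not part of the documented API):

```python
import numpy as np

# Hypothetical taxon labels, in the same order as the rows of `data`.
taxa = ["human", "chimp", "gorilla"]
row_of = {name: idx for idx, name in enumerate(taxa)}

# Hypothetical encoded alignment: one row per taxon, one column per site.
data = np.array([[0, 1, 2, 3],
                 [0, 1, 2, 0],
                 [0, 4, 2, 3]])

def row_index(label):
    """Return the row index for a taxon label, or raise KeyError."""
    try:
        return row_of[label]
    except KeyError:
        raise KeyError(f"unknown taxon label: {label!r}") from None

def numeric_row(label):
    """Return the numerical data row for a taxon label."""
    return data[row_index(label)]
```

Building the dictionary once keeps each lookup constant-time, rather than scanning the label list on every call.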
Returns the ith column of a data matrix, with 'sites' elements.
| Parameter | Type | Description |
|---|---|---|
| i | int | column index |
| data | np.ndarray | a matrix |
| sites | int | dimension of the column |
Returns the ith column of the data matrix.
| Parameter | Type | Description |
|---|---|---|
| i | int | column index |
Returns the number of unique sites in the MSA/Data.
Generates a count list that maps the ith distinct column to the number of times it appears in the original alignment matrix.
| Parameter | Type | Description |
|---|---|---|
| new_data | np.ndarray | The simplified data matrix, containing only distinct columns. |
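
One way to build such a count list is to compare each distinct column against every column of the original matrix. The sketch below is a standalone illustration using NumPy broadcasting; the function name `column_counts` and the explicit `original` argument are assumptions, since the documented method presumably reads the original alignment from the object itself.

```python
import numpy as np

def column_counts(original, new_data):
    """For each column of new_data, count its occurrences in original."""
    counts = []
    for i in range(new_data.shape[1]):
        # new_data[:, [i]] has shape (taxa, 1) and broadcasts across all
        # columns of original; a column matches only if every row agrees.
        matches = (original == new_data[:, [i]]).all(axis=0)
        counts.append(int(matches.sum()))
    return counts

original = np.array([[0, 1, 0, 3, 1],
                     [0, 1, 0, 2, 1],
                     [2, 1, 2, 3, 1]])
new_data = np.array([[0, 1, 3],
                     [0, 1, 2],
                     [2, 1, 3]])
site_counts = column_counts(original, new_data)
```

When `new_data` holds exactly the distinct columns of `original`, the counts sum to the number of sites in the original alignment.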
Get the character matrix from the matrix of alphabet states.
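
Recovering characters from alphabet states is the inverse of the encoding step. A minimal sketch, assuming a hypothetical `DNA_CODES` table with contiguous codes starting at 0 (the real Alphabet class may map states differently):

```python
import numpy as np

# Hypothetical DNA state codes, for illustration only.
DNA_CODES = {"A": 0, "C": 1, "G": 2, "T": 3, "-": 4}

# Lookup array whose position k holds the character for state code k.
CODE_TO_CHAR = np.array(sorted(DNA_CODES, key=DNA_CODES.get))

states = np.array([[0, 1, 2, 3],
                   [0, 4, 2, 3]])

# Integer-array indexing decodes the whole matrix in one step.
chars = CODE_TO_CHAR[states]
second_sequence = "".join(chars[1])
```

Indexing the lookup array with the full state matrix avoids a Python-level loop over every cell.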
Get the number of taxa represented in this matrix.
Get the name of the taxon associated with the row of data at 'index'.
| Parameter | Type | Description |
|---|---|---|
| index | int | a row index |
Get the data type of this matrix.