Library for the Development and Use of Phylogenetic Network Methods
The Alphabet module provides character-to-state mappings for phylogenetic data types, including DNA, RNA, protein, and codon sequences. It uses a generalized one-hot encoding based on binary conversions.
AlphabetMapping is a frozen dataclass that stores a name and a mapping dictionary.
| Attribute | Type | Description |
|---|---|---|
| name | str | Name of the alphabet mapping |
| mapping | dict[str, int] | Character to state mapping |
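A frozen dataclass with these two attributes can be sketched as follows. The class name `AlphabetMappingSketch` and the toy mapping are hypothetical stand-ins for illustration, not the library's own definition.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class AlphabetMappingSketch:
    """Hypothetical stand-in for AlphabetMapping; not the library's code."""

    name: str                 # name of the alphabet mapping
    mapping: dict[str, int]   # character to state mapping


# A made-up two-character alphabet, used only for this sketch.
toy = AlphabetMappingSketch(name="TOY", mapping={"A": 1, "B": 2})

# frozen=True blocks rebinding attributes after construction.
try:
    toy.name = "OTHER"
    rebind_blocked = False
except FrozenInstanceError:
    rebind_blocked = True
```

Note that `frozen=True` only prevents reassigning `name` and `mapping`; the dictionary object itself can still be mutated in place, so a custom mapping is best treated as read-only by convention.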
| Constant | Type | Description |
|---|---|---|
| DNA | AlphabetMapping | DNA nucleotide mapping (A, C, G, T, ambiguity codes) |
| RNA | AlphabetMapping | RNA nucleotide mapping (A, C, G, U, ambiguity codes) |
| PROTEIN | AlphabetMapping | Amino acid mapping (20+ amino acids) |
| CODON | AlphabetMapping | Codon mapping for coding sequences |
The DNA alphabet uses a generalized one-hot encoding scheme based on base-10 to binary conversions: each state is an integer whose binary digits mark the nucleotides that a character may represent. IUPAC ambiguity codes are supported.
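The idea behind such an encoding can be sketched with bit flags. This is an illustration under stated assumptions, not the library's mapping table: the values for A (1) and T (8) match the usage example in this document, while C = 2 and G = 4 are assumed, and the helper names `encode` and `one_hot` are hypothetical.

```python
# Assumed bit assignment: A = 1 and T = 8 match the documented example;
# C = 2 and G = 4 are assumptions made for this sketch.
BASE_BITS = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}

# Standard IUPAC ambiguity codes and the bases each one can stand for.
IUPAC_SETS = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def encode(symbol: str) -> int:
    """Return the integer state: the bitwise OR of all compatible bases."""
    bases = IUPAC_SETS.get(symbol, symbol)
    state = 0
    for base in bases:
        state |= BASE_BITS[base]  # KeyError for symbols outside the sketch
    return state


def one_hot(state: int, width: int = 4) -> list[int]:
    """Expand a state into a 0/1 vector ordered A, C, G, T."""
    return [(state >> i) & 1 for i in range(width)]


purine = encode("R")            # A or G
purine_vector = one_hot(purine)  # one entry per base
```

Under this assumed assignment, an unambiguous base sets exactly one bit, and an ambiguity code sets one bit per base it is compatible with, which is what makes the encoding a generalization of one-hot.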
AlphabetError is the error class for all errors relating to alphabet mappings.
snp_alphabet(ploidy): Generate an SNP alphabet mapping for a given ploidy level.
| Parameter | Type | Description |
|---|---|---|
| ploidy | int | Maximum ploidy value (e.g., 2 for diploid organisms) |
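A rough sketch of what such a mapping might contain is shown below. It is inferred only from the diploid usage example in this document ('0', '1', '2' map to themselves and '-' maps to 3); the function name `sketch_snp_mapping` is hypothetical, it returns a plain dictionary rather than an AlphabetMapping, and the real function may define additional symbols.

```python
def sketch_snp_mapping(ploidy: int) -> dict[str, int]:
    """Build a character-to-state dict for SNP counts 0..ploidy.

    Assumption: missing data ('-') takes the next free state, ploidy + 1,
    as in the documented diploid example.
    """
    if ploidy < 1:
        raise ValueError("ploidy must be a positive integer")
    # One state per possible allele count, keyed by its string form.
    mapping = {str(count): count for count in range(ploidy + 1)}
    mapping["-"] = ploidy + 1  # assumed state for missing data
    return mapping


diploid = sketch_snp_mapping(2)
tetraploid = sketch_snp_mapping(4)
```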
Alphabet is the class that maps characters to state values, which have partial likelihood values associated with them.
Alphabet(mapping): Initialize with a predefined or custom alphabet mapping.
| Parameter | Type | Description |
|---|---|---|
| mapping | AlphabetMapping | One of {DNA, RNA, PROTEIN, CODON} or a custom mapping |
map(char): Return the integer state that a character maps to.
| Parameter | Type | Description |
|---|---|---|
| char | str | A character from sequence data |
Raises AlphabetError if the character is undefined for this alphabet.
reverse_map(state): Return the character that maps to a given state.
| Parameter | Type | Description |
|---|---|---|
| state | int | A state value in the alphabet |
Raises AlphabetError if the state is undefined for this alphabet.
get_type(): Return the name of the alphabet type (e.g., "DNA", "PROTEIN").
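The forward and reverse lookups described above, including the error behavior, can be sketched with two dictionaries. `AlphabetSketch` and `AlphabetErrorSketch` are hypothetical names for this illustration, and the choice to keep the first character seen when several characters share a state is an assumption, not documented library behavior.

```python
class AlphabetErrorSketch(Exception):
    """Hypothetical stand-in for AlphabetError."""


class AlphabetSketch:
    """Hypothetical stand-in for Alphabet, covering only the two lookups."""

    def __init__(self, name: str, mapping: dict[str, int]) -> None:
        self._name = name
        self._forward = dict(mapping)
        # Assumption: if several characters share a state, keep the first.
        self._reverse: dict[int, str] = {}
        for char, state in mapping.items():
            self._reverse.setdefault(state, char)

    def map(self, char: str) -> int:
        """Return the state for a character, or raise if it is undefined."""
        try:
            return self._forward[char]
        except KeyError:
            raise AlphabetErrorSketch(
                f"{char!r} is undefined for alphabet {self._name}"
            ) from None

    def reverse_map(self, state: int) -> str:
        """Return the character for a state, or raise if it is undefined."""
        try:
            return self._reverse[state]
        except KeyError:
            raise AlphabetErrorSketch(
                f"state {state} is undefined for alphabet {self._name}"
            ) from None


# Toy mapping in which 'N' and 'X' share state 15.
toy = AlphabetSketch("TOY", {"A": 1, "T": 8, "N": 15, "X": 15})

try:
    toy.map("?")
    undefined_char_raises = False
except AlphabetErrorSketch:
    undefined_char_raises = True
```

Wrapping the `KeyError` in a dedicated exception lets callers catch alphabet problems separately from unrelated dictionary errors elsewhere in their code.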
from PhyNetPy.Alphabet import Alphabet, DNA, RNA, PROTEIN, snp_alphabet
# Create a DNA alphabet
dna_alpha = Alphabet(DNA)
# Map characters to states
state_a = dna_alpha.map('A') # Returns 1
state_t = dna_alpha.map('T') # Returns 8
state_x = dna_alpha.map('X') # Returns 15 (any nucleotide)
# Reverse mapping
char = dna_alpha.reverse_map(1) # Returns 'A'
# Get alphabet type
print(dna_alpha.get_type()) # "DNA"
# Create SNP alphabet for diploid organisms
snp_alpha = snp_alphabet(ploidy=2)
snp = Alphabet(snp_alpha)
# Map SNP values
snp.map('0') # 0
snp.map('1') # 1
snp.map('2') # 2
snp.map('-') # 3 (missing data)