Quickstart for beginners
2020-09-15
Amo (AmoAi V2.7.5) is a computational platform for analyses of bioligical data and applications in bioengineering. This tutorial provides its principles, architecture, and usage for users.
Protein sequence analysis & design
- Choose which computational tool to use
- Enter a job name (optional)
- Enter your name (optional)
- Enter your email address (required), a notice email will be sent to this address when job is completed
- Select the type of a given file. If the option of multiple aligned sequence is chosen, all analyses are based on the user-defined sequences. Otherwise, a multiple sequence alignment is achived by searching a single sequence provided by user against the database if the option of single sequence is selected
- Paste the sequence profile (FASTA format)
- Upload the sequence profile (FASTA format)
- Upload the associated PDB (optional). The inferred residue communities are mapped to the tertiary structure if the PDB is provided
- Offset (optional), set it if the starting index of the first amino acid is not 0
- Numer of mutants (optional). It is used to set the number of mutations that can be mutated together
- Databases
Protein sequence statistics
In the calculations, the distribution of similarities between pairwise sequences is computed from a generated multiple sequence alignment (MSA) of a given protein (left). The degree of conservation at each position (not include the gaps) in the MSA is measured by relative entropy (right), and it is to indicate how much an amino acid at a specific position could be conserved. The degree of conservation can also how an amino acid at an evolutionarily conserved position is relavent to functionally important sites of protein, moreover, mutations in these positions could be harmful to protein function.
Evolutionary signatures inferred from sequences
The coupled interactions between pairwise amino acids of the protein are analyzed by spectrum analysis and ranked by the eigenvalues. The reduced matrix of interactions are achived (left). In order to look into the networks of residues that play important roles in protein function, the residue communities are defined based on the eigenvalues and preserve the positive interactions among amino acids (right).
Protein single mutation
Complete single mutagenesis. The matrix that is computed by the SAEC method shows ΔE — the energy difference of each mutant sequence with each mutation τ at the ith site and wild-type sequence, negative values representing favourable while positive representing unfavourable mutations.
Protein coupled mutations
Based on the best-so-far sequence (designed), one can conduct a signle or multiple (coupled) mutants on the WT sequence, and the energy of the mutant sequence is computed for evaluating the mutant(s). As an example, each two amino acids are combined to be mutated together and the energy differences are to measure quanlity of the coupled mutants.
Protein sequence design
The energy trajectories of designing the sequence from the WT sequence, and the WT and mutant sequences are listed below.
WT sequence (energy: -147.990):
123456789|123456789|123456789|123456789|123456789|123
VCSEQAETGPCRAMISRWYFDVTEGKCAPFFYGGCGGNRNNFDTEEYCMAVCG
Mutant sequence (best-so-far, energy: -167.354):
123456789|123456789|123456789|123456789|123456789|123
VCSEPAETGPCRAMISRWYYDPKTGKCEPFLYSGCGGNGNNFETKEECEETCK