evolocity - Evolutionary velocity with protein language models
API
Import evolocity as:
import evolocity as evo
After reading the data (evo.pp.featurize_fasta) or loading a built-in dataset (evo.datasets.*), the typical workflow consists of successive calls to preprocessing (evo.pp.*), analysis tools (evo.tl.*), and plotting (evo.pl.*) functions.
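The workflow described above can be sketched as a single function. Only evo.pp.featurize_fasta and the evo.pp / evo.tl / evo.pl namespaces are named on this page; the other call names below (evo.pp.neighbors, evo.tl.velocity_graph, evo.tl.velocity_embedding, evo.pl.velocity_embedding_grid, evo.pl.velocity_embedding_stream) follow the project's README as best recalled and should be treated as assumptions to check against the installed version. The wrapper name run_evolocity_workflow and its fasta_path argument are hypothetical. Imports are deferred into the function body so that defining it does not require evolocity or scanpy to be installed.

```python
def run_evolocity_workflow(fasta_path):
    """Sketch of a read -> preprocess -> analyze -> plot pipeline.

    Assumption: the evolocity call names below match the installed
    version; only evo.pp.featurize_fasta is confirmed by this page.
    """
    # Deferred imports: both packages are needed only when the function runs.
    import evolocity as evo
    import scanpy as sc

    # Read the FASTA file and compute language model embeddings.
    adata = evo.pp.featurize_fasta(fasta_path)

    # Preprocessing: build the sequence similarity neighborhood graph.
    evo.pp.neighbors(adata)

    # Analysis: compute velocity scores on the graph edges.
    evo.tl.velocity_graph(adata)

    # Compute a 2-D UMAP layout with scanpy, then project velocities into it.
    sc.tl.umap(adata)
    evo.tl.velocity_embedding(adata)

    # Plotting: velocities on a grid and as streamlines.
    evo.pl.velocity_embedding_grid(adata)
    evo.pl.velocity_embedding_stream(adata)

    return adata
```

Calling run_evolocity_workflow("sequences.fasta") with a real FASTA path would run the whole pipeline; featurization loads a protein language model, so the first call may be slow.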
Preprocessing (pp)
Featurization (language model embedding)
Embeds a list of sequences.
Embeds a FASTA file.
Landscape (nearest neighbors graph construction)
Constructs a sequence similarity neighborhood graph.
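To make the idea of a nearest neighbors graph concrete, here is a minimal, library-free sketch of a k-nearest-neighbors graph over toy 2-D embeddings. It is not evolocity's implementation, and the function name knn_graph, the Euclidean metric, and the toy data are all assumptions chosen for illustration; real language model embeddings have many more dimensions.

```python
import math


def knn_graph(embeddings, k):
    """Return {i: [indices of the k nearest other points]} by Euclidean distance."""
    graph = {}
    for i, a in enumerate(embeddings):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.dist(a, b), j) for j, b in enumerate(embeddings) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph


# Toy embeddings: one cluster near the origin, one near (5, 5).
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
graph = knn_graph(toy_embeddings, k=2)
```

Each point's first neighbor comes from its own cluster. Because k=2 exceeds the size of the second cluster minus one, its points also pick up one neighbor from the far cluster, which shows how the choice of k shapes graph connectivity.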
Tools (tl)
Velocity estimation
Computes velocity scores at each edge in the graph.
Projects the velocities into any embedding.
Pseudotime and trajectory inference
Computes terminal states (root and end points).
Computes pseudotime based on the evolocity graph.
Interpretation
Aligns and one-hot-encodes sequences.
Scores mutations by associated evolocity.
Runs a random walk on the evolocity graph.
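As an illustration of what a random walk on a directed, weighted graph involves, the sketch below walks a small toy graph whose edge weights stand in for transition preferences. It is a generic example, not evolocity's implementation: the function name random_walk, the dictionary-based graph format, and the toy weights are assumptions made for this sketch.

```python
import random


def random_walk(transitions, start, n_steps, seed=0):
    """Walk a weighted directed graph given as {node: {neighbor: weight}}.

    Stops early when the current node has no outgoing edges.
    """
    rng = random.Random(seed)  # seeded for reproducible walks
    path = [start]
    node = start
    for _ in range(n_steps):
        edges = transitions.get(node)
        if not edges:
            break  # reached a node with no outgoing edges
        neighbors = list(edges)
        weights = [edges[n] for n in neighbors]
        # Choose the next node with probability proportional to edge weight.
        node = rng.choices(neighbors, weights=weights, k=1)[0]
        path.append(node)
    return path


# Toy graph: weights favor A -> B, and every route ends at the sink D.
toy_transitions = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"C": 1.0},
    "C": {"D": 1.0},
    "D": {},
}
path = random_walk(toy_transitions, "A", n_steps=10)
```

In this toy graph every walk from "A" ends at the sink "D", which loosely mirrors how repeated walks along directed edges tend to accumulate at end points.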
Plotting (pl)
Also see scanpy’s plotting API for additional visualization functionality, including UMAP scatter plots.
Velocity embeddings
Scatter plot of velocities on the embedding.
Scatter plot of velocities on a grid.
Stream plot of velocities on the embedding.
Contour plot of pseudotime with velocity grid.
Mutation interpretation
Heat map of per-residue velocity scores.
Scatter plot of mutations.
Datasets
Influenza A nucleoprotein.
Eukaryotic cytochrome c.
Settings
Sets the resolution/size, styling, and format of figures.