evolocity - Evolutionary velocity with protein language models

API

Import evolocity as:

import evolocity as evo

After reading the data (evo.pp.featurize_fasta) or loading a built-in dataset (evo.datasets.*), the typical workflow consists of successive calls to preprocessing (evo.pp.*), analysis tools (evo.tl.*), and plotting (evo.pl.*) functions.
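The read, preprocess, analyze, plot sequence described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive pipeline: it uses only functions from the listings on this page plus scanpy's `sc.tl.umap` to compute a two-dimensional layout. The wrapper name `run_workflow` and its argument `fasta_fname` are hypothetical, and every call other than `basis="umap"` relies on default arguments. Imports are deferred into the function so that defining it does not require the packages to be installed.

```python
def run_workflow(fasta_fname):
    """Sketch of the read -> pp -> tl -> pl sequence (hypothetical wrapper)."""
    # Deferred imports: evolocity and scanpy must be installed to call this.
    import evolocity as evo
    import scanpy as sc

    # Read the data: embed every sequence in the FASTA file.
    adata = evo.pp.featurize_fasta(fasta_fname)

    # Preprocessing: build the sequence similarity neighborhood graph.
    evo.pp.neighbors(adata)

    # Tools: compute velocity scores at each edge of the graph.
    evo.tl.velocity_graph(adata)

    # Compute a UMAP layout with scanpy, then project velocities into it.
    sc.tl.umap(adata)
    evo.tl.velocity_embedding(adata, basis="umap")

    # Plotting: stream plot of velocities on the embedding.
    evo.pl.velocity_embedding_stream(adata, basis="umap")
    return adata
```

Calling `run_workflow("sequences.fasta")` (a placeholder file name) would run the language model embedding, which may take some time on large inputs.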

Preprocessing (pp)

Featurization (language model embedding)

`pp.featurize_seqs`(seqs[, model_name, mkey, …])	Embeds a list of sequences.
`pp.featurize_fasta`(fname[, model_name, …])	Embeds a FASTA file.
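The two featurization entry points differ only in their input: a list of sequences or a FASTA file name. The sketch below picks between them by input type. The helpers `choose_featurizer` and `embed` are hypothetical and not part of evolocity, and treating any plain string as a file name is an assumption of this example.

```python
def choose_featurizer(source):
    """Return the name of the pp function matching `source` (hypothetical helper)."""
    # Assumption: a single string is a FASTA file name; anything else
    # is treated as a list of sequences.
    if isinstance(source, str):
        return "featurize_fasta"
    return "featurize_seqs"


def embed(source, **kwargs):
    """Embed a FASTA file or a list of sequences (hypothetical helper)."""
    # Deferred import: evolocity must be installed to call this.
    import evolocity as evo

    featurize = getattr(evo.pp, choose_featurizer(source))
    # Extra keyword arguments (for example model_name) are passed through.
    return featurize(source, **kwargs)
```

For example, `embed(["MKTAYIAK", "MKTAYIAR"])` would dispatch to `pp.featurize_seqs`, while `embed("sequences.fasta")` would dispatch to `pp.featurize_fasta`; both inputs here are placeholders.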

Landscape (nearest neighbors graph construction)

`pp.neighbors`(adata[, n_neighbors, n_pcs, …])	Construct sequence similarity neighborhood graph.

Tools (tl)

Velocity estimation

`tl.velocity_graph`(adata[, model_name, mkey, …])	Computes velocity scores at each edge in the graph.
`tl.velocity_embedding`(data[, basis, vkey, …])	Projects the velocities into any embedding.

Pseudotime and trajectory inference

`tl.terminal_states`(data[, vkey, groupby, …])	Computes terminal states (root and end points).
`tl.velocity_pseudotime`(adata[, vkey, …])	Computes pseudotime based on the evolocity graph.
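The summaries above do not say where these functions store their results. The sketch below assumes they are written as per-sequence annotations in `adata.obs` (an assumption this page does not confirm) and reports which columns were added, so the output names can be inspected rather than guessed. The helpers `added_columns` and `compute_pseudotime` are hypothetical, and `adata` is assumed to already carry a velocity graph from `tl.velocity_graph`.

```python
def added_columns(before, after):
    """Return names present in `after` but not in `before`, keeping order."""
    seen = set(before)
    return [name for name in after if name not in seen]


def compute_pseudotime(adata):
    """Run terminal-state and pseudotime inference (hypothetical helper)."""
    # Deferred import: evolocity must be installed to call this.
    import evolocity as evo

    # Snapshot the existing annotation columns before running the tools.
    before = list(adata.obs.columns)

    # Both calls use default arguments.
    evo.tl.terminal_states(adata)
    evo.tl.velocity_pseudotime(adata)

    # Report whichever columns the two calls added (assumed to be in obs).
    return added_columns(before, list(adata.obs.columns))
```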

Interpretation

`tl.onehot_msa`(adata[, reference, …])	Aligns and one-hot-encodes sequences.
`tl.residue_scores`(adata[, basis, scale, …])	Score mutations by associated evolocity.
`tl.random_walk`(data[, root_node, …])	Runs a random walk on the evolocity graph.

Plotting (pl)

Also see scanpy’s plotting API for additional visualization functionality, including UMAP scatter plots.

Velocity embeddings

`pl.velocity_embedding`(adata[, basis, vkey, …])	Scatter plot of velocities on the embedding.
`pl.velocity_embedding_grid`(adata[, basis, …])	Scatter plot of velocities on a grid.
`pl.velocity_embedding_stream`(adata[, basis, …])	Stream plot of velocities on the embedding.
`pl.velocity_contour`(adata[, ptkey, …])	Contour plot of pseudotime with velocity grid.

Mutation interpretation

`pl.residue_scores`(adata[, percentile_keep, …])	Heat map of per-residue velocity scores.
`pl.residue_categories`(adata[, positions, …])	Scatter plot of mutations.

Datasets

`datasets.nucleoprotein`()	Influenza A nucleoprotein.
`datasets.cytochrome_c`()	Eukaryotic cytochrome c.
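A built-in dataset can stand in for the featurization step. The sketch below looks up a loader by name, restricted to the two datasets listed above. The helper `load_builtin` and the constant `BUILTIN_DATASETS` are hypothetical, and whether a loader needs to download or compute data on first use is not stated on this page.

```python
# Names taken from the dataset listing above.
BUILTIN_DATASETS = ("nucleoprotein", "cytochrome_c")


def load_builtin(name="nucleoprotein"):
    """Load one of the listed built-in datasets (hypothetical helper)."""
    # Validate the name first so typos fail fast, before any import.
    if name not in BUILTIN_DATASETS:
        raise ValueError(
            f"unknown dataset {name!r}; expected one of {BUILTIN_DATASETS}"
        )

    # Deferred import: evolocity must be installed to call this.
    import evolocity as evo

    # The loaders take no arguments in the listing above.
    return getattr(evo.datasets, name)()
```

The returned object can then be passed on to `evo.pp.neighbors` and the rest of the workflow.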

Settings

`set_figure_params`([style, dpi, dpi_save, …])	Set resolution/size, styling and format of figures.