evolocity - Evolutionary velocity with protein language models

API

Import evolocity as:

import evolocity as evo

After reading data (evo.pp.featurize_fasta) or loading a built-in dataset (evo.datasets.*), the typical workflow consists of successive calls to preprocessing (evo.pp.*), analysis (evo.tl.*), and plotting (evo.pl.*) functions.
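The workflow described above can be sketched roughly as follows. This is a minimal outline rather than a definitive recipe: the wrapper name run_basic_workflow and its fasta_fname argument are hypothetical, every call relies on default parameters, and the UMAP step comes from scanpy, not evolocity. The imports sit inside the function only so that defining it does not load the language model dependencies.

```python
def run_basic_workflow(fasta_fname):
    """Sketch of the featurize -> neighbors -> velocity -> plot sequence.

    `run_basic_workflow` and `fasta_fname` are illustrative names,
    not part of the evolocity API.
    """
    # Imported lazily so that defining this function does not pull in
    # the (heavy) language model dependencies.
    import evolocity as evo
    import scanpy as sc

    # Read the FASTA file and embed each sequence (default model).
    adata = evo.pp.featurize_fasta(fasta_fname)

    # Build the sequence similarity neighborhood graph.
    evo.pp.neighbors(adata)

    # Compute velocity scores at each edge of that graph.
    evo.tl.velocity_graph(adata)

    # Lay the graph out in two dimensions with scanpy's UMAP, then
    # project the velocities into that embedding and plot them.
    sc.tl.umap(adata)
    evo.tl.velocity_embedding(adata, basis='umap')
    evo.pl.velocity_embedding_stream(adata, basis='umap')

    return adata
```

Calling run_basic_workflow('my_sequences.fasta') (a placeholder path) would return the annotated data object so that later tl and pl calls can reuse it.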

Preprocessing (pp)

Featurization (language model embedding)

pp.featurize_seqs(seqs[, model_name, mkey, …])

Embeds a list of sequences.

pp.featurize_fasta(fname[, model_name, …])

Embeds a FASTA file.
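When sequences are already in memory rather than in a FASTA file, pp.featurize_seqs is the entry point. The sketch below assumes that the sequences are passed as a list of amino-acid strings and that the call returns the annotated data object used by the rest of the API; the wrapper name embed_sequences is hypothetical.

```python
def embed_sequences(seqs):
    """Embed an in-memory list of protein sequences (a sketch).

    `embed_sequences` is an illustrative wrapper, not an evolocity
    function. `seqs` is assumed to be a list of amino-acid strings.
    """
    # Imported lazily so that defining this function does not pull in
    # the language model dependencies.
    import evolocity as evo

    # Materialize the input as a list; other parameters (model_name,
    # mkey, ...) are left at their defaults.
    return evo.pp.featurize_seqs(list(seqs))
```

The returned object can then be passed to pp.neighbors in the same way as the output of pp.featurize_fasta.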

Landscape (nearest neighbors graph construction)

pp.neighbors(adata[, n_neighbors, n_pcs, …])

Constructs a sequence similarity neighborhood graph.

Tools (tl)

Velocity estimation

tl.velocity_graph(adata[, model_name, mkey, …])

Computes velocity scores at each edge in the graph.

tl.velocity_embedding(data[, basis, vkey, …])

Projects the velocities into any embedding.

Pseudotime and trajectory inference

tl.terminal_states(data[, vkey, groupby, …])

Computes terminal states (root and end points).

tl.velocity_pseudotime(adata[, vkey, …])

Computes pseudotime based on the evolocity graph.
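A minimal sketch of how the two pseudotime tools might be chained is given here. It assumes that adata already carries a neighborhood graph and a velocity graph (pp.neighbors and tl.velocity_graph), and that calling tl.terminal_states before tl.velocity_pseudotime is a sensible order; the wrapper name infer_pseudotime is hypothetical.

```python
def infer_pseudotime(adata):
    """Sketch: terminal states, then pseudotime, on an existing graph.

    `infer_pseudotime` is an illustrative name. `adata` is assumed to
    have been processed by pp.neighbors and tl.velocity_graph already.
    """
    # Imported lazily so that defining this function does not pull in
    # the language model dependencies.
    import evolocity as evo

    # Identify root and end points of the evolocity graph.
    evo.tl.terminal_states(adata)

    # Order sequences by pseudotime along the same graph.
    evo.tl.velocity_pseudotime(adata)

    return adata
```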

Interpretation

tl.onehot_msa(adata[, reference, …])

Aligns and one-hot-encodes sequences.

tl.residue_scores(adata[, basis, scale, …])

Scores mutations by associated evolocity.

tl.random_walk(data[, root_node, …])

Runs a random walk on the evolocity graph.

Plotting (pl)

Also see scanpy’s plotting API for additional visualization functionality, including UMAP scatter plots.

Velocity embeddings

pl.velocity_embedding(adata[, basis, vkey, …])

Scatter plot of velocities on the embedding.

pl.velocity_embedding_grid(adata[, basis, …])

Scatter plot of velocities on a grid.

pl.velocity_embedding_stream(adata[, basis, …])

Stream plot of velocities on the embedding.

pl.velocity_contour(adata[, ptkey, …])

Contour plot of pseudotime with velocity grid.

Mutation interpretation

pl.residue_scores(adata[, percentile_keep, …])

Heat map of per-residue velocity scores.

pl.residue_categories(adata[, positions, …])

Scatter plot of mutations.
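The interpretation tools and the mutation plots are meant to be used together; one plausible sequence is sketched here. It assumes that adata already has a velocity graph, that tl.onehot_msa must run before tl.residue_scores, and that default parameters are acceptable; the wrapper name plot_residue_scores is hypothetical.

```python
def plot_residue_scores(adata):
    """Sketch: align, score residues, then draw the heat map.

    `plot_residue_scores` is an illustrative wrapper. `adata` is
    assumed to already contain the output of tl.velocity_graph.
    """
    # Imported lazily so that defining this function does not pull in
    # the language model dependencies.
    import evolocity as evo

    # Align the sequences and one-hot-encode the alignment.
    evo.tl.onehot_msa(adata)

    # Score mutations by their associated evolocity.
    evo.tl.residue_scores(adata)

    # Heat map of the per-residue velocity scores.
    evo.pl.residue_scores(adata)

    return adata
```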

Datasets

datasets.nucleoprotein()

Influenza A nucleoprotein.

datasets.cytochrome_c()

Eukaryotic cytochrome c.

Settings

set_figure_params([style, dpi, dpi_save, …])

Sets the resolution/size, styling, and format of figures.