MPM3

MPM3 is a transformer-based AI model for molecule programming that can be used for redesign, diversification, or de novo (from scratch) design of protein sequences. The latest version is MPM4.

Protein design is the process of creating proteins with specific functions by manipulating the amino acid sequence. Scientists design proteins to create anti-cancer drugs (e.g. antibodies), gene editing medicines (e.g. Cas9), laundry detergents (e.g. amylases), digestible milk (e.g. lactases), and more. The terms protein engineering and protein optimization usually refer to small changes to a natural starting point, the term protein design usually refers to more extensive changes to some starting point, and the term de novo protein design implies creating a sequence from scratch (though there may still be a known starting point).

Redesign A0A1L9RXX7 residues 40-60

Redesign Inputs

Protein sequence: A single chain protein sequence with length less than 300 amino acids.

Residue range: A continuous range of residues to be redesigned. E.g. 50-60.

Function (optional): A keyword describing a protein function. E.g. "leucine rich repeat" or "hydrolase".

Temperature (optional): Lower values introduce fewer changes; higher values introduce more changes.
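The input constraints above can be checked before a request is submitted. The following is a minimal sketch, not part of MPM3 itself: the function names are hypothetical, the 20-letter amino acid alphabet is an assumption about what the model accepts, and the residue range is assumed to be 1-based and inclusive.

```python
# Hypothetical pre-submission checks for the Redesign inputs.
# Assumptions: standard 20-letter alphabet, 1-based inclusive residue range.
VALID_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LENGTH = 300  # the sequence must be shorter than this


def parse_residue_range(text):
    """Parse a range such as "50-60" into (start, end)."""
    start_text, end_text = text.strip().split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid residue range: {text!r}")
    return start, end


def validate_redesign_inputs(sequence, residue_range):
    """Return (normalized sequence, start, end) or raise ValueError."""
    sequence = sequence.strip().upper()
    if not sequence or len(sequence) >= MAX_LENGTH:
        raise ValueError("sequence must have between 1 and 299 residues")
    unknown = set(sequence) - VALID_AMINO_ACIDS
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    start, end = parse_residue_range(residue_range)
    if end > len(sequence):
        raise ValueError("residue range extends past the end of the sequence")
    return sequence, start, end
```

A call such as `validate_redesign_inputs(my_sequence, "50-60")` either returns the cleaned sequence with the parsed range or raises `ValueError` describing the first problem found.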

Redesign Outputs

Protein sequence: A modified single chain protein sequence with length equal to input length. Only the residue range specified is modified.
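The output guarantee (same length, changes confined to the requested range) is easy to verify on returned sequences. A minimal sketch, assuming 1-based inclusive positions; the function names and the toy sequences are illustrative and are not real MPM3 output.

```python
def changed_positions(original, designed):
    """Return the 1-based positions where two equal-length sequences differ."""
    if len(original) != len(designed):
        raise ValueError("a redesign output should match the input length")
    return [
        position
        for position, (a, b) in enumerate(zip(original, designed), start=1)
        if a != b
    ]


def changes_within_range(original, designed, start, end):
    """True if every change falls inside the 1-based inclusive range start..end."""
    return all(
        start <= position <= end
        for position in changed_positions(original, designed)
    )


# Toy example (made-up sequences): substitutions at positions 4 and 6.
original = "MKTAYIAKQR"
designed = "MKTGYLAKQR"
print(changed_positions(original, designed))           # [4, 6]
print(changes_within_range(original, designed, 4, 6))  # True
```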

Redesign Examples

Redesign Q06750 residues 31-52 at temperature 1.2, and show 3 results.

Diversify Inputs

Protein sequence: A single chain protein sequence with length less than 300 amino acids.

Function (optional): A keyword describing a protein function. E.g. "leucine rich repeat" or "hydrolase".

Temperature (optional): Lower introduces fewer changes, higher introduces more changes.
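This page does not describe how MPM3 applies temperature internally. In many sequence models, though, temperature divides the model's scores (logits) before they are turned into sampling probabilities, which would produce the behavior described above. The sketch below illustrates that common technique with made-up scores for four candidate residues at one position; it is an assumption about the general mechanism, not a description of MPM3.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(score - largest) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up scores for candidate residues at a single position.
residues = ["L", "I", "V", "A"]
logits = [3.0, 1.5, 1.0, 0.0]

for temperature in (0.5, 1.0, 2.0):
    probabilities = softmax_with_temperature(logits, temperature)
    summary = ", ".join(
        f"{residue}={p:.2f}" for residue, p in zip(residues, probabilities)
    )
    print(f"T={temperature}: {summary}")
```

At low temperature the top-scoring residue takes most of the probability mass, so sampled sequences stay close to the model's preferred choice; at high temperature the distribution flattens and alternative residues are sampled more often.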

Diversify Outputs

Protein sequence: A modified single chain protein sequence. Length is usually equal to input length, though rare insertions or deletions are possible.
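Because a diversified sequence can differ from the input anywhere, and occasionally in length, a quick summary of how far each variant has drifted is useful. A minimal sketch using Python's standard `difflib`; the function name and the choice of `SequenceMatcher.ratio()` as a similarity score are illustrative, and a proper pairwise alignment would be more appropriate for serious analysis.

```python
from difflib import SequenceMatcher


def describe_variant(original, variant):
    """Summarize how a diversified sequence differs from its input."""
    summary = {
        # Positive for net insertions, negative for net deletions.
        "length_change": len(variant) - len(original),
        # 1.0 means identical; lower values mean more differences.
        "similarity": SequenceMatcher(
            None, original, variant, autojunk=False
        ).ratio(),
    }
    if len(original) == len(variant):
        # Position-by-position count is only meaningful without indels.
        summary["substitutions"] = sum(a != b for a, b in zip(original, variant))
    return summary
```

Running this over a batch of results generated at different temperatures gives a rough view of how much each setting changes the sequence.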

Diversify Examples

Diversify Q7L266 with temperature 2.0

From Scratch Inputs

Function: A keyword describing a protein function. E.g. "leucine rich repeat" or "hydrolase".

From Scratch Outputs

Protein sequence: A single chain protein sequence with length less than 300 amino acids.

From Scratch Examples

Create a de novo hydrolase

Create an asparagine protein

Analyze Protein Design Results

Although protein design results must ultimately be evaluated experimentally, some computational evaluations are possible.

  • Function and Property Prediction: When a good, independent computational predictor of the desired function or property is available, it can provide an excellent validation of designed protein sequences. Unfortunately, such a predictor is usually not available.
  • Structural Examination: A combination of structure prediction and expert knowledge can be used to evaluate a function that is closely tied to protein structure. For example, an expert can check that an enzyme's catalytic triad is still present or that a known binding motif is preserved.
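Before running structure prediction, a cheap sequence-level check can flag designs that have lost residues known to matter. The sketch below is illustrative only: the function names are hypothetical, positions are assumed to be 1-based, and the example pattern `G.S.G` stands in for the G-X-S-X-G motif found around the catalytic serine of many serine hydrolases. Passing this check does not show that a design folds or functions.

```python
import re


def residues_preserved(original, designed, key_positions):
    """Map each 1-based key position to True if its residue is unchanged.

    Assumes equal-length sequences, since insertions or deletions
    would shift the numbering.
    """
    if len(original) != len(designed):
        raise ValueError("sequences must have equal length for this check")
    return {
        position: original[position - 1] == designed[position - 1]
        for position in key_positions
    }


def find_motif(sequence, pattern):
    """Return the 1-based start positions of a regular-expression motif."""
    return [match.start() + 1 for match in re.finditer(pattern, sequence)]


# Toy sequences (made up) with a G-X-S-X-G stretch starting at position 3.
original = "MKGASAGTT"
designed = "MKGLSVGTT"
print(find_motif(designed, r"G.S.G"))               # [3]
print(residues_preserved(original, designed, [5]))  # {5: True}
```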