
MP4

MP4 is a general transformer-based text2protein AI model for molecule programming. It can redesign a given input sequence (at fixed or variable length) or design a protein completely de novo from input text describing the desired function. Over 1,000 de novo text2protein examples designed by MP4 can be found in a repo here, and over 6,000 text2enzyme examples designed by MP4, covering Enzyme Commission (EC) space, can be found here.

Import from MP4 repo ‘MAGCH’

design4_2

Inputs for Protein Sequence Design

  • Text Description: A keyword, sentence, or paragraph describing the desired protein function, such as "hydrolase", "pig protein is involved in enhancing cell processes and is located in both surface and internal cell regions", or "This protein is involved in the cellular amine metabolic process and exhibits phosphoribosylanthranilate isomerase activity. It plays a role in modifying amines and converting phosphoribosylanthranilate to a different molecular form."
  • Amino Acid Sequence (optional): A single-chain protein sequence of up to 2,000 amino acids.
  • Temperature (optional): A parameter that controls how much variation the design introduces; lower values introduce fewer changes, while higher values introduce more.
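
The input constraints listed above can be checked before a design job is submitted. The following is a minimal sketch, not part of MP4 itself: `DesignRequest`, `validate_request`, the restriction to the 20 standard one-letter amino acid codes, and the rejection of negative temperatures are all assumptions made for illustration. Only the 2,000-residue limit comes from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional

# The 20 standard amino acids in one-letter code. Whether MP4 accepts other
# symbols (for example X or U) is not stated above; this is an assumption.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_SEQUENCE_LENGTH = 2000  # limit quoted in the input list above


@dataclass
class DesignRequest:
    """Hypothetical container for the three inputs; not an MP4 API type."""
    text_description: str
    sequence: Optional[str] = None       # optional single-chain sequence
    temperature: Optional[float] = None  # optional; valid range is not documented


def validate_request(request: DesignRequest) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not request.text_description.strip():
        problems.append("text description is empty")
    if request.sequence is not None:
        seq = request.sequence.strip().upper()
        if len(seq) > MAX_SEQUENCE_LENGTH:
            problems.append(
                f"sequence has {len(seq)} residues; the limit is {MAX_SEQUENCE_LENGTH}"
            )
        unknown = sorted(set(seq) - STANDARD_AMINO_ACIDS)
        if unknown:
            problems.append("sequence contains non-standard symbols: " + "".join(unknown))
    # Assumption: a negative temperature is treated as invalid.
    if request.temperature is not None and request.temperature < 0:
        problems.append("temperature is negative")
    return problems
```

For example, `validate_request(DesignRequest("hydrolase"))` returns an empty list, since only the text description is required for de novo design.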

Outputs from MP4

  • Designed Protein Sequence: The optimized amino acid sequence generated based on the input parameters.
  • Confidence Scores: Metrics indicating the reliability of the predictions for the designed sequences.
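
For a fixed-length redesign, one simple way to see how many changes a given temperature introduced is to compare the designed sequence with the input position by position. The sketch below is an illustration only; `count_substitutions` and `sequence_identity` are hypothetical helper names, not MP4 functions, and variable-length designs would need a sequence alignment first, which is not shown.

```python
def count_substitutions(original: str, designed: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(original) != len(designed):
        raise ValueError("sequences differ in length; align them first")
    return sum(a != b for a, b in zip(original, designed))


def sequence_identity(original: str, designed: str) -> float:
    """Fraction of positions that are unchanged, between 0.0 and 1.0."""
    if not original:
        raise ValueError("sequences are empty")
    unchanged = len(original) - count_substitutions(original, designed)
    return unchanged / len(original)
```

Running this on designs produced at several temperatures gives a rough picture of the trend described in the inputs: lower temperatures should yield higher identity to the input sequence.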

MP4 Protein Design Examples

Import from MP4 repo ‘MQ8S5’

design4_1