ESMFold

ESMFold is a machine learning model that predicts a protein's 3D structure from its amino acid sequence using a transformer-based neural network. Developed under the Evolutionary Scale Modeling (ESM) project, it uses evolutionary patterns learned by a protein language model to make fast and accurate predictions. Running ESMFold yourself requires local installation and coding, but on the Copilot web-based platform it can be run with simple text prompts.

Inputs

  • Protein Sequence: A string of single-letter amino acid codes drawn from the 20 standard amino acids. Only single-chain (monomeric) proteins are supported.

Fold 'DIHICGICKQQFNNLDAFVAHKQSGSQ'

Fold A0A7S7MT40

Fold 'GIGDPVTCLKSGAICHPVFCPRRYKQIGTCGLPGTKCCKKP'

Defensin
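
The input constraints above (single-letter codes, 20 standard amino acids, one chain) can be checked before a sequence is submitted. The following is a minimal sketch, not part of ESMFold or the Copilot platform; the function name `clean_sequence` is hypothetical. It only handles raw sequences, so an accession-style prompt such as A0A7S7MT40 is rejected rather than looked up.

```python
# The 20 standard amino acids in single-letter code.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Remove whitespace and surrounding quotes, upper-case the sequence,
    and reject anything that is not one of the 20 standard residues.

    Hypothetical helper for illustration only.
    """
    seq = "".join(raw.split()).strip("'\"‘’").upper()
    if not seq:
        raise ValueError("empty sequence")
    invalid = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"non-standard residue codes: {''.join(invalid)}")
    return seq


print(clean_sequence("'DIHICGICKQQFNNLDAFVAHKQSGSQ'"))
```

Because every character outside the 20-letter alphabet is rejected, a chain separator or any other multi-chain notation also fails the check, which matches the monomer-only restriction.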

Outputs

  • PDB file: File containing the predicted 3D coordinates of each residue in the input sequence.
  • pLDDT Scores: Stored in the B-factor column of the PDB file. A per-residue confidence score between 0 and 100; higher is better.
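
Since the pLDDT scores sit in the B-factor column, they can be read back from the PDB file with plain string slicing. The sketch below assumes standard fixed-width ATOM records and takes one value per residue from its CA atom; `read_plddt` and `atom_line` are hypothetical names, and the demo records use dummy coordinates rather than a real prediction.

```python
def read_plddt(pdb_text: str) -> dict[tuple[str, int], float]:
    """Map (chain ID, residue number) to the B-factor value of its CA atom."""
    scores = {}
    for line in pdb_text.splitlines():
        # PDB records are fixed-width (1-based columns): atom name 13-16,
        # chain ID 22, residue number 23-26, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def atom_line(serial: int, name: str, resname: str, chain: str,
              resnum: int, b: float) -> str:
    """Build one fixed-width ATOM record with dummy coordinates (demo only)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resnum:>4}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}"
    )


demo = "\n".join([
    atom_line(1, "N", "MET", "A", 1, 87.5),
    atom_line(2, "CA", "MET", "A", 1, 87.5),
    atom_line(3, "CA", "LYS", "A", 2, 42.0),
])
print(read_plddt(demo))
```

For a real prediction, pass the contents of the downloaded PDB file instead of `demo`.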

Analyzing ESMFold Predictions

pLDDT Scores:

Predicted local distance difference test (pLDDT) is a per-residue confidence score:

  • 0-50: very low (correlated with disorder/flexibility)
  • 50-70: low
  • 70-90: high
  • 90-100: very high (highly structured/stable)
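
The pLDDT bands above translate directly into a lookup function. This is a minimal sketch with hypothetical names (`plddt_band`, `band_fractions`). The listed ranges share their endpoints, so the code assumes each boundary value (50, 70, 90) belongs to the higher band; that choice is an assumption, not something the ranges specify.

```python
from collections import Counter


def plddt_band(score: float) -> str:
    """Return the confidence band for one pLDDT score on the 0-100 scale.

    Assumption: boundary values 50, 70 and 90 fall into the higher band.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score < 50:
        return "very low"
    if score < 70:
        return "low"
    if score < 90:
        return "high"
    return "very high"


def band_fractions(scores: list[float]) -> dict[str, float]:
    """Fraction of residues that fall into each confidence band."""
    counts = Counter(plddt_band(s) for s in scores)
    return {band: n / len(scores) for band, n in counts.items()}


print(band_fractions([95.0, 95.0, 40.0, 60.0]))
```

A summary like this gives a quick view of how much of a prediction is confidently structured before opening it in a viewer.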

PAE Scores:

Predicted aligned error (PAE) is a confidence score for pairs of residues:

  • 0-5 angstroms (low): The relative position of the two residues is known (they move together).
  • 20+ angstroms (high): The relative position is not known (they move independently of each other).
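
The PAE thresholds above can be applied to a whole block of residue pairs, for example to ask whether two regions of a protein are positioned confidently relative to each other. The sketch below assumes the PAE values are available as a square matrix of angstrom values indexed by residue position; how the matrix is obtained is not covered here. The names `pae_label` and `mean_block_pae` are hypothetical, and the "intermediate" label for values between 5 and 20 angstroms is an assumption, since only the low and high ranges are defined.

```python
def pae_label(pae_angstrom: float) -> str:
    """Label one PAE value using the 0-5 (low) and 20+ (high) thresholds."""
    if pae_angstrom < 0:
        raise ValueError("PAE cannot be negative")
    if pae_angstrom <= 5:
        return "low"
    if pae_angstrom >= 20:
        return "high"
    return "intermediate"  # assumed label for the undefined 5-20 range


def mean_block_pae(pae: list[list[float]], rows: range, cols: range) -> float:
    """Average PAE over all (row residue, column residue) pairs in a block."""
    values = [pae[i][j] for i in rows for j in cols]
    return sum(values) / len(values)


# Made-up 4-residue example: residues 0-1 and 2-3 form two rigid units
# whose relative placement is uncertain.
pae = [
    [0.5, 1.0, 22.0, 24.0],
    [1.0, 0.5, 23.0, 25.0],
    [21.0, 22.0, 0.5, 2.0],
    [24.0, 23.0, 2.0, 0.5],
]
unit_a, unit_b = range(0, 2), range(2, 4)
print(pae_label(mean_block_pae(pae, unit_a, unit_a)))  # within one unit
print(pae_label(mean_block_pae(pae, unit_a, unit_b)))  # between the units
```

Averaging over a block is only one way to summarize the matrix; inspecting individual pairs with `pae_label` works the same way.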