Search and Compare

Some types of search, such as by sequence or structure similarity, require specialized tools.

Find others like 3kwd

3kwd

Foldseek

Foldseek is a protein structure similarity lookup algorithm.

Inputs

PDB file: Protein structure file.
Database: A specific database to search against.

Outputs

List of hits: A list of database hits, structurally aligned to the input.
TM-Score: Score of structural similarity between two proteins. Ranges from 0.0 to 1.0, where 0.0 means no match and 1.0 means an exact match. A score of about 0.5 or higher generally indicates the same fold/family.
Fident: Percentage sequence identity between the database hit and the input after structural alignment. 100 means the aligned sequences are identical.
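
As a rough illustration of how these scores might be read, the sketch below filters a list of hits by TM-score and labels each one. The hit records, the field names (target, tm_score, fident), and both helper functions are hypothetical and not part of Foldseek. The 0.5 cutoff follows the rule of thumb given above.

```python
def classify_tm_score(tm_score: float) -> str:
    """Label a TM-score using the 0.5 rule of thumb."""
    if not 0.0 <= tm_score <= 1.0:
        raise ValueError(f"TM-score must be in [0, 1], got {tm_score}")
    if tm_score == 1.0:
        return "exact match"
    if tm_score >= 0.5:
        return "likely same fold"
    return "likely different fold"


def filter_hits(hits, min_tm_score=0.5):
    """Keep hits at or above the cutoff, best structural match first."""
    kept = [h for h in hits if h["tm_score"] >= min_tm_score]
    return sorted(kept, key=lambda h: h["tm_score"], reverse=True)


# Hypothetical hits; identifiers and values are made up for illustration.
example_hits = [
    {"target": "hit_a", "tm_score": 0.92, "fident": 85.0},
    {"target": "hit_b", "tm_score": 0.31, "fident": 12.0},
    {"target": "hit_c", "tm_score": 0.58, "fident": 18.0},
]

for hit in filter_hits(example_hits):
    print(hit["target"], hit["tm_score"], classify_tm_score(hit["tm_score"]))
```

Note that hit_c passes the structural cutoff despite its low Fident. Structural similarity can persist when sequence identity is low, which is one reason to look at both scores.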

Runtime

Scales with the size of the database used. Typically 0.5 to 1.0 min.

Database

Several standard databases are available.

  • Protein Data Bank (PDB): 218 K
  • AlphaFold: 214 M

Find similar proteins to MA998

MA998

Lookup structures like P42212

GFP

Blastp

BLAST is a sequence similarity search algorithm for nucleotide and protein sequences. Blastp is the variant specifically for protein sequences.

Inputs

Protein sequence: Single chain amino acid sequence.
Database: A specific blastp database to search against.

Outputs

Multiple sequence alignment: A list of database hits, with sequences aligned to the input (aka query).
E-value: A statistical measure of the significance of a hit. Lower is more significant, while higher means the hit is more likely due to random chance.
Pident: Percentage sequence identity between the database hit (aka subject) and the input (aka query) after alignment. 100 means identical match.
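
To make the two scores concrete, here is a minimal sketch that computes percent identity from a pair of aligned sequences and filters hits by E-value. It assumes identity is counted as identical columns divided by alignment length (gaps included). The function names, the hit records, and the 1e-5 cutoff are illustrative assumptions, not values taken from this tool.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent of alignment columns where both sequences share a residue.

    Gap columns ('-') count toward the alignment length but never as matches.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


def significant_hits(hits, max_evalue=1e-5):
    """Keep hits at or below the E-value cutoff, most significant first."""
    kept = [h for h in hits if h["evalue"] <= max_evalue]
    return sorted(kept, key=lambda h: h["evalue"])


# Hypothetical hits; identifiers and values are made up for illustration.
example_hits = [
    {"subject": "hit_a", "evalue": 3e-40, "pident": 98.2},
    {"subject": "hit_b", "evalue": 0.8, "pident": 27.5},
    {"subject": "hit_c", "evalue": 1e-12, "pident": 41.0},
]

print(percent_identity("MKT-AYI", "MKTQAYL"))
print([h["subject"] for h in significant_hits(example_hits)])
```

A suitable E-value cutoff depends on the database size and the goal of the search, so the default here should be treated as a placeholder.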

Runtime

Scales with the size of the database used. Typically 1 to 30 min.

Database

Many standard databases are available. It is also possible to create custom databases for the search.

  • UniProt: 231 M
  • NR/NT: 707 M

Integration with Other Tools

Databases