Search and Compare
Some types of search, such as by sequence or structure similarity, require specialized tools.
Find others like 3kwd
Foldseek
Foldseek is a protein structure similarity lookup algorithm.
Inputs
PDB file: Protein structure file.
Database: A specific database to search against.
Outputs
List of hits: A list of database hits, structurally aligned to the input.
TM-Score: Score of structural similarity between two proteins. Ranges from 0.0 to 1.0, where 0.0 means no match and 1.0 means an exact match. Scores of about 0.5 or higher generally indicate the same fold or family.
Fident: Percentage sequence identity between the database hit and input after structural alignment. 100 means identical sequence match.
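The two scores are often read together: a high TM-Score with a low Fident suggests a protein that shares the fold but has diverged in sequence. The following is a minimal Python sketch of that kind of filtering, not the tool's own API. The Hit record, the hit names, and the 30 percent identity cutoff are hypothetical and chosen only for illustration; the 0.5 TM-Score cutoff comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    target: str      # hypothetical database identifier
    tm_score: float  # structural similarity, 0.0 to 1.0
    fident: float    # percent sequence identity, 0 to 100


def same_fold_hits(hits, tm_cutoff=0.5):
    """Keep hits at or above the TM-Score cutoff, best first."""
    kept = [h for h in hits if h.tm_score >= tm_cutoff]
    return sorted(kept, key=lambda h: h.tm_score, reverse=True)


def remote_homologs(hits, tm_cutoff=0.5, max_fident=30.0):
    """Structurally similar hits whose sequence identity is low.

    max_fident=30.0 is an assumed threshold, not a value from the tool.
    """
    return [h for h in same_fold_hits(hits, tm_cutoff) if h.fident < max_fident]


# Made-up example hits, for illustration only.
hits = [
    Hit("hit_A", 0.92, 88.0),
    Hit("hit_B", 0.61, 21.5),
    Hit("hit_C", 0.34, 12.0),
]

print([h.target for h in same_fold_hits(hits)])   # ['hit_A', 'hit_B']
print([h.target for h in remote_homologs(hits)])  # ['hit_B']
```

In this sketch hit_C is dropped because its TM-Score is below 0.5, and hit_A is excluded from the second list because its sequence is nearly identical to the input.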
Runtime
Scales with the size of the database used. Typically 0.5 to 1.0 min.
Database
Several standard databases are available.
- Protein Data Bank (PDB): 218 K structures
- AlphaFold: 214 M structures
Find similar proteins to MA998
Lookup structures like P42212
Blastp
BLAST is a sequence similarity search algorithm for nucleotide and protein sequences. Blastp is the variant for protein sequences.
Inputs
Protein sequence: Single chain amino acid sequence.
Database: A specific blastp database to search against.
Outputs
Multiple sequence alignment: A list of database hits, with sequences aligned to the input (aka query).
E-value: A statistical measure of the significance of a hit. Lower is more significant, while higher means the hit is more likely due to random chance.
Pident: Percentage sequence identity between the database hit (aka subject) and the input (aka query) after alignment. 100 means identical match.
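To make these two numbers concrete, here is a minimal Python sketch of how they can be computed. It is illustrative only and does not reproduce the blastp implementation. percent_identity counts identical columns over the full alignment length, gaps included. expected_hits uses the standard Karlin-Altschul estimate from a bit score, E = m * n * 2^(-S), where m is the query length and n is the total database length; real BLAST applies corrected "effective" lengths, which this sketch ignores. The function names and example sequences are hypothetical.

```python
def percent_identity(query_aln: str, subject_aln: str) -> float:
    """Percent of alignment columns where both residues are identical.

    Both strings are rows of one pairwise alignment, with '-' for gaps.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    if not query_aln:
        return 0.0
    matches = sum(
        1 for q, s in zip(query_aln, subject_aln) if q == s and q != "-"
    )
    return 100.0 * matches / len(query_aln)


def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Karlin-Altschul estimate: E = m * n * 2**(-bit_score).

    Uses raw lengths; BLAST itself uses edge-corrected effective lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Made-up aligned pair: 6 identical columns out of 8.
print(percent_identity("MKT-AYIA", "MKTQAYLA"))  # 75.0

# The same bit score is less significant against a larger database.
small_db = expected_hits(bit_score=40, query_len=100, db_len=1_000_000)
large_db = expected_hits(bit_score=40, query_len=100, db_len=2_000_000)
print(large_db / small_db)  # 2.0
```

The second example shows why an E-value should be compared only between searches against the same database: doubling the database size doubles the number of hits expected by chance at a given score.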
Runtime
Scales with the size of the database used. Typically 1 to 30 min.
Database
Many standard databases are available. It is also possible to create custom databases for the search.
- UniProt: 231 M sequences
- NR/NT: 707 M sequences