310 Protein Design ML Stack logo

Proteins
ML Stack

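import lib310

# Fetch the sequence for the SARS-CoV-2 spike protein entry
seqs = lib310.db.fetch(name='SPIKE_SARS2', feature='sequence')

# Load the TALE GO-annotation model and compute embeddings for the sequences
model = lib310.ml.GoAnnotation(model='TALE', v='512_756_4l')
embeddings = model.run(seqs).embedding

# Plot a UMAP projection of the embeddings, grouped by cluster
lib310.plot.umap(embeddings, clustering='kcluster')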

4 Components of Our Stack

We are building a four-piece stack to accelerate research in protein design and computational genetics.

1 Data: PFS

Working with protein data poses two sets of challenges. First, the structure and distribution of protein sequences differ from those of modalities more widely studied in ML, such as images and text. Second, the relevant data are scattered across many siloed datasets with vastly different formats, querying standards, quality, and levels of curation.
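
As a rough illustration of the second challenge, the sketch below normalizes records from two differently formatted sources (a FASTA file and a CSV export) into one common shape. It is a minimal example under stated assumptions, not the PFS implementation: the `ProteinRecord` class, its field names, the parser functions, and the sample data are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinRecord:
    """Hypothetical unified record; the field names are illustrative only."""
    accession: str
    sequence: str
    source: str


def parse_fasta(text, source):
    """Parse FASTA-formatted text into a list of ProteinRecord objects."""
    records, accession, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if accession is not None:
                records.append(ProteinRecord(accession, "".join(chunks), source))
            header = line[1:].split()
            # Use the first whitespace-delimited token of the header as the ID.
            accession, chunks = (header[0] if header else ""), []
        elif accession is not None:
            chunks.append(line.upper())
    if accession is not None:
        records.append(ProteinRecord(accession, "".join(chunks), source))
    return records


def parse_csv(text, source, id_column, seq_column):
    """Parse CSV text whose column names differ from source to source."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        ProteinRecord(row[id_column].strip(), row[seq_column].strip().upper(), source)
        for row in reader
    ]


# Made-up sample inputs standing in for two siloed datasets.
fasta_text = ">P1 example protein\nMKT\nAYI\n>P2\nGAVL\n"
csv_text = "id,seq\nP3,mktay\n"

records = parse_fasta(fasta_text, "fasta_db") + parse_csv(csv_text, "csv_db", "id", "seq")
for record in records:
    print(record.accession, record.sequence, record.source)
```

Once every source is mapped to the same record type, downstream code can query and filter the data without knowing which format it originally came from.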

Proteins Feature Store

2 Models: DFTLN-VQVAE

Most SOTA ML architectures for protein design borrow the latest techniques from image or text generation. However, although protein data shares some similarities with those modalities, the distributions of protein features and labels are quite different. We take some of the newer techniques, such as CLIP and VQ-VAE, adapt them to the protein world, and build our own Deep Funnel TL Network technology to deliver the best-suited architecture.
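
For readers unfamiliar with VQ-VAE, its core operation is vector quantization: each continuous latent vector is replaced by the nearest entry in a learned codebook. The sketch below shows only that generic lookup step in plain Python. It is not the DFTLN-VQVAE architecture, and the codebook and latent values are toy numbers chosen for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors, codebook):
    """Map each vector to the index and value of its nearest codebook entry."""
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Toy codebook with three 2-dimensional entries (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
# Toy encoder outputs, one latent vector per position.
latents = [[0.9, 1.2], [-0.8, 0.7], [0.1, -0.2]]

indices, quantized = quantize(latents, codebook)
print(indices)    # discrete codes that a decoder or prior model would consume
print(quantized)  # the codebook vectors that replace the continuous latents
```

In a full VQ-VAE the codebook is learned jointly with the encoder and decoder; this example covers only the nearest-neighbor assignment.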

3 Python Library: Lib310

We are building an open-source Python library for accessing data and models, both for downloading and for serving. The lib310 library also provides simple data science and visualization tools tailored to the biotech and protein worlds.

Data Analytics

Models Serving

Data Science Tools

Visualization

4 AppStore

The AppStore is the collection of apps built on top of our platform and made public by various research groups and contributors. These range from educational materials and data science tools to potential drug research. We believe this helps the industry move faster and build new products on top of community work.

Cancer Immunotherapy
de novo Protein Design
Motifs 101
Enzyme Evolution
Antibody development
COVID mRNA Vaccine
Sequence Generation