
Small Molecule Databases

Many public databases provide information on proteins, small molecules, RNA, cells, and more. Many of them aggregate other datasets, so overlap between databases can be high. While some datasets are extensively curated by hand, most are assembled automatically, so redundancy within a single dataset can also be high.
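To make overlap and redundancy concrete, here is a minimal sketch that compares two compound lists. It assumes each record has already been reduced to a standard InChIKey; that choice of key, the function name `overlap_report`, and the toy lists in the usage note are illustrative assumptions, not something the databases above prescribe.

```python
def overlap_report(db_a, db_b):
    """Summarise overlap between two lists of compound keys.

    Assumption: each entry is a structure-derived key (for example an
    InChIKey), so equal keys are treated as the same compound.
    """
    unique_a, unique_b = set(db_a), set(db_b)
    shared = unique_a & unique_b
    union = unique_a | unique_b
    return {
        # Compounds present in both databases.
        "shared": len(shared),
        # Jaccard index: shared compounds over all distinct compounds.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Entries that repeat a key already present in the same list.
        "redundant_in_a": len(db_a) - len(unique_a),
        "redundant_in_b": len(db_b) - len(unique_b),
    }
```

For example, calling `overlap_report` on a toy list that holds the keys for caffeine and aspirin (with aspirin listed twice) and a second toy list that holds aspirin and ethanol reports one shared compound and one redundant entry in the first list.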

Find the caffeine molecule

PubChem

PubChem is a small molecule database with chemical properties, biological activities, and other functional data. Information is automatically curated.

  • Compounds: 118 M
  • Substances: 319 M
  • Bioactivities: 295 M
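One way to find a molecule such as caffeine programmatically is PubChem's PUG REST interface. The sketch below uses only the Python standard library and assumes the usual PUG REST URL layout (`compound/name/<name>/property/<list>/JSON`) and the `PropertyTable` response shape. The helper names `property_url`, `parse_properties`, and `fetch_properties` are hypothetical, and the set of properties requested is an arbitrary choice.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
DEFAULT_PROPERTIES = ("MolecularFormula", "MolecularWeight", "InChIKey")


def property_url(name, properties=DEFAULT_PROPERTIES):
    """Build a PUG REST URL that looks a compound up by name."""
    # Percent-encode the name so spaces and slashes stay inside one path segment.
    encoded = quote(name, safe="")
    return f"{PUG_REST}/compound/name/{encoded}/property/{','.join(properties)}/JSON"


def parse_properties(payload):
    """Return the list of property records from a decoded JSON response."""
    return payload["PropertyTable"]["Properties"]


def fetch_properties(name):
    """Query PubChem over the network and return the matching records."""
    with urlopen(property_url(name), timeout=30) as response:
        return parse_properties(json.load(response))
```

Calling `fetch_properties("caffeine")` needs network access and should return a list of dictionaries, one per matching compound, each carrying a `CID` plus the requested properties. Keeping URL construction and parsing separate from the network call makes the pieces easy to check offline.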

Load molecule rapamycin

Import CC(=O)OC1=CC=CC=C1C(=O)O (the SMILES string for aspirin)
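A SMILES string like the one above can also be resolved to PubChem compound IDs (CIDs). The sketch below is one possible approach, assuming PUG REST's `compound/smiles/cids/JSON` endpoint accepts the SMILES as a form-encoded POST field named `smiles` and answers with an `IdentifierList`. Sending it in the request body avoids problems with characters such as `/`, `=`, and `#` in a URL path. The helper names `smiles_request`, `parse_cids`, and `lookup_cids` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

CIDS_BY_SMILES = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/cids/JSON"


def smiles_request(smiles):
    """Build a POST request that carries the SMILES in the form body."""
    body = urlencode({"smiles": smiles}).encode("ascii")
    return Request(
        CIDS_BY_SMILES,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_cids(payload):
    """Extract the CID list from a decoded response; empty if none is present."""
    return payload.get("IdentifierList", {}).get("CID", [])


def lookup_cids(smiles):
    """Send the request over the network and return the matching CIDs."""
    with urlopen(smiles_request(smiles), timeout=30) as response:
        return parse_cids(json.load(response))
```

Calling `lookup_cids("CC(=O)OC1=CC=CC=C1C(=O)O")` needs network access and should return a list of integer CIDs for the structure.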

Cambridge Structural Database (CSD)

The CSD, maintained by the Cambridge Crystallographic Data Centre (CCDC), is a database of experimentally determined small molecule crystal structures.

  • Num Rows: 1.25 M