AlphaFold 3, Glossary, Textbook of AI

AlphaFold 3 (AF3), published by Abramson et al. (Google DeepMind and Isomorphic Labs) in Nature in May 2024, generalises AlphaFold beyond protein monomers to predict the joint 3D structure of arbitrary biomolecular complexes (proteins, nucleic acids such as DNA and RNA, small-molecule ligands, ions, and post-translational modifications) from sequence and chemical-graph inputs alone. It replaces AlphaFold 2's deterministic structure module with a diffusion-based generative head, marking the most consequential architectural change in the AlphaFold lineage.

The pipeline retains an AF2 Evoformer-style trunk, now called the Pairformer, which alternates triangle-multiplication and triangle-attention updates over a pair representation of all tokens (residues and nucleotides, plus individual atoms for ligands and modified residues). It drops most of AF2's explicit MSA processing, which was AF2's most expensive component, in favour of a much smaller MSA module. This makes the trunk lighter and reduces, without eliminating, the dependence on multiple sequence alignment quality that limited AF2 on novel proteins.
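
The triangle-multiplication update mentioned above can be illustrated with a minimal NumPy sketch. It is a simplified "outgoing edges" update in the style of AF2's published algorithm, not the AF3 implementation: the parameter names (`gate_a`, `proj_a`, and so on), the hidden width, the random weights and the toy sizes are all hypothetical. The point it shows is that each pair entry (i, j) is updated from the other two edges of every triangle (i, j, k).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_norm(x, eps=1e-5):
    # Normalise over the channel axis only (no learned scale or bias here).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def triangle_multiply_outgoing(z, params):
    """z: (N, N, C) pair representation. Returns an update of the same shape."""
    z = layer_norm(z)
    # Gated linear projections of every pair entry to a hidden width H.
    a = sigmoid(z @ params["gate_a"]) * (z @ params["proj_a"])  # (N, N, H)
    b = sigmoid(z @ params["gate_b"]) * (z @ params["proj_b"])  # (N, N, H)
    # Edge (i, j) aggregates products of edges (i, k) and (j, k) over all k.
    tri = np.einsum("ikh,jkh->ijh", a, b)                       # (N, N, H)
    gate = sigmoid(z @ params["gate_out"])                      # (N, N, C)
    return gate * (layer_norm(tri) @ params["proj_out"])        # (N, N, C)

def init_params(c, h, rng):
    # Hypothetical random weights; a real model would learn these.
    shapes = {
        "gate_a": (c, h), "proj_a": (c, h),
        "gate_b": (c, h), "proj_b": (c, h),
        "gate_out": (c, c), "proj_out": (h, c),
    }
    return {name: rng.normal(scale=c ** -0.5, size=shape)
            for name, shape in shapes.items()}

rng = np.random.default_rng(0)
n_tokens, c, h = 6, 8, 4                  # toy sizes, chosen for illustration
z = rng.normal(size=(n_tokens, n_tokens, c))
params = init_params(c, h, rng)
update = triangle_multiply_outgoing(z, params)
z_new = z + update                        # residual connection, as in the trunk
```

Because the sum runs over every third token k, the cost of this step grows cubically with the number of tokens, which is one reason the choice of tokenisation matters for large complexes.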

Structure prediction is then performed by a conditional diffusion model operating directly in 3D atomic coordinates. Given the trunk's pair representation as conditioning, the diffusion network learns to denoise random Gaussian point clouds back to a plausible structure. Training uses a denoising objective, written here in noise-prediction form, $\mathcal{L} = \mathbb{E}_{t,\mathbf{x}_0,\boldsymbol{\epsilon}} \big[\,\|\hat{\boldsymbol{\epsilon}}_\theta(\mathbf{x}_t, t, \mathbf{c}) - \boldsymbol{\epsilon}\|^2\,\big]$, where $\mathbf{x}_t$ is the ground-truth structure $\mathbf{x}_0$ corrupted with noise $\boldsymbol{\epsilon}$ at noise level $t$ and $\mathbf{c}$ is the trunk conditioning. Training structures are randomly rotated and translated, so the model learns approximate SE(3)-equivariance from data augmentation rather than from architectural constraints. At inference, multiple diffusion samples are drawn and ranked using a learned confidence head that produces pLDDT and predicted aligned error (PAE) estimates analogous to AF2's.
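
The training-side ingredients described above (random rigid augmentation, forward noising and a noise-prediction loss) can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the AF3 training code: it assumes a DDPM-style variance-preserving noising rule with a hypothetical signal fraction `alpha_bar`, and it stands in for the denoising network with an oracle and an all-zeros predictor. The published model's noise parameterisation and loss weighting may differ.

```python
import numpy as np

def random_rotation(rng):
    """Random proper rotation from the QR factorisation of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # sign fix makes the factorisation unique
    if np.linalg.det(q) < 0:      # turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q

def augment(x0, rng, shift_scale=1.0):
    """Centre the coordinates, then apply a random rotation and translation."""
    centred = x0 - x0.mean(axis=0, keepdims=True)
    shift = rng.normal(scale=shift_scale, size=(1, 3))
    return centred @ random_rotation(rng).T + shift

def add_noise(x0, alpha_bar, rng):
    """Assumed forward process: x_t = sqrt(alpha_bar) x0 + sqrt(1 - alpha_bar) eps."""
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps

def epsilon_loss(eps_hat, eps):
    """Squared error between predicted and true noise, averaged over atoms."""
    return float(np.mean(np.sum((eps_hat - eps) ** 2, axis=-1)))

rng = np.random.default_rng(0)
x0 = rng.normal(size=(5, 3)) * 3.0   # five hypothetical atoms
x0_aug = augment(x0, rng)            # same shape, new pose
alpha_bar = 0.5                      # hypothetical signal fraction at level t
x_t, eps = add_noise(x0_aug, alpha_bar, rng)

# An oracle that inverts the noising rule recovers eps exactly (loss near 0);
# a predictor that always outputs zeros does not.
eps_oracle = (x_t - np.sqrt(alpha_bar) * x0_aug) / np.sqrt(1.0 - alpha_bar)
loss_oracle = epsilon_loss(eps_oracle, eps)
loss_zero = epsilon_loss(np.zeros_like(eps), eps)
```

The augmentation leaves all interatomic distances unchanged, so the network sees the same structure in many poses and has to learn that its prediction should rotate and translate with the input.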

Empirically, AF3 is the first model to do well on classes of problems AF2 could not address. On the PoseBusters benchmark for protein–ligand docking it achieves accuracy comparable to or better than dedicated docking tools (Vina, Glide, Gold) without any explicit search over poses. On protein–nucleic acid complexes it substantially outperforms the previous best (RoseTTAFold-NA), and on antibody–antigen interfaces, historically a weakness of AF2, accuracy improves markedly when the antibody pair is included in the input. Across the PDB validation set, the median DockQ for protein–protein interfaces rises from ~0.40 (AF-Multimer) to ~0.62.

The model is not without limitations. Diffusion sampling occasionally produces hallucinated structures with high confidence (physically implausible but locally well-packed), particularly for disordered regions and homo-oligomeric symmetry. At launch, AF3 was also available only via the AlphaFold Server, with usage caps and a non-commercial licence, restricting drug-discovery applications and frustrating the open-source community that had flourished around AF2. Open re-implementations (Boltz-1, Chai-1, Protenix) appeared within months and broadly replicate the headline results.

Related terms: AlphaFold, Protein Folding, Diffusion Model, ESM-2, RFDiffusion

Discussed in:

Chapter 17: Applications, Drug Discovery and Structural Biology

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).