The Nobel Prize in Chemistry 2024 is all about proteins and their folding patterns: solving the challenge of how to predict their structure from a given sequence, and their sequence from a given structure.
The Royal Swedish Academy of Sciences has decided to award the Prize to David Baker (University of Washington and Howard Hughes Medical Institute, USA) “for computational protein design” and jointly to Demis Hassabis and John M. Jumper (Google DeepMind, UK) “for protein structure prediction”.
Life could not exist without proteins; they control all the chemical transformations that constitute the basis of life. Proteins also function as hormones, signal substances, antibodies and the building blocks of different tissues. Proteins are made up of just 20 different amino acids, and each string twists and folds into a distinct 3D structure, which in turn gives the protein its function and purpose. Therefore, the following key question arises: how does a protein find its unique structure?
A given 100-amino-acid protein can take up to 10^47 different conformations; however, a protein assumes exactly the same shape every time. In the 1960s, it was already concluded that a protein’s 3D structure is entirely governed by the sequence of amino acids in the protein.
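To get a feel for where a number of that size can come from, here is a minimal back-of-the-envelope sketch. It assumes, purely for illustration, that each amino acid can independently adopt about 3 conformations; that figure and the sampling rate below are assumptions of this sketch, not values stated in the article or by the Academy.

```python
import math

# Assumption (not from the article): each residue independently adopts one of
# about 3 conformations, a common back-of-the-envelope simplification.
STATES_PER_RESIDUE = 3
CHAIN_LENGTH = 100  # amino acids, as in the example above

# Total number of conformations if every residue is independent.
conformations = STATES_PER_RESIDUE ** CHAIN_LENGTH
order_of_magnitude = math.floor(math.log10(conformations))

# Hypothetical sampling rate, chosen only to illustrate the scale.
SAMPLES_PER_SECOND = 1e13
SECONDS_PER_YEAR = 3.156e7
years_to_enumerate = conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

print(f"about 10^{order_of_magnitude} conformations")
print(f"about {years_to_enumerate:.1e} years to try them all one by one")
```

Under these assumptions the count lands at the order of 10^47, and trying every conformation in turn would take far longer than any realistic timescale, which is why a protein folding to the same shape every time is so striking.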
The finding that sequence determines structure set the challenge: predicting a protein’s structure from a known amino acid sequence. The task was so relevant that the Critical Assessment of Protein Structure Prediction (CASP) project was created in 1994. Solving the prediction problem proved incredibly difficult, and it took almost 25 years to reach a significant breakthrough: an improvement in accuracy from 40% to 60%, achieved by AlphaFold, AI software from Google DeepMind.
Only a couple of years later, AlphaFold2 was already performing almost as well as X-ray crystallography, a true milestone. It was Demis Hassabis and John M. Jumper who made it happen. What’s more, they immediately calculated the structures of ca. 200 million of the proteins that researchers have so far discovered when mapping Earth’s organisms.
The structure prediction problem can also be turned around: what amino acid sequence do you need to obtain a certain folding pattern? This is protein design, where a target structure is envisaged and possible sequences are identified by computational methods. Enter David Baker, also a keen participant in CASP, who has successfully built new-to-nature proteins with his software Rosetta. With a specific purpose in mind, he can design a completely new protein and make it function as desired!
Such a leap in protein biology would not have been possible without the previous efforts of structural biologists, who provided all the experimentally determined structures that have gone into the Protein Data Bank. Both AlphaFold2 and Rosetta have also been made available to the public, a testament to the importance of sharing scientific data to truly push the boundaries of scientific knowledge.
Most monomeric protein structures can now be predicted with high fidelity, and large databases of hundreds of millions of structures have thus been created, with huge impact on biochemical and biological research. Likewise, completely new protein structures, not found in nature, can now be created by computational design and used in various biotechnological and biomedical applications.
For those working in drug discovery, AlphaFold2 is a well-known technology already embedded in the search for new small molecules. The impressive accuracy of AI methods in predicting 3D structures has made protein targets more accessible to the drug design process. The impact on the fields of medicinal chemistry and chemical biology is enormous: from target selection and validation to accelerated structure determination and improved in-silico screening. The result is better designs of New Chemical Entities and faster property optimization, with the potential to afford greater numbers of good lead compounds to feed the pre-clinical stages of discovery programs and subsequently lead to better medicines.
And it doesn’t stop here: AlphaFold3 was presented a few months ago. It not only offers improved accuracy in protein-ligand interactions but can also tackle biomolecular interactions in more complex structures containing proteins, DNA, RNA and more.
Once again, congratulations to Hassabis, Jumper and Baker on a very well-deserved prize!