Before and after AlphaFold2: An overview of protein structure prediction

Scopus citations
29
Publication type
article
Publication date
2023
Publisher
FRONTIERS MEDIA SA
Citation
FRONTIERS IN BIOINFORMATICS, v.3, article ID 1120370, 8p, 2023
Abstract
A protein's three-dimensional structure is directly correlated with its function, and determining it is critical to understanding biological processes and to addressing problems in human health and the life sciences in general. Although new protein structures are continually obtained experimentally, there is still a large gap between the number of protein sequences deposited in UniProt and the number with a resolved tertiary structure. In this context, studies have emerged that predict protein structures using template-based or free modeling methods. In recent years, different methods have been combined to overcome their individual limitations, culminating in the emergence of AlphaFold2, which demonstrated that predicting protein structure with high accuracy at unprecedented scale is possible. Despite its current impact on the field, AlphaFold2 has limitations. Recently, new methods based on protein language models have promised to revolutionize protein structural biology, allowing protein structure and function to be discovered solely from the evolutionary patterns present in protein sequences. Even though these methods do not reach AlphaFold2's accuracy, they already address some of its limitations, and have been able to predict with high accuracy more than 200 million proteins from metagenomic databases. In this mini-review, we provide an overview of the breakthroughs in protein structure prediction before and after the emergence of AlphaFold2.
Keywords
protein structure prediction, AlphaFold, template-based modeling, free modeling, protein language model