ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Life Sciences 01 April 2024

Supporting the CIF file format of proteins in molecular dynamics simulations

Cite this:
https://doi.org/10.52396/JUSTC-2023-0148
  • Author Bio:

    Hengyue Wang is currently a graduate student in the School of Physics, University of Science and Technology of China, under the supervision of Prof. Zhiyong Zhang. His research mainly focuses on computer simulations of large biomolecular complex assemblies.

    Zhiyong Zhang is currently a Professor in the Department of Physics, University of Science and Technology of China (USTC). He received his Ph.D. degree in Biochemistry and Molecular Biology from USTC in 2003. His research interests include method development for multiscale modeling and integrative modeling of large biomolecular complexes.

  • Corresponding author: E-mail: zzyzhang@ustc.edu.cn
  • Received Date: 17 October 2023
  • Accepted Date: 21 December 2023
  • Available Online: 01 April 2024
  • Molecular dynamics (MD) simulations can capture the dynamic behavior of proteins in full atomic detail and at very fine temporal resolution, which has made them an important tool for studying protein dynamics. Several MD packages are now in wide use. An MD simulation starts from an initial structure that is generally taken from the Protein Data Bank (PDB). Until 2014, the PDB format was the standard file format for protein structures. However, the PDB format has intrinsic limitations: it stores structural information in fixed-width columns, which becomes a problem for very large protein complexes. The CIF (crystallographic information framework) format, which is far more extensible, was proposed to address this. To our knowledge, the current mainstream MD packages support the PDB format but do not directly support the CIF format. In this study, we modified the source code of one MD package, GROMACS, so that it accepts CIF-formatted structure files as input and generates molecular topology files from them. This work simplifies the preprocessing of large protein complexes for MD simulations.
    The CIF file format of proteins can be directly used to generate topology files for molecular dynamics simulations.
    • We modified the source code in one of the MD packages, GROMACS, which enables direct support of CIF files of proteins.
    • The modified program in GROMACS can read CIF files of proteins successfully and generate correct topology files.
    • This work simplifies the preprocessing of large protein complexes for MD simulations when only CIF files are available.
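
The fixed-width limitation mentioned in the abstract can be illustrated with a short Python sketch. It is not the GROMACS modification described in this work. The function names `pdb_atom_line` and `parse_atom_site`, the sample coordinates, and the three-character chain label are hypothetical and chosen only for illustration. The column positions follow the standard PDB ATOM record, and the `_atom_site` item names are standard mmCIF names. The reader is deliberately simplified: it assumes one atom per line and no quoted values.

```python
def pdb_atom_line(serial, name, res_name, chain_id, res_seq, x, y, z):
    """Format one ATOM record in the fixed-width PDB column layout.

    The serial number must fit in columns 7-11 and the chain ID in
    column 22, so large complexes run out of room.
    """
    if not 1 <= serial <= 99999:
        raise ValueError("atom serial does not fit in PDB columns 7-11")
    if len(chain_id) != 1:
        raise ValueError("PDB chain ID must be exactly one character")
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain_id, res_seq, x, y, z)


def parse_atom_site(text):
    """Read a simplified mmCIF atom_site loop into a list of dicts.

    Assumption: one atom per line and no quoted values with spaces.
    Real mmCIF files need a full parser.
    """
    columns, rows = [], []
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "loop_":
            # A new loop starts; its column names follow.
            in_loop, columns = True, []
        elif in_loop and line.startswith("_atom_site."):
            columns.append(line[len("_atom_site."):])
        elif in_loop and columns and line and not line.startswith(("#", "_")):
            # Data row: values are whitespace-delimited, not column-bound.
            values = line.split()
            if len(values) == len(columns):
                rows.append(dict(zip(columns, values)))
        elif line.startswith("#") or not line:
            in_loop = False
    return rows


# Made-up example data: an atom index and chain label too wide for PDB.
SAMPLE = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 100000 CA MET AAA 1 1.000 2.000 3.000
#
"""
```

In this sketch, `pdb_atom_line(100000, "CA", "MET", "A", 1, 1.0, 2.0, 3.0)` raises a `ValueError` because the serial number needs six digits, whereas `parse_atom_site(SAMPLE)` reads the same atom without difficulty because each value is delimited by whitespace rather than bound to a column range.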



    Figure 1. Flowchart of how GROMACS supports PDB and CIF files. Yellow: functions shared by PDB and CIF processing; red: PDB-specific functions; blue: CIF-specific functions.

    Figure 2. The systems used to test the modified GROMACS. (a) The bacteriophage T4 lysozyme (2LZM). (b) The dilated human nuclear pore complex (7R5J). Since the whole complex has C8 symmetry, only one-eighth of the structure is included in the CIF file.

    Figure 3. Screen output from the modified GROMACS. (a) Output when generating topology files from 2LZM.cif. (b) Output when generating topology files from 2LZM.pdb. (c) Output when generating topology files from 7R5J.cif.

