OPUS 2018 | Significance

Background and Significance

Modeling protein structure and dynamics is critical for understanding how living cells function and how they malfunction, leading, e.g., to cancer or hereditary diseases. This research is thus significant for drug design. Currently, about 87,000,000 protein sequences are stored in the UniProt database (www.uniprot.org), about 500,000 of which have been verified and manually annotated, while the number of solved structures in the PDB database (www.rcsb.org) [17] is only 121,460. Therefore, modeling (prediction) of protein structure has become a standard procedure in biochemistry, biophysics, and biomedicine. Knowledge-based (bioinformatics) methods appear to be the most efficient at the moment [24]; however, they are database-dependent and, moreover, make increasing use of force fields. Simulations with empirical force fields become even more critical in studies of protein-folding pathways and free-energy landscapes, where experimental data (from, e.g., fluorescence measurements or atomic-force microscopy) provide only fragmentary information about the structural changes or the conformational space.

Despite the enormous progress in the development of all-atom force fields and efficient simulation techniques, as well as the construction of dedicated machines [12], all-atom molecular dynamics simulations are still restricted to small systems or short time scales. Moreover, extracting the important interactions and the functionally important motions from the results of all-atom simulations is not easy and requires additional tools such as principal component analysis. Therefore, coarse-grained models, in which several atoms comprising a well-defined object are treated as one unit, are often the only ones with which simulations at large time and size scales can be carried out [7]. To date, most coarse-graining approaches are based on statistical potentials (e.g., the CABS protein model developed in the Koliński group [8]); however, these approaches are limited by the completeness of structural databases and by the amount of data that can be retrieved from them.

Our research group has long been developing the coarse-grained UNRES model of proteins (Figure 1) [15,14,13,26], which enables us to perform simulations several orders of magnitude faster than all-atom simulations. At the same time, the corresponding force field has a close connection to the physics of interactions, because it is based on the potential of mean force of the system under study in water, which is expanded into Kubo's cluster cumulants [11]; the cumulants correspond to the individual contributions to the effective energy function [15,23]. This feature clearly distinguishes the UNRES force field from other coarse-grained force fields, which are based on statistical potentials [7], are constructed in a neo-classical manner by analogy to all-atom potentials [18] or, like the Gō-like models [4], are designed for a particular protein to make its native structure the global minimum of the potential-energy surface. The cluster-cumulant expansion naturally results in multibody terms, which are necessary for coarse-grained force fields to handle regular secondary structure properly [9,15] and which are introduced in other force fields in a heuristic manner [7]. Without using database information, the UNRES model has a relatively high prediction capacity, which was confirmed in the recent Community Wide Experiments on the Critical Assessment of Techniques for Protein Structure Prediction (CASP; predictioncenter.org) [3,10]. The model has also been used successfully to study protein-folding kinetics [27], protein free-energy landscapes [16], and biological processes [2,19]. Nevertheless, the UNRES model provides only 5-6 Å resolution on average for proteins of about 60 residues and does not always reproduce the correct secondary structure.
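The relation between the potential of mean force and its cluster-cumulant expansion can be written schematically as follows. This is a sketch in generic notation, not a transcription of the formulas in [15,23]: the symbols X (the coarse-grained degrees of freedom), Y (the degrees of freedom that are averaged out, such as solvent coordinates), ε_i (the component all-atom energies), and β = 1/RT are assumptions introduced here for illustration.

```latex
% Potential of mean force: the all-atom energy E(X;Y) is averaged
% over the degrees of freedom Y that are not kept in the model.
F(X) = -\frac{1}{\beta}\,
       \ln\left\{ \frac{1}{V_Y} \int_{\Omega_Y}
       \exp\left[-\beta E(X;Y)\right] \, dV_Y \right\},
\qquad E(X;Y) = \sum_i \varepsilon_i(X;Y)

% Cluster-cumulant expansion: F is split into one-body, two-body,
% and higher-order (multibody) factors.
F(X) = \sum_i f^{(1)}_i + \sum_{i<j} f^{(2)}_{ij}
     + \sum_{i<j<k} f^{(3)}_{ijk} + \cdots

% The lowest-order factors, with <...> denoting the average over Y:
f^{(1)}_i = -\frac{1}{\beta}
            \ln\left\langle e^{-\beta \varepsilon_i} \right\rangle,
\qquad
f^{(2)}_{ij} = -\frac{1}{\beta}
            \ln\left\langle e^{-\beta(\varepsilon_i+\varepsilon_j)} \right\rangle
            - f^{(1)}_i - f^{(1)}_j
```

In this notation, factors of order three and higher couple several component interactions at once; these are the multibody terms referred to above.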

**Figure 1:** The UNRES model of polypeptide chains. The united side chains (SC), shown as colored spheroids, are attached to the C^α atoms, and the peptide groups (p), shown as blue spheres, are located halfway between consecutive C^α atoms. In the proposed project, the flat side chains of tryptophan, tyrosine, and phenylalanine will be represented by 4 or 3 interaction sites, respectively, and C^β atoms will be introduced to handle local interactions better, as illustrated in the Figure. The C^α atoms, shown as small white spheres, are not interaction sites but are the anchor points for the united side chains and the united peptide groups, thus only assisting in the definition of the geometry of the chain, while the coordinates of the C^β atoms will serve to express the local interactions. The all-atom representation is superposed on the coarse-grained model for illustration. The geometry of the simplified chain can be described by the backbone virtual-bond angles θ_i, the backbone virtual-bond-dihedral angles γ_i, the azimuthal angles α_i, and the zenith angles β_i.
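The backbone angles named in the caption can be computed from consecutive C^α positions, as in the minimal sketch below. The function names are hypothetical and are not part of the UNRES code; the dihedral uses the common atan2 formula, and the sign convention is an assumption that may differ from the one used in UNRES.

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def virtual_bond_angle(ca_prev, ca_mid, ca_next):
    """Angle theta (degrees) between the two virtual bonds meeting at ca_mid."""
    u = _sub(ca_prev, ca_mid)
    v = _sub(ca_next, ca_mid)
    cos_theta = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    # Clamp to [-1, 1] to guard against round-off before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def virtual_bond_dihedral(ca0, ca1, ca2, ca3):
    """Dihedral gamma (degrees, in (-180, 180]) about the ca1-ca2 virtual bond."""
    b1 = _sub(ca1, ca0)
    b2 = _sub(ca2, ca1)
    b3 = _sub(ca3, ca2)
    n1 = _cross(b1, b2)  # normal to the plane of ca0, ca1, ca2
    n2 = _cross(b2, b3)  # normal to the plane of ca1, ca2, ca3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

For a planar zigzag of four points the dihedral has magnitude 180°, and three points forming a right angle give θ = 90°.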

Recently, we developed a rigorous approach to the derivation of analytical expressions for the contributions to the effective energy functions of coarse-grained systems. It is based on expressing the energy in terms of interatomic distances and expressing the distances as functions of the angles for collective rotations of the atoms comprising the interaction sites about the virtual-bond axes. The torsional and correlation potentials derived with this formalism are substantially different from those implemented in the present UNRES because they include the dependence on the virtual-bond angles θ (Fig. 1). The form of this dependence agrees with the statistics derived from the PDB database and explains why the θ angles are large for extended structures and close to 90° for α-helical structures [23]. This observation indicates that a correct representation of local interactions has a critical influence both on local geometry and on the correct description of secondary structure. Therefore, deriving correct functional forms of the torsional and correlation potentials should considerably improve both the accuracy of local structure and the reproduction of secondary structure and, consequently, significantly increase the capability of UNRES to predict tertiary structure. Moreover, our recent research demonstrated that the torsional and correlation potentials should be split into backbone and backbone-and-side-chain contributions; the potentials used in the current UNRES encompass the backbone and Cβ atoms, while the side-chain contributions are only partially represented by additional torsional potentials [22].
Replacing the current functional forms of the torsional and correlation potentials should also enhance the capability of UNRES to model protein dynamics and to predict protein structure without the use of databases. Furthermore, the formalism introduced in our recent work [23] enables us to derive physics-based formulas for the local side-chain-interaction potentials (which depend on the side-chain α and β angles of Figure 1) that take into account chirality change; this is important in simulations of systems in which this process takes place (e.g., in αA-crystallin [5], where it is the cause of senile cataract).

Protein simulations with UNRES are carried out by means of Langevin dynamics. Because the variables are the Cα···Cα and Cα···SC virtual-bond vectors, the full inertia matrix appears in the equations of motion [6], which restricts the size of a system to about 1300 amino-acid residues because the required memory scales with the square of the system size. Changing the variables to the Cartesian coordinates of the Cα and Cβ atoms will reduce the inertia matrix to the symmetric pentadiagonal (quindiagonal) form, the inversion [1] and diagonalization [20] of which require memory that scales linearly with system size. We have already implemented the cut-off on long-range interactions in UNRES [21]; therefore, with the change of coordinate representation, simulations of large systems will become feasible.
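The linear memory scaling can be illustrated with a minimal sketch. A symmetric pentadiagonal n × n matrix is stored as three diagonals (about 3n numbers instead of n²), and a linear system with it is solved by a banded LDLᵀ factorization that never forms the full matrix. This is a generic textbook scheme, not the routine of [1] and not UNRES code; the function names are hypothetical, and the matrix is assumed to be positive definite (as an inertia matrix is), so no pivoting is performed.

```python
def sym_pentadiagonal_matvec(d, e, f, x):
    """Multiply a symmetric pentadiagonal matrix by x.

    d: main diagonal (length n), e: first off-diagonal (length n-1),
    f: second off-diagonal (length n-2).
    """
    n = len(d)
    y = [d[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += e[i] * x[i + 1]
        y[i + 1] += e[i] * x[i]
    for i in range(n - 2):
        y[i] += f[i] * x[i + 2]
        y[i + 2] += f[i] * x[i]
    return y

def solve_sym_pentadiagonal(d, e, f, b):
    """Solve A x = b for symmetric positive-definite pentadiagonal A.

    Uses A = L D L^T with unit lower-triangular L of bandwidth 2;
    only O(n) numbers are stored.
    """
    n = len(d)
    D = [0.0] * n
    l1 = [0.0] * n  # l1[i] = L[i+1][i]
    l2 = [0.0] * n  # l2[i] = L[i+2][i]
    for i in range(n):
        D[i] = d[i]
        if i >= 1:
            D[i] -= l1[i - 1] ** 2 * D[i - 1]
        if i >= 2:
            D[i] -= l2[i - 2] ** 2 * D[i - 2]
        if i + 1 < n:
            off = e[i]
            if i >= 1:
                off -= l2[i - 1] * l1[i - 1] * D[i - 1]
            l1[i] = off / D[i]
        if i + 2 < n:
            l2[i] = f[i] / D[i]
    # Forward substitution: L y = b.
    y = list(b)
    for i in range(n):
        if i >= 1:
            y[i] -= l1[i - 1] * y[i - 1]
        if i >= 2:
            y[i] -= l2[i - 2] * y[i - 2]
    # Diagonal scaling followed by back substitution: L^T x = D^{-1} y.
    x = [y[i] / D[i] for i in range(n)]
    for i in range(n - 1, -1, -1):
        if i + 1 < n:
            x[i] -= l1[i] * x[i + 1]
        if i + 2 < n:
            x[i] -= l2[i] * x[i + 2]
    return x
```

Both the storage and the operation count grow linearly with n, in contrast to the n² storage of a full inertia matrix.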

[1] G. Engeln-Müllges and F. Uhlig. Numerical Algorithms with Fortran. Springer, 1996.
[2] E. I. Golas, G. G. Maisuradze, P. Senet, S. Ołdziej, C. Czaplewski, H. A. Scheraga, and A. Liwo. Simulation of the opening and closing of Hsp70 chaperones by coarse-grained molecular dynamics. J. Chem. Theory Comput., 8:1334-1343, 2012.
[3] Y. He, M. A. Mozolewska, P. Krupa, A. K. Sieradzan, T. K. Wirecki, A. Liwo, K. Kachlishvili, S. Rackovsky, D. Jagieła, R. Ślusarz, C. R. Czaplewski, S. Ołdziej, and H. A. Scheraga. Lessons from application of the UNRES force field to predictions of structures of CASP10 targets. Proc. Natl. Acad. Sci. U.S.A., 110:14936-14941, 2013.
[4] R. Hills and C. L. Brooks. Insights from coarse-grained Gō-like models for protein folding and dynamics. Int. J. Mol. Sci., 10:889-905, 2009.
[5] M. Y. S. Hooi, M. J. Raftery, and R. J. W. Truscott. Age-dependent racemization of serine residues in a human chaperone protein. Prot. Sci., 22:93-100, 2013.
[6] M. Khalili, A. Liwo, F. Rakowski, P. Grochowski, and H. A. Scheraga. Molecular dynamics with the united-residue model of polypeptide chains. I. Lagrange equations of motion and tests of numerical stability in the microcanonical mode. J. Phys. Chem. B, 109:13785-13797, 2005.
[7] S. Kmiecik, D. Gront, M. Kolinski, L. Wieteska, A. E. Dawid, and A. Kolinski. Coarse-grained protein models and their applications. Chem. Rev., 116:7898-7936, 2016.
[8] A. Kolinski. Protein modeling and structure prediction with a reduced representation. Acta Biochim. Pol., 51:349-371, 2004.
[9] A. Kolinski and J. Skolnick. Discretized model of proteins. I. Monte Carlo study of cooperativity in homopolypeptides. J. Chem. Phys., 97:9412-9426, 1992.
[10] P. Krupa, M. A. Mozolewska, M. Wiśniewska, Y. Yin, Y. He, A. K. Sieradzan, R. Ganzynkowicz, A. G. Lipska, A. Karczyńska, M. Ślusarz, R. Ślusarz, A. Giełdoń, C. Czaplewski, D. Jagieła, B. Zaborowski, H. A. Scheraga, and A. Liwo. Performance of protein-structure predictions with the physics-based UNRES force field in CASP11. Bioinformatics, 32:3270-3278, 2016.
[11] R. Kubo. Generalized cumulant expansion method. J. Phys. Soc. Japan, 17:1100-1120, 1962.
[12] K. Lindorff-Larsen, S. Piana, R. O. Dror, and D. E. Shaw. How fast-folding proteins fold. Science, 334:517-520, 2011.
[13] A. Liwo, M. Baranowski, C. Czaplewski, E. Gołaś, Y. He, D. Jagieła, P. Krupa, M. Maciejczyk, M. Makowski, M. A. Mozolewska, A. Niadzvedtski, S. Ołdziej, H. A. Scheraga, A. K. Sieradzan, R. Ślusarz, T. Wirecki, Y. Yin, and B. Zaborowski. A unified coarse-grained model of biological macromolecules based on mean-field multipole-multipole interactions. J. Mol. Model., 20:2306, 2014.
[14] A. Liwo, C. Czaplewski, S. Ołdziej, A. V. Rojas, R. Kaźmierkiewicz, M. Makowski, R. K. Murarka, and H. A. Scheraga. Simulation of protein structure and dynamics with the coarse-grained UNRES force field. In G. Voth, editor, Coarse-Graining of Condensed Phase and Biomolecular Systems, chapter 8, pages 1391-1411. Taylor & Francis Group, LLC, 2008.
[15] A. Liwo, C. Czaplewski, J. Pillardy, and H. A. Scheraga. Cumulant-based expressions for the multibody terms for the correlation between local and electrostatic interactions in the united-residue force field. J. Chem. Phys., 115:2323-2347, 2001.
[16] G. G. Maisuradze, P. Senet, C. Czaplewski, A. Liwo, and H. A. Scheraga. Investigation of protein folding by coarse-grained molecular dynamics with the UNRES force field. J. Phys. Chem. A, 114:4471-4485, 2010.
[17] J. B. O. Mitchell and J. Smith. D-amino acid residues in peptides and proteins. Proteins: Struct. Funct. Bioinf., 50:563-571, 2003.
[18] L. Monticelli, S. K. Kandasamy, X. Periole, R. G. Larson, D. P. Tieleman, and S. J. Marrink. The MARTINI coarse-grained force field: Extension to proteins. J. Chem. Theory Comput., 4:819-834, 2008.
[19] M. A. Mozolewska, P. Krupa, H. A. Scheraga, and A. Liwo. Molecular modeling of the binding modes of the iron-sulfur protein to the Jac1 co-chaperone from Saccharomyces cerevisiae by all-atom and coarse-grained approaches. Proteins: Struct. Funct. Bioinf., 83:1414-1426, 2015.
[20] W. A. Sentance and I. P. Cliff. The determination of eigenvalues of symmetric quindiagonal matrices. Computer J., 24:177-179, 1981.
[21] A. K. Sieradzan. Introduction of periodic boundary conditions into UNRES force field. J. Comput. Chem., 36:940-946, 2015.
[22] A. K. Sieradzan, P. Krupa, H. A. Scheraga, A. Liwo, and C. Czaplewski. Physics-based potentials for the coupling between backbone- and side-chain-local conformational states in the united residue (UNRES) force field for protein simulations. J. Chem. Theory Comput., 11:817-831, 2015.
[23] A. K. Sieradzan, M. Makowski, A. Augustynowicz, and A. Liwo. A general method for the derivation of the functional forms of the effective energy terms in coarse-grained energy functions of polymers. I. Backbone potentials of coarse-grained polypeptide chains. J. Chem. Phys., 146:124106, 2017.
[24] A. Tramontano. The ten most wanted solutions in protein bioinformatics. Taylor & Francis, Boca Raton, 2005.
[25] R. J. W. Truscott, K. L. Schey, and M. G. Friedrich. Old proteins in man: a field in its infancy. Trends in Biochemical Sciences, 41:654-664, 2016.
[26] B. Zaborowski, D. Jagieła, C. Czaplewski, A. Hałabis, A. Lewandowska, W. Żmudzińska, S. Ołdziej, A. Karczyńska, C. Omieczynski, T. Wirecki, and A. Liwo. A maximum-likelihood approach to force-field calibration. J. Chem. Inf. Model., 55:2050-2070, 2015.
[27] R. Zhou, G. G. Maisuradze, D. Sunol, T. Todorovski, M. J. Macias, Y. Xiao, H. A. Scheraga, C. Czaplewski, and A. Liwo. Folding kinetics of WW domains with the united residue force field for bridging microscopic motions and experimental measurements. Proc. Natl. Acad. Sci. U.S.A., 111:18243-18248, 2014.
