© Copyright American Institute of Physics. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the American Institute of Physics.

THE JOURNAL OF CHEMICAL PHYSICS 126, 244111 (2007)

Dihedral angle principal component analysis of molecular dynamics simulations

Alexandros Altis, Phuong H. Nguyen, Rainer Hegger, and Gerhard Stock a)
Institute of Physical and Theoretical Chemistry, J. W. Goethe University, Max-von-Laue-Strasse 7, D-60438 Frankfurt, Germany

(Received 14 March 2007; accepted 11 May 2007; published online 29 June 2007)

It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles $\{\phi_n\}$ to the metric coordinate space $\{x_n = \cos\phi_n,\ y_n = \sin\phi_n\}$ was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which $N$ angular variables naturally lead to $N$ eigenvalues and eigenvectors.
Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given. © 2007 American Institute of Physics. [DOI: 10.1063/1.2746330]

I. INTRODUCTION

Classical molecular dynamics (MD) simulations have become a popular and powerful method in describing the structure, dynamics, and function of biomolecules in microscopic detail [1]. As MD simulations produce a considerable amount of data (i.e., $3M$ coordinates of all $M$ atoms for each time step), there has been an increasing interest in developing methods to extract the "essential" information from the trajectory. For example, one often wants to represent the molecule's free energy surface (the "energy landscape" [2–4]) as a function of a few important coordinates (the "reaction coordinates"), which describe the essential physics of a biomolecular process such as protein folding or molecular recognition. The reduction of the dimensionality from $3M$ atom coordinates to a few collective degrees of freedom is therefore an active field of theoretical research [5–28].

Principal component analysis [5] (PCA), also called quasiharmonic analysis or essential dynamics method [6–9], is one of the most popular methods in systematically reducing the dimensionality of a complex system. The approach is based on the covariance matrix, which provides information on the two-point correlations of the system. The PCA represents a linear transformation that diagonalizes the covariance matrix and thus removes the instantaneous linear correlations among the variables.
Ordering the eigenvalues of the transformation decreasingly, it has been shown that a large part of the system's fluctuations can be described in terms of only a few principal components, which may serve as reaction coordinates [6–12].

a) Electronic mail: stock@theochem.uni-frankfurt.de

Recently, it has been suggested to employ internal (instead of Cartesian) coordinates in a PCA [13–19]. In biomolecules, in particular, the consideration of dihedral angles appears appealing, because other internal coordinates such as bond lengths and bond angles usually do not undergo changes of large amplitudes. Studying the reversible folding and unfolding of pentaalanine in explicit water, Mu et al. [17] showed that a PCA using Cartesian coordinates did not yield the correct rugged free energy landscape due to an artifact of the mixing of internal and overall motion. As internal coordinates naturally provide a correct separation of internal and overall dynamics, they proposed a method, referred to as dihedral angle principal component analysis (dPCA), which is based on the dihedral angles $(\phi_n, \psi_n)$ of the peptide backbone. To avoid the problems arising from the circularity of these variables, a transformation from the space of dihedral angles $\{\phi_n\}$ to a linear metric coordinate space (i.e., a vector space with the usual Euclidean distance) was built up by the trigonometric functions $\sin\phi_n$ and $\cos\phi_n$. In a recent comment [29] to Ref. 17, the concern was raised that the dPCA method may lead to spurious results because of the inherent constraints ($\sin^2\phi_n + \cos^2\phi_n = 1$) of the formulation. While it is straightforward to show that the problem described in Ref. 29 was caused by numerical artifacts due to insufficient sampling [30], the discussion nevertheless demonstrates the need for a thorough general analysis of the dPCA.

In this work, we present a comprehensive account of
various theoretical issues underlying the dPCA method. We start with a brief introduction to the circular statistics of angle variables, discuss the transformation from an angle to the unit circle proposed in Ref. 17, and demonstrate that the transformation amounts to a one-to-one representation of the original angle distribution. Adopting the $(\phi, \psi)$ distribution of trialanine as a simple but nontrivial example, the properties of the dPCA are discussed in detail. In particular, it is shown that in this case the dPCA results are equivalent to the results of a Cartesian PCA and that the dPCA eigenvectors may be characterized in terms of the corresponding conformational changes of the peptide. Furthermore, we introduce a complex-valued version of the dPCA, which provides new insights on the PCA of circular variables. Adopting a 300 ns MD simulation of the folding of decaalanine, we conclude with a critical comparison of the various methods.

II. CIRCULAR STATISTICS

Dihedral angles $\phi \in [0^\circ, 360^\circ[$ represent circular (or directional) data [31]. Unlike the case of regular data $x \in\ ]-\infty, \infty[$, the definition of a metric is not straightforward, which makes it difficult to calculate distances or means. For example, the regular data $x_1 = 10$ and $x_2 = 350$ clearly give $\Delta x = |x_2 - x_1| = 340$ and $\langle x\rangle = (10 + 350)/2 = 180$. A visual inspection of the corresponding angles $\phi_1 = 10^\circ$ and $\phi_2 = 350^\circ$, on the other hand, readily shows that $\Delta\phi = 20^\circ \neq |\phi_2 - \phi_1|$ and $\langle\phi\rangle = 0^\circ \neq (\phi_1 + \phi_2)/2$. To recover the standard rules of calculating distances and the mean, we may assume that $\phi \in [-180^\circ, 180^\circ[$.
Then $\phi_1 = 10^\circ$ and $\phi_2 = -10^\circ$, and we obtain $\Delta\phi = |\phi_2 - \phi_1| = 20^\circ$ and $\langle\phi\rangle = (\phi_1 + \phi_2)/2 = 0^\circ$. This example manifests the general property that, if the range of angles covered by the data set is smaller than $180^\circ$, we may simply shift the origin of the angle coordinates to the middle of this range and perform standard statistics.

The situation is more involved for "true" circular data whose range exceeds $180^\circ$. This is the case for folding biomolecules, since the angle $\psi$ of the peptide backbone is typically distributed as $\psi_\alpha \approx -60^\circ \pm 30^\circ$ (for $\alpha_R$ helical conformations) and $\psi_\beta \approx 140^\circ \pm 30^\circ$ (for $\beta$ extended conformations). If the values of the angles can be described by a normal distribution, one may employ the von Mises distribution [31], which represents the circular statistics' equivalent of the normal distribution for regular data. However, this method is not applicable to the description of conformational transitions, since the corresponding dihedral angle distributions can typically only be described by multipeaked probability densities.

A general approach to circular statistics is obtained by representing the angle $\phi$ by its equivalent vector $(x, y)$ on the unit circle. This amounts to the transformation

$$\phi \mapsto \begin{cases} x = \cos\phi \\ y = \sin\phi. \end{cases} \tag{1}$$

Unlike the periodic range of the angle coordinate $\phi$, the vectors $(x, y)$ are defined in a linear space, which means that we can define the usual Euclidean metric $\Delta^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2$ between any two vectors $(x_1, y_1)^T$ and $(x_2, y_2)^T$. The distance of two angles with an actual small distance, e.g., $\phi_1 = 179^\circ$ and $\phi_2 = -179^\circ$, is given by a small $\Delta$ in the $(x, y)$ space, since the corresponding vectors lie close on the unit circle. Hence, the problem of periodicity is circumvented. Furthermore, the vector representation of the angles allows us to unambiguously calculate mean values and other quantities.
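The two numerical examples of this section (the pair 10° and 350°, and the pair 179° and −179°) can be checked with a few lines of Python. This is only an illustrative sketch using the standard library; the function names `to_unit_circle`, `chord_distance`, and `circular_mean` are hypothetical and not taken from the paper.

```python
import math

def to_unit_circle(angle_deg):
    """Map an angle in degrees to its vector (x, y) on the unit circle."""
    a = math.radians(angle_deg)
    return math.cos(a), math.sin(a)

def chord_distance(angle1_deg, angle2_deg):
    """Euclidean distance between the unit-circle vectors of two angles."""
    x1, y1 = to_unit_circle(angle1_deg)
    x2, y2 = to_unit_circle(angle2_deg)
    return math.hypot(x1 - x2, y1 - y2)

def circular_mean(angles_deg):
    """Mean angle in degrees, in [-180, 180], from the averaged (x, y) vector."""
    vectors = [to_unit_circle(a) for a in angles_deg]
    mean_x = sum(x for x, _ in vectors) / len(vectors)
    mean_y = sum(y for _, y in vectors) / len(vectors)
    # atan2 picks the correct quadrant, which a bare arctangent would not.
    return math.degrees(math.atan2(mean_y, mean_x))

naive_mean = (10 + 350) / 2                 # arithmetic mean of the raw numbers
vector_mean = circular_mean([10, 350])      # mean direction on the circle
wrap_distance = chord_distance(179, -179)   # small, although |179 - (-179)| = 358
```

Here `naive_mean` is 180, whereas `vector_mean` vanishes up to rounding error, and `wrap_distance` equals the chord length 2 sin(1°) rather than anything close to 358.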
For example, to evaluate the mean of the angles $\phi_n$, one simply calculates the sum of the corresponding vector components and then determines the mean angle by [31]

$$\tan\langle\phi\rangle = \frac{\langle y\rangle}{\langle x\rangle} = \frac{\sum_n \sin\phi_n}{\sum_n \cos\phi_n}. \tag{2}$$

Although the vector representation of angles in Eq. (1) appears straightforward and intuitively appealing, it has the peculiar property of doubling the variables: Given $N$ angle coordinates $\phi_n$, we obtain $2N$ Cartesian-type coordinates $(x_n, y_n)$. In the example given in Eq. (2), this does not lead to any problems, because at the end of the calculation we are able to calculate back from the averaged vector coordinates to the original angle coordinate, that is, the correctly averaged angle. Since Eq. (1) represents a nonlinear transformation, however, we will see that obtaining the peptide's angles in a direct way after a dPCA treatment of the data is not possible in general (see below). In this case, a subsequent analysis needs to be performed.

Since these coordinates are to be employed for the description of peptide energy landscapes, the question arises whether the resulting representation preserves the characteristics of the original energy landscapes. In particular, it is of interest whether the number and structure of minima and transition states are preserved in the $2N$-dimensional $(x_n, y_n)$ space. To answer these questions and to illustrate the properties of transformation (1), we consider a simple one-dimensional example described by the angular probability density [see Fig. 1(a)],

$$\rho(\phi) = \frac{1}{2\pi}\,(1 - \cos 4\phi), \tag{3}$$

with $\phi \in [-180^\circ, 180^\circ[$. By construction, the density exhibits four maxima at $\phi = \pm 45^\circ, \pm 135^\circ$. Employing transformation (1), we obtain the corresponding probability density on a circle of unit radius,

$$\rho(x, y) = \frac{8x^2(1 - x^2)}{\pi}\,\delta(x^2 + y^2 - 1). \tag{4}$$

The density plot of $\rho(x, y)$ displayed in Fig. 1(b) demonstrates that transformation (1) simply wraps the angular density $\rho(\phi)$ around the circumference of the unit circle.
Hence, all features of $\rho(\phi)$ are faithfully represented by $\rho(x, y)$, particularly the number and the structure of extrema. This is a consequence of the fact that transformation (1) is a bijection, which uniquely assigns each angle a corresponding vector $(x, y)$ and vice versa. We observe that this desirable feature is not obtained if we transform to only a single Cartesian-type variable, $x$ or $y$. The corresponding densities,

$$\rho(x) = \frac{8}{\pi}\,x^2\sqrt{1 - x^2}, \tag{5}$$

$$\rho(y) = \frac{8}{\pi}\,y^2\sqrt{1 - y^2}, \tag{6}$$

are also shown in Fig. 1(b). As a consequence of the projection onto the $x$ or $y$ axis, each density exhibits only two instead of four maxima.

FIG. 1. (A) Angular density $\rho(\phi) = (1/2\pi)(1 - \cos 4\phi)$. (B) Representation of $\rho(\phi)$ through its probability density $\rho(x, y)$ on the unit circle (artificial width added for a better visualization). Also shown are the densities $\rho(x)$ and $\rho(y)$, which display the angular densities along the single Cartesian-type variables $x$ and $y$, respectively. Note that only $\rho(x, y)$ reproduces the correct number of extrema of $\rho(\phi)$.

The above described properties of the one-dimensional example readily generalize to the $N$-dimensional case, $\phi_n \mapsto (x_n, y_n)$. In direct generalization of the unit circle, the data points $(x_n, y_n)$ are distributed on the surface of a $2N$-dimensional sphere with radius $\sqrt{N}$. This is because the distance of every data point $(x_1, y_1, \ldots, x_N, y_N)$ to the origin equals $(x_1^2 + y_1^2 + \cdots + x_N^2 + y_N^2)^{1/2} = (1 + \cdots + 1)^{1/2} = \sqrt{N}$. Since the transformation represents a bijection, there is a one-to-one correspondence between states in the $N$-dimensional angular space and in the $2N$-dimensional vector space. Again, the Euclidean metric of the $2N$-dimensional vector space guarantees that mean values and other quantities can be calculated easily.

We note in passing that, alternatively to transformation (1), one may employ a complex representation $z_n = e^{i\phi_n}$ of the angles. As Euler's formula $e^{i\phi} = \cos\phi + i\sin\phi$ provides a direct correspondence between the $2N$-dimensional real vectors $(x_1, y_1, \ldots, x_N, y_N)^T$ and the $N$-dimensional complex vectors $(z_1, \ldots, z_N)^T$, all considerations performed above can also be done using the complex representation. We will explore this idea in more detail in Sec. VI.

III. DIHEDRAL ANGLE PRINCIPAL COMPONENT ANALYSIS (dPCA)

Principal component analysis (PCA) is a well-established method in reducing the dimensionality of a high-dimensional data set [5]. In the case of molecular dynamics of $M$ atoms, the basic idea is that the correlated internal motions are represented by the covariance matrix,

$$\sigma_{ij} = \langle (q_i - \langle q_i\rangle)(q_j - \langle q_j\rangle)\rangle, \tag{7}$$

where $q_1, \ldots, q_{3M}$ are the mass-weighted Cartesian coordinates of the molecule and $\langle\ldots\rangle$ denotes the average over all sampled conformations [6–9]. By diagonalizing the covariance matrix $\sigma$ we obtain $3M$ eigenvectors $v^{(i)}$ and eigenvalues $\lambda_i$, which are rank ordered descendingly, i.e., $\lambda_1$ represents the largest eigenvalue. The eigenvectors and eigenvalues of $\sigma$ yield the modes of collective motion and their amplitudes, respectively. The principal components,

$$V_i = v^{(i)} \cdot q, \tag{8}$$

of the data $q = (q_1, \ldots, q_{3M})^T$ can be used, for example, to represent the free energy surface of the system. Restricting ourselves to two dimensions, we obtain

$$\Delta G(V_1, V_2) = -k_B T\,[\ln\rho(V_1, V_2) - \ln\rho_{\max}], \tag{9}$$

where $\rho$ is an estimate of the probability density function obtained from a histogram of the data. $\rho_{\max}$ denotes the maximum of the density, which is subtracted to ensure that $\Delta G = 0$ for the lowest free energy minimum.
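The free energy estimate of Eq. (9) can be sketched directly from a two-dimensional histogram. The following is a minimal illustration, assuming equally sized square bins and energies in kJ/mol; the function name `free_energy_2d`, the bin width, and the toy input values are hypothetical choices, not the procedure used in the paper.

```python
import math
from collections import Counter

K_B = 0.008314462618  # Boltzmann constant in kJ/(mol K), i.e. the molar gas constant

def free_energy_2d(v1, v2, bin_width=0.1, temperature=300.0):
    """Return {(i, j): dG} with dG = -k_B T [ln rho - ln rho_max] per occupied bin.

    v1 and v2 hold the values of two principal components for each frame.
    Empty bins are simply absent from the result (their dG would be infinite).
    """
    counts = Counter(
        (math.floor(a / bin_width), math.floor(b / bin_width))
        for a, b in zip(v1, v2)
    )
    max_count = max(counts.values())
    kt = K_B * temperature
    # The histogram normalisation cancels in the ratio rho / rho_max,
    # so raw bin counts can be used in place of the density.
    return {cell: -kt * math.log(n / max_count) for cell, n in counts.items()}

# Toy input: four frames in one bin and two frames in a neighbouring bin.
landscape = free_energy_2d([0.05] * 4 + [0.15] * 2, [0.05] * 6)
```

By construction the most populated bin has zero free energy, and a bin with half its population lies higher by k_B T ln 2.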
The basic idea of the dPCA proposed in Ref. 17 is to perform the PCA on sin- and cos-transformed dihedral angles,

$$q_{2n-1} = \cos\phi_n, \qquad q_{2n} = \sin\phi_n, \tag{10}$$

where $n = 1, \ldots, N$ and $N$ is the total number of peptide backbone and side-chain dihedral angles used in the analysis. Hence the covariance matrix [Eq. (7)] of the dPCA uses $2N$ variables $q_n$. The question then is whether the combination of the nonlinear transformation [Eq. (10)] and the subsequent PCA still gives a unique and faithful representation of the initial angular data $\phi_n$.

Let us first consider the above discussed example of a one-dimensional angular density $\rho(\phi) = (1/2\pi)(1 - \cos 4\phi)$, which is mapped via transformation (10) on the two-dimensional density on the unit circle $\rho(x, y) = [8x^2(1 - x^2)/\pi]\,\delta(x^2 + y^2 - 1)$, where $x = q_1 = \cos\phi$ and $y = q_2 = \sin\phi$. Since in this case $\langle x\rangle = \langle y\rangle = \langle xy\rangle = 0$ and $\langle x^2\rangle = \langle y^2\rangle = \tfrac{1}{2}$, we find that the covariance matrix is diagonal with $\sigma_{11} = \sigma_{22} = \tfrac{1}{2}$. That is, we have degenerate eigenvalues $\lambda_{1/2} = \tfrac{1}{2}$ and may choose any two orthonormal vectors as eigenvectors. Choosing, e.g., the unit vectors $e_x$ and $e_y$, the PCA leaves the density $\rho(x, y)$ invariant, which, as discussed above, is a unique and faithful representation of the initial angular density $\rho(\phi)$.

In general, one does not obtain a diagonal covariance matrix for a one-dimensional angular density $\rho(\phi)$ [e.g., for $\rho(\phi) = 1/(2\pi) + \tfrac{1}{9}\cos\phi + \tfrac{1}{9}\sin\phi$ we obtain $\sigma_{12} = -\pi^2/81 \neq 0$]. A sufficient condition for a diagonal covariance matrix for an $N$-dimensional angular density is that the latter factorizes into one-dimensional densities [i.e., $\rho(\phi_1, \ldots, \phi_N) = \rho(\phi_1)\rho(\phi_2)\cdots\rho(\phi_N)$] and that $\langle\cos\phi_n\rangle = 0$ or $\langle\sin\phi_n\rangle = 0$ for all $n = 1, \ldots, N$. In these trivial cases, the dPCA method simply reduces to transformation (10).
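The covariance values quoted for the two one-dimensional densities can be verified by numerical integration. The sketch below uses a midpoint rule on a uniform grid and takes the second density to be ρ(φ) = 1/(2π) + (cos φ + sin φ)/9; the function names and the grid size are hypothetical choices made for this illustration.

```python
import math

def sincos_covariance(density, n_grid=3600):
    """Covariance of (x, y) = (cos phi, sin phi) for a density on [-pi, pi[.

    Integrates with the midpoint rule and returns (sigma_11, sigma_22, sigma_12).
    """
    d_phi = 2.0 * math.pi / n_grid
    mean_x = mean_y = mean_xx = mean_yy = mean_xy = 0.0
    for k in range(n_grid):
        phi = -math.pi + (k + 0.5) * d_phi
        weight = density(phi) * d_phi          # probability mass of this grid cell
        x, y = math.cos(phi), math.sin(phi)
        mean_x += weight * x
        mean_y += weight * y
        mean_xx += weight * x * x
        mean_yy += weight * y * y
        mean_xy += weight * x * y
    return (mean_xx - mean_x ** 2,
            mean_yy - mean_y ** 2,
            mean_xy - mean_x * mean_y)

def four_peak(phi):
    """The four-peaked example density (1/2pi)(1 - cos 4 phi)."""
    return (1.0 - math.cos(4.0 * phi)) / (2.0 * math.pi)

def tilted(phi):
    """Counterexample density with nonzero mean cosine and sine."""
    return 1.0 / (2.0 * math.pi) + (math.cos(phi) + math.sin(phi)) / 9.0

cov_four_peak = sincos_covariance(four_peak)   # expected: diagonal, both entries 1/2
cov_tilted = sincos_covariance(tilted)         # expected: off-diagonal -pi^2/81
```

For the second density the off-diagonal element arises entirely from the product of the two nonzero means, since the mixed moment of cos φ sin φ vanishes.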
FIG. 2. (Color) (A) Ramachandran $(\phi, \psi)$ probability distribution of Ala3 in water as obtained from a 100 ns MD simulation. Performing a dPCA, the resulting free energy landscape along the first two principal components is shown in (B); the $(\phi, \psi)$ distributions pertaining to the labeled energy minima are shown in (C). Panels (D) and (E) show the corresponding results obtained for a Cartesian PCA. Panel (F) displays the $(\theta_1, \theta_2)$ distribution obtained from the complex dPCA.

IV. A SIMPLE EXAMPLE

The simplest nontrivial case of a dPCA occurs for a two-dimensional correlated angular density. As an example, we adopt trialanine whose conformation can be characterized by a single pair of $(\phi, \psi)$ backbone dihedral angles. Trialanine (Ala3) in aqueous solution is a model peptide which has been the subject of numerous experimental [32–35] and computational [36–38] studies. To generate the angular distribution of $(\phi, \psi)$ of trialanine, we performed a 100 ns MD simulation at 300 K. We used the GROMACS program suite [39,40], the GROMOS96 force field 43a1 [41], the simple point charge (SPC) water model [42], and a particle-mesh Ewald [43] treatment of the electrostatics. Details of the simulation can be found in Ref. 37. Figure 2(a) shows the $(\phi, \psi)$ distribution obtained from the simulation, which predicts that mainly three conformational states are populated: the right-handed helix conformation $\alpha_R$ (15%), the extended conformation $\beta$ (39%), and the poly-L-proline II ($P_{II}$) helixlike conformation (42%). Although recent experimental data [35] indicate that the simulation overestimates the populations of $\alpha_R$ and $\beta$, we nevertheless adopt the MD data as a simple yet nontrivial example to illustrate the performance of the dPCA method.

Performing the dPCA on the $(\phi, \psi)$ data, we consider the four variables $q_1 = \cos\phi$, $q_2 = \sin\phi$, $q_3 = \cos\psi$, and $q_4 = \sin\psi$.
Diagonalization of the resulting covariance matrix yields four principal components $V_1, \ldots, V_4$, which contribute 51%, 24%, 15%, and 10% to the overall fluctuations of the system, respectively. To characterize the principal components, Fig. 3 shows their one-dimensional probability densities. Only the first two distributions are found to exhibit multiple peaks, while the other two are approximately unimodal. Hence we may expect that the conformational states shown by the angular distribution of $(\phi, \psi)$ in Fig. 2(a) can be accounted for by the first two principal components.

FIG. 3. (Color online) Probability densities of the four principal components obtained from the sin/cos (full lines) and the complex (dashed lines) dPCA of trialanine, respectively.

If we assume that $V_1$ and $V_2$ are independent [i.e., $\rho(V_1, V_2) = \rho(V_1)\rho(V_2)$], the three peaks found for $\rho(V_1)$ as well as for $\rho(V_2)$ give rise to $3 \times 3 = 9$ peaks of $\rho(V_1, V_2)$. To identify possible correlations, Fig. 2(b) shows the two-dimensional density along the first two principal components. For the sake of better visibility, we have chosen a logarithmic representation, thus showing the free energy landscape [Eq. (9)] of the system. The figure exhibits three (instead of nine) well-defined minima labeled S1, S2, and S3, revealing that the first two principal components are indeed strongly dependent. To identify the corresponding three conformational states, we have back-calculated the $(\phi, \psi)$ distributions of the minima from the trajectory [44]. As shown in Fig. 2(c) as well as by Table I, the minima S1, S2, and S3 clearly correspond to $P_{II}$, $\beta$, and $\alpha_R$, respectively. A closer analysis reveals that fine details of the conformational distribution can also be discriminated by the first two principal components. For example, the shoulder on the left side of the $\alpha_R$ state in Fig. 2(a) corresponds to the region around $V_2 \approx -0.9$ of the S3 minimum. Moreover, the minor (3%) population of the left-handed helix conformation $\alpha_L$ at $\phi \approx 60^\circ$ corresponds to the small orange region (outside of the square) of the S1 minimum.

TABLE I. Conformational states $P_{II}$, $\beta$, and $\alpha_R$ of trialanine in water, characterized by their population probability $P$ and the average dihedral angles $(\phi, \psi)$. The results from the dPCA and the Cartesian PCA are compared to reference data obtained directly from the MD simulation.

            MD data                      dPCA                         Cartesian PCA
State       P (%)   (φ, ψ) (deg)         P (%)   (φ, ψ) (deg)         P (%)   (φ, ψ) (deg)
P_II        42      −67, 132             45      −63, 131             47      −64, 132
β           39      −121, 131            40      −121, 131            38      −122, 130
α_R         15      −75, −45             16      −74, −46             16      −75, −46

It is instructive to compare the above results obtained by the dPCA to the outcome of a standard PCA using Cartesian coordinates. Restricting the analysis to the atoms CONH–CHCH3–CONH around the central $(\phi, \psi)$ dihedral angles of trialanine, the first four principal components contribute 47%, 28%, 15%, and 8% to the overall fluctuations, respectively, and exhibit one-dimensional probability densities that closely resemble the ones obtained by the dPCA (data not shown). Figure 2(d) shows the resulting free energy surface along the first two principal components, which looks quite similar to the dPCA result. The three minima S1′, S2′, and S3′ are identified in Fig. 2(e) as the conformational states $P_{II}$, $\beta$, and $\alpha_R$. Again, the details of the conformational distribution such as the $\alpha_L$ state are also resolved by the first two principal components.
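The sin/cos dPCA workflow applied to trialanine above can be sketched with NumPy as follows. This is not the authors' code: the function name `sincos_dpca` is hypothetical, and the input is synthetic data (three Gaussian clusters placed at the average angles and populations of Table I, with an assumed width of 15°) standing in for the MD trajectory, so the resulting numbers will differ from those quoted in the text.

```python
import numpy as np

def sincos_dpca(angles_deg):
    """sin/cos dPCA for angles of shape (n_frames, n_angles), in degrees.

    Returns the eigenvalues (descending), the eigenvectors (as columns),
    and the principal components V_k for every frame.
    """
    phi = np.radians(np.asarray(angles_deg, dtype=float))
    n_frames, n_angles = phi.shape
    q = np.empty((n_frames, 2 * n_angles))
    q[:, 0::2] = np.cos(phi)                 # q_{2n-1} = cos(phi_n)
    q[:, 1::2] = np.sin(phi)                 # q_{2n}   = sin(phi_n)
    dq = q - q.mean(axis=0)
    cov = dq.T @ dq / n_frames               # covariance matrix of the 2N variables
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # V_k = v^(k) . q; projecting the uncentered q only shifts each V_k by a constant.
    pcs = q @ eigvecs
    return eigvals, eigvecs, pcs

# Synthetic stand-in for the trajectory (assumption, see lead-in).
rng = np.random.default_rng(0)
centers = np.array([[-67.0, 132.0], [-121.0, 131.0], [-75.0, -45.0]])
weights = np.array([0.42, 0.39, 0.15])
state = rng.choice(3, size=5000, p=weights / weights.sum())
angles = centers[state] + rng.normal(scale=15.0, size=(5000, 2))

eigvals, eigvecs, pcs = sincos_dpca(angles)
fractions = eigvals / eigvals.sum()          # share of each component in the total fluctuation
```

The array `fractions` plays the role of the percentage contributions quoted for the principal components, and histogramming `pcs[:, 0]` against `pcs[:, 1]` gives the density that enters the free energy landscape.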
In summary, it has been shown that both the Cartesian PCA and the dPCA reproduced the correct conformational distribution of the MD trajectory of trialanine. In both cases, the first two principal components were sufficient to resolve most details. Although only four coordinates were used, the dPCA was found to be equivalent to the Cartesian PCA using 33 coordinates.

V. INTERPRETATION OF EIGENVECTORS

In the simple example above, Fig. 2 demonstrates that the first two principal components $V_1$ and $V_2$ (or, equivalently, the first two eigenvectors $v^{(1)}$ and $v^{(2)}$) are associated with motions along the $\psi$ and the $\phi$ dihedral angles, respectively. In the case of the Cartesian PCA, the structural changes of the molecule along the principal components are readily illustrated, even for high-dimensional systems. From

$$V_i = v^{(i)} \cdot q = v_1^{(i)} q_1 + v_2^{(i)} q_2 + v_3^{(i)} q_3 + \ldots + v_{3M-2}^{(i)} q_{3M-2} + v_{3M-1}^{(i)} q_{3M-1} + v_{3M}^{(i)} q_{3M},$$

we see that, e.g., the first three components $v_1^{(i)}$, $v_2^{(i)}$, and $v_3^{(i)}$ of the eigenvector $v^{(i)}$ simply reflect the influence of the $x$, $y$, and $z$ coordinates of the first atom on the $i$th principal component. Hence,

$$\Delta_1^{(i)} = (v_1^{(i)})^2 + (v_2^{(i)})^2 + (v_3^{(i)})^2 \tag{11}$$

is a suitable measure of this influence. The quantities $\Delta_2^{(i)}, \ldots, \Delta_M^{(i)}$ are defined analogously. In the dPCA, the principal components are given by

$$V_k = v^{(k)} \cdot q = v_1^{(k)} \cos\phi_1 + v_2^{(k)} \sin\phi_1 + \ldots + v_{2N-1}^{(k)} \cos\phi_N + v_{2N}^{(k)} \sin\phi_N. \tag{12}$$

In direct analogy to Eq. (11), we may define

$$\Delta_1^{(k)} = (v_1^{(k)})^2 + (v_2^{(k)})^2 \tag{13}$$

as a measure of the influence of angle $\phi_1$ on the principal component $V_k$ (and similarly $\Delta_2^{(k)}, \ldots, \Delta_N^{(k)}$ for the other angles). The definition implies that $\sum_n \Delta_n^{(k)} = 1$, since the length of each eigenvector is 1. Hence $\Delta_n^{(k)}$ can be considered as the percentage of the effect of the angle $\phi_n$ on the principal component $V_k$. Furthermore, Eq. (12) assures that only structural rearrangements along angles with nonzero $\Delta_n^{(k)}$ may change the value of $V_k$.

To demonstrate the usefulness of definition (13), we again invoke our example of trialanine with angles $\phi$ ($n = 1$) and $\psi$ ($n = 2$) and consider the quantities $\Delta_n^{(k)}$ describing the effect of these angles on the four principal components ($k = 1, \ldots, 4$), see Fig. 4. We clearly see that the dihedral angle $\phi$ has almost no influence on $V_1$ ($\Delta_1^{(1)} \approx 0$), whereas $\psi$ has a very large one ($\Delta_2^{(1)} \approx 1$). As a consequence, the first principal component allows us to separate conformations with a different angle $\psi$ but does not separate conformations which differ in $\phi$. Indeed, Fig. 2(b) reveals that $V_1$ accounts essentially for the $\alpha \leftrightarrow \beta/P_{II}$ transition along $\psi$, but hardly separates conformations with different $\phi$, such as $\beta$ and $P_{II}$. Considering the second principal component $V_2$, we obtain $\Delta_1^{(2)} \approx 1$ and $\Delta_2^{(2)} \approx 0$. This is again in agreement with Fig. 2(b), which shows that the second principal component accounts essentially for transitions along $\phi$. Recalling that $V_1$, $V_2$, $V_3$, and $V_4$ contribute 51%, 24%, 15%, and 10% to the overall fluctuations, respectively, the $\beta \leftrightarrow P_{II}$ transitions described by the second principal component represent a much smaller conformational change than the $\alpha \leftrightarrow \beta/P_{II}$ transitions described by $V_1$.

FIG. 4. (Color online) Influence of the dihedral angles $\phi$ (black bars) and $\psi$ (gray bars) on the principal component $V_k$ ($k = 1, \ldots, 4$) of the cos/sin dPCA of trialanine. Shown are the quantities $\Delta_1^{(k)}$ (for $\phi$) and $\Delta_2^{(k)}$ (for $\psi$) defined in Eq. (13), representing the percentage of the effect of the two dihedral angles on $V_k$. Also shown are the contributions (in %) of each principal component to the overall fluctuations of the system.

Similarly, although the $\Delta_n^{(k)}$ of the third and
fourth principal components are quite similar to the previous ones, they only account for fluctuations within a conformational state and are therefore of minor importance in a conformational analysis.

VI. COMPLEX DPCA

Alternatively to the sin/cos transformation in Eq. (10) which maps $N$ angles on $2N$ real numbers, one may also transform from the angles $\phi_n$ to the complex numbers

$$z_n = e^{i\phi_n} \quad (n = 1, \ldots, N), \tag{14}$$

which give an $N$-dimensional complex vector $z = (z_1, z_2, \ldots, z_N)^T$. In what follows, we develop a dPCA based on this complex data ("complex dPCA") and discuss its relation to the real-valued dPCA ("sin/cos dPCA") considered above.

The covariance matrix pertaining to the complex variables $z_n$ is defined as

$$C_{mn} = \langle (z_m - \langle z_m\rangle)(z_n^* - \langle z_n^*\rangle)\rangle, \tag{15}$$

with $m, n = 1, \ldots, N$, and $z^*$ being the complex conjugate of $z$. Being in principle an observable quantity, $C$ is a Hermitian matrix with $N$ real-valued eigenvalues $\mu_n$ and $N$ complex eigenvectors $w^{(n)}$,

$$C\,w^{(n)} = \mu_n w^{(n)}, \tag{16}$$

where the eigenvectors are unique up to a phase $\theta_0$. We define the complex principal components to be

$$W_n = w^{(n)T} z = r_n e^{i(\theta_n + \theta_0)}, \tag{17}$$

where we use vector-vector multiplication instead of a Hermitian inner product (see Appendix for details).

Two nice features of the complex dPCA are readily evident. First, the complex representation of $N$ angular variables directly results in $N$ eigenvalues and eigenvectors; that is, there is no doubling of variables as in the sin/cos dPCA. Second, the representation of the complex principal components by their weights $r_n$ and angles $\theta_n$ in Eq. (17) may facilitate their direct interpretation in terms of simple physical variables.

From Euler's formula $e^{i\phi} = \cos\phi + i\sin\phi$, one would expect an evident correspondence between the sin/cos and the complex dPCA. That is, there should be a relation between the $N$ complex eigenvectors $w^{(n)}$ and the $2N$ real eigenvectors $v^{(k)}$. Furthermore, the $N$ real eigenvalues $\mu_n$ of the complex dPCA should be related to the $2N$ real eigenvalues $\lambda_k$ of the sin/cos dPCA. However, this general correspondence turned out to be less obvious than expected (see Appendix), and we were only able to find an analytical relation in some limiting cases. In these cases, one indeed may construct suitably normalized eigenvectors $w^{(n)}$ such that the real and imaginary parts of the resulting principal components $W_n$ of the complex dPCA are equal to the $2N$ principal components $V_k$ of the sin/cos dPCA. In other words, for every $n \in \{1, \ldots, N\}$ there are two indices $k_n, k_n' \in \{1, \ldots, 2N\}$ such that

$$\mathrm{Re}\,W_n = V_{k_n}, \qquad \mathrm{Im}\,W_n = V_{k_n'}, \tag{18}$$

and the union of the indices $k_n, k_n'$ gives the complete set $\{1, \ldots, 2N\}$. Moreover, the eigenvalues $\mu_n$ of the complex dPCA are given by the sum of the two corresponding eigenvalues $\lambda_{k_n}$ and $\lambda_{k_n'}$ of the sin/cos dPCA,

$$\mu_n = \lambda_{k_n} + \lambda_{k_n'}. \tag{19}$$

Apart from the limiting cases of completely uncorrelated and completely correlated variables, we could not establish general conditions under which Eqs. (18) and (19) hold. Empirically, Eq. (19) was always satisfied, while Eq. (18) was found to hold in many (but not all) cases under consideration, see Figs. 3 and 7 below. We note that even in numerical studies it may be cumbersome to establish the correspondences, since the accuracy of Eqs. (18) and (19) depends on the number of data points one uses to calculate the covariance matrices in both methods, i.e., on the overall sampling of the MD trajectory.

To demonstrate the performance of the complex dPCA, we first apply it to the above discussed example of trialanine. Comparing the $2N = 4$ eigenvalues of the sin/cos dPCA $\lambda_1, \ldots, \lambda_4$ to the two eigenvalues $\mu_1$ and $\mu_2$ of the complex dPCA, we obtain

$$\mu_1 = 0.630 = 0.489 + 0.141 = \lambda_1 + \lambda_3,$$
$$\mu_2 = 0.338 = 0.237 + 0.101 = \lambda_2 + \lambda_4,$$

that is, Eq. (19) is fulfilled.
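A compact NumPy sketch of the complex dPCA and of the eigenvalue comparison is given here for illustration. It is an assumption-laden example rather than the authors' implementation: the function names are hypothetical, and the input is synthetic three-cluster data, not the trialanine trajectory. Note that only the equality of the two eigenvalue sums is guaranteed in general (both equal the trace of the respective covariance matrix, i.e., the summed variances of all cosines and sines); the pairwise relation between individual eigenvalues is the empirical observation reported in the text.

```python
import numpy as np

def complex_dpca(angles_deg):
    """Complex dPCA for angles of shape (n_frames, n_angles), in degrees."""
    phi = np.radians(np.asarray(angles_deg, dtype=float))
    z = np.exp(1j * phi)                       # z_n = exp(i phi_n)
    dz = z - z.mean(axis=0)
    # C_mn = <(z_m - <z_m>)(z_n^* - <z_n^*>)>, a Hermitian matrix
    cov = dz.T @ dz.conj() / len(z)
    mu, w = np.linalg.eigh(cov)                # real eigenvalues, ascending
    order = np.argsort(mu)[::-1]
    mu, w = mu[order], w[:, order]
    W = z @ w                                  # plain product w^T z, no conjugation
    return mu, w, W

def sincos_eigenvalues(angles_deg):
    """Eigenvalues (descending) of the sin/cos covariance matrix."""
    phi = np.radians(np.asarray(angles_deg, dtype=float))
    # Column order (all cosines, then all sines) does not affect the eigenvalues.
    q = np.concatenate([np.cos(phi), np.sin(phi)], axis=1)
    cov = np.cov(q, rowvar=False, bias=True)   # bias=True divides by n_frames
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

# Synthetic stand-in data (assumption): three clusters in two angles.
rng = np.random.default_rng(1)
centers = np.array([[-67.0, 132.0], [-121.0, 131.0], [-75.0, -45.0]])
state = rng.choice(3, size=5000)
angles = centers[state] + rng.normal(scale=15.0, size=(5000, 2))

mu, w, W = complex_dpca(angles)
lam = sincos_eigenvalues(angles)
r, theta = np.abs(W), np.angle(W)              # weights and angles of each W_n
```

Printing `mu` next to the pairwise sums of the entries of `lam` shows how closely the pairing holds for a given data set, while `mu.sum()` and `lam.sum()` agree to rounding error.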
Choosing suitable normalization constants for the complex eigenvectors, we furthermore find the correspondence

$$\mathrm{Re}\, W_1 \approx V_1, \quad \mathrm{Re}\, W_2 \approx V_2, \quad \mathrm{Im}\, W_1 \approx V_3, \quad \mathrm{Im}\, W_2 \approx V_4.$$

As shown by the probability densities of the principal components in Fig. 3, both formulations lead to virtually identical principal components.

Finally, it is interesting to study whether the representation of the complex principal components by their weights $r_n$ and angles $\theta_n$ in Eq. (17) facilitates their interpretation. In the case of our trialanine data, it turns out that the weights are approximately constant, i.e., $r_1 \approx r_2 \approx 1$. Hence, the probability distribution of the two angles $(\theta_1, \theta_2)$ contains all the conformational fluctuations of the data. Indeed, Fig. 2 reveals that the distribution of $(\theta_1, \theta_2)$ is almost identical to the original $(\phi, \psi)$ density from the MD simulation. In this simple case, the complex dPCA has obviously managed to completely identify the underlying structure of the data.

VII. ENERGY LANDSCAPE OF DECAALANINE

We finally wish to present an example which demonstrates the potential of the dPCA method to represent the true multidimensional energy landscape of a folding biomolecule. Following earlier work on the folding of alanine peptides (Refs. 17, 28, and 35), we choose decaalanine (Ala10) in aqueous solution. Employing conditions similar to those used for trialanine described above [GROMOS96 force field 43a1 (Ref. 41), SPC water model (Ref. 42), and particle-mesh Ewald (Ref. 43) treatment of the electrostatics], we ran a 300 ns trajectory of Ala10 at 300 K and saved the coordinates every 0.4 ps for analysis.

FIG. 5. (Color) Free energy landscapes of Ala10 in water as obtained from a 300 ns MD simulation. The first column, (A) and (B), shows the results along the first four principal components obtained from a Cartesian PCA, the second column, (C) and (D), the corresponding landscapes calculated from the sin/cos dPCA. Panels (E)–(H) display the landscapes along the angles $(\theta_1, \theta_2)$ and $(\theta_3, \theta_4)$ and the weights $(r_1, r_2)$ and $(r_3, r_4)$ of the complex dPCA, respectively.

Let us first consider the free energy landscape $\Delta G$ [Eq. (9)] obtained from a PCA using all Cartesian coordinates of the system. The calculations of $\Delta G(V_1, V_2)$ and $\Delta G(V_3, V_4)$ presented in Figs. 5(a) and 5(b) show that the resulting energy landscape is rather unstructured and essentially single peaked, indicating a single folded state and a random ensemble of unfolded conformational states. However, as discussed in detail in Ref. 17 for the case of Ala5, this smooth appearance of the energy landscape in the Cartesian PCA merely represents an artifact of the mixing of internal and overall motion. This becomes clear when a sin/cos dPCA of the $N = 18$ inner backbone dihedral angles $\{\varphi_n\} = \{\psi_1, \phi_2, \psi_2, \ldots, \phi_9, \psi_9, \phi_{10}\}$ is performed. The resulting dPCA free energy surfaces $\Delta G(V_1, V_2)$ and $\Delta G(V_3, V_4)$ shown in Figs. 5(c) and 5(d) exhibit numerous well-separated minima, which correspond to specific conformational structures. By back-calculating from the dPCA free energy minima to the underlying backbone dihedral angles of all residues (Ref. 44), we are able to discriminate and characterize 15 such states (Ref. 45). The most populated ones are the all-$\alpha_R$ helical conformation (8%), a state (15%) with the inner seven residues in $\alpha_R$ (and the remaining residues in $\beta/P_{II}$), and two states (8% each) with six inner residues in $\alpha_R$.
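Free energy surfaces such as those discussed here are typically estimated from a two-dimensional histogram of a pair of principal components. The sketch below assumes that Eq. (9), which is not reproduced in this section, has the common form $\Delta G = -k_B T\,[\ln P - \ln P_{\max}]$. The function name `free_energy_2d`, the bin count, the energy unit (kJ/mol), and the Gaussian test data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def free_energy_2d(v1, v2, temperature=300.0, bins=50):
    """Estimate Delta G(v1, v2) = -k_B T ln(P / P_max) from samples.

    Assumed form of the free energy; empty bins are returned as +inf.
    """
    hist, xedges, yedges = np.histogram2d(v1, v2, bins=bins, density=True)
    with np.errstate(divide="ignore"):
        delta_g = -K_B * temperature * np.log(hist / hist.max())
    return delta_g, xedges, yedges


# Synthetic stand-in for two principal components of a trajectory.
rng = np.random.default_rng(1)
v1 = rng.normal(size=5000)
v2 = rng.normal(size=5000)

delta_g, xedges, yedges = free_energy_2d(v1, v2, bins=20)
print("finite bins:", np.isfinite(delta_g).sum(), "of", delta_g.size)
```

By construction, the most populated bin defines the zero of the free energy, and unvisited bins carry an infinite value, so they should be masked before plotting.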
Well-defined conformational states are also found in the unfolded part of the free energy landscape, revealing that the unfolded state of decaalanine is structured rather than random.

To obtain an interpretation of the $k$th principal component in terms of the dihedral angles $\varphi_n$, Fig. 6 shows the quantities $\Delta_n^{(k)}$ defined in Eq. (13), which describe the effect of these angles on the first two principal components. The first principal component $V_1$ is clearly dominated by motion along the $\psi$ angles (gray bars), while fluctuations of the $\phi$ angles (black bars) hardly contribute. Hence, going along $V_1$ we will find conformations which mainly differ in their $\psi$ angles. Considering the second principal component $V_2$, we find a dominant $\Delta_n^{(2)}$ for the angle $\psi_3$ (and a smaller value for $\psi_9$), revealing that $V_2$ mainly separates conformations that differ in $\psi_3$. Similarly, the $\Delta_n^{(k)}$ obtained for the next few principal components are dominated by the contribution of a single angle. For example, we find that $\Delta_n^{(3)}$, $\Delta_n^{(4)}$, $\Delta_n^{(5)}$, and $\Delta_n^{(6)}$ depend mostly on the angles $\psi_2$, $\psi_9$, $\psi_4$ (and $\psi_8$), and $\psi_5$, respectively (data not shown). Together with the percentage of the fluctuations (18%, 10%, 8%, 7%, 6%, and 5% for $V_1, \ldots, V_6$), the quantities $\Delta_n^{(k)}$ therefore give a quick and valuable interpretation of the conformational changes along the principal components $V_k$.

FIG. 6. (Color online) Influence of the 18 inner backbone dihedral angles $\{\varphi_n\} = \{\psi_1, \phi_2, \psi_2, \ldots, \phi_9, \psi_9, \phi_{10}\}$ on the first two principal components $V_1$ and $V_2$ of the sin/cos dPCA of Ala10. Shown are the quantities $\Delta_n^{(1)}$ (for $V_1$) and $\Delta_n^{(2)}$ (for $V_2$) defined in Eq. (13), representing the percentage of the effect of the dihedral angles on $V_k$. The black and gray bars correspond to the $\phi$ and $\psi$ angles, respectively. Also shown are the contributions (in %) of each principal component to the overall fluctuations of the system.

It is interesting to compare the above results to the outcome of a complex dPCA of the Ala10 trajectory. To check the similarity of the complex and the sin/cos dPCA in this case, Fig. 7 compares the distributions of the sin/cos principal components $V_k$ to the distributions of the corresponding principal components, $\mathrm{Re}\, W_n$ and $\mathrm{Im}\, W_n$, obtained from the complex dPCA using suitably normalized eigenvectors. Although we find good overall agreement, the correspondence [Eq. (18)] is not perfect in all cases (see Appendix).

FIG. 7. (Color online) Probability densities of the first six principal components obtained from the sin/cos (full lines) and the complex (dashed lines) dPCA of Ala10, respectively.

Finally, we wish to investigate whether the polar representation [Eq. (17)] of the complex principal components facilitates the interpretation of the energy landscape of Ala10. To this end, Figs. 5(e)–5(h) show the free energy surfaces (E) $\Delta G(\theta_1, \theta_2)$, (F) $\Delta G(\theta_3, \theta_4)$, (G) $\Delta G(r_1, r_2)$, and (H) $\Delta G(r_3, r_4)$. Similar to what was found for Ala3, the energy landscape shows only little structure along the weights $r_n$ (mainly along $r_1$), thus leaving the main information on the conformational states to the angles $\theta_n$ (mainly $\theta_2$, $\theta_3$, and $\theta_4$). A closer analysis reveals, e.g., that $\theta_2$ separates conformational states with a different dihedral angle $\psi_3$, while $\theta_3$ separates conformations with a different dihedral angle $\psi_2$. Unlike in the simpler case of trialanine, where the $(\theta_1, \theta_2)$ representation of the complex dPCA was found to directly reproduce the original $(\phi, \psi)$ distribution, the polar principal components of Ala10 appear to be equivalent to the results of the standard sin/cos dPCA.
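The polar representation of the complex principal components can be sketched numerically as follows. This assumes that Eq. (17), which is not reproduced in this section, writes each complex principal component as a weight times a phase factor, $W_n = r_n e^{i\theta_n}$. The function name `polar_components`, the projection of the uncentered variables $z = e^{i\varphi}$, and the Hermitian projection convention are illustrative assumptions; the Appendix uses its own eigenvector normalization and projection.

```python
import numpy as np


def polar_components(angles, n_components=2):
    """Return weights r_n and angles theta_n of complex principal components.

    angles: array of shape (T, N) in radians (hypothetical input layout).
    """
    z = np.exp(1j * angles)
    dz = z - z.mean(axis=0)
    cov = dz.T @ dz.conj() / len(dz)        # Hermitian covariance matrix
    mu, w = np.linalg.eigh(cov)             # real eigenvalues, ascending
    order = np.argsort(mu)[::-1][:n_components]
    W = z @ w[:, order].conj()              # assumed projection convention
    return np.abs(W), np.angle(W)


# Synthetic example with two independent angles (illustrative data only).
rng = np.random.default_rng(2)
angles = np.column_stack([
    rng.vonmises(0.0, 2.0, 2000),
    rng.vonmises(1.5, 0.5, 2000),
])
r, theta = polar_components(angles, n_components=2)
print("mean weights:", r.mean(axis=0))

# With a single angle, the one eigenvector has unit modulus, so r = 1 exactly.
r_single, _ = polar_components(angles[:, :1], n_components=1)
```

When each eigenvector is dominated by one angle, the weights stay close to one and the phases carry the conformational information, which is the situation described for trialanine.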
Roughly speaking, in both formulations we need about the same number of principal components to identify the same number of conformational states.

VIII. CONCLUSIONS

We have studied the theoretical foundations of the dPCA in order to clarify the validity and the applicability of the approach. In particular, we have shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can be characterized by the corresponding conformational changes of the peptide. Furthermore, we have investigated a complex version of the dPCA which sheds some light on the mysterious doubling of variables occurring in the sin/cos dPCA. One learns that $N$ angular variables can actually be represented by $N$ complex variables, which then naturally lead to $N$ eigenvalues and eigenvectors. Despite its similarity to the sin/cos dPCA, the complex dPCA might be advantageous because the representation of the complex principal components by their weights and angles may facilitate their direct interpretation in terms of simple physical variables.

To demonstrate the potential of the dPCA, we have applied it in the construction of the energy landscape of Ala10 from a 300 ns MD simulation. The resulting free energy surface exhibits numerous well-separated minima corresponding to specific conformational states, revealing that the unfolded state of decaalanine is structured rather than random. The smooth appearance of the energy landscape obtained from a PCA using Cartesian coordinates was found to be an artifact of the mixing of internal and overall motion. Hence the correct separation of internal and overall motion is essential for the construction and interpretation of the energy landscape of a biomolecule undergoing large structural rearrangements. Internal coordinates such as dihedral angles fulfill this requirement in a natural way.
Recently, several nonlinear approaches have been proposed (Refs. 25–28) which may account for nonlinear correlations not detected by a standard PCA. For example, it has been discussed in Ref. 26 that completely correlated motion such as two atoms oscillating in parallel direction but with a 90° phase shift is not monitored by a linear PCA, since $\langle \sin(t)\sin(t + \pi/2)\rangle = 0$. This geometrical artifact caused by the relative orientation of the atomic fluctuations was found to lead to a considerable (≈40%) underestimation of the correlation of protein motion (Ref. 26). Because of the use of dihedral angles and the inherent nonlinear transformation, the dPCA represents a nonlinear PCA with respect to Cartesian atomic coordinates and is therefore able to identify this type of fluctuation.

Furthermore, various methods have been suggested which allow for an identification of metastable conformational states (Refs. 12 and 21–24). By calculating the transition matrix that connects these states, one may then model the conformational dynamics of the system via a master-equation description. While the dPCA also allows us to calculate metastable conformational states and their transition matrix (Ref. 17), it moreover provides a way to represent the free energy landscape as well as all observables of the system in terms of well-defined collective coordinates (Ref. 46). This way the dPCA free energy surface can be used to perform (equilibrium or nonequilibrium) Langevin simulations of the molecular dynamics (Refs. 47 and 48) as well as a simulation using a nonlinear dynamic model (Ref. 28). As all quantities of interest can be converged to the desired accuracy by including more principal components, the approach avoids problems associated with the use of empirical order parameters (such as the number of native contacts) or low-dimensional reaction coordinates (such as the radius of gyration), which may lead to artifacts and an oversimplification of the free energy landscape (Ref. 49).

ACKNOWLEDGMENTS

The authors thank Yuguang Mu and Alessandra Villa for numerous inspiring and helpful discussions. This work has been supported by the Frankfurt Center for Scientific Computing, the Fonds der Chemischen Industrie, and the Deutsche Forschungsgemeinschaft.

APPENDIX: RELATION BETWEEN SIN/COS AND COMPLEX dPCA

The purpose of the appendix is to discuss the relations of the principal components [Eq. (18)] and the eigenvalues [Eq. (19)] between the sin/cos and the complex dPCA, respectively. To this end, we first establish a correspondence between the covariance matrices of the two formulations. Using Euler's formula, we express the matrix elements of the covariance matrix [Eq. (15)] as

$$C_{mn} = \langle (e^{i\varphi_m} - \langle e^{i\varphi_m}\rangle)(e^{-i\varphi_n} - \langle e^{-i\varphi_n}\rangle)\rangle = \mathrm{cov}(\cos\varphi_m, \cos\varphi_n) + \mathrm{cov}(\sin\varphi_m, \sin\varphi_n) - i\,\mathrm{cov}(\cos\varphi_m, \sin\varphi_n) + i\,\mathrm{cov}(\sin\varphi_m, \cos\varphi_n), \tag{A1}$$

where $\mathrm{cov}(a, b) = \langle ab\rangle - \langle a\rangle\langle b\rangle$. Without loss of generality (since the generalization is straightforward), we restrict ourselves in the following to the case of two angles ($N = 2$). Using Eq. (A1) and the definition of $\sigma$ [Eq. (7)] together with Eq. (10), it is easy to see that one can transform the sin/cos covariance matrix $\sigma$ into the complex covariance matrix $C$ according to

$$T \sigma T^{\dagger} = C, \tag{A2}$$

where

$$T = \begin{pmatrix} 1 & -i & 0 & 0 \\ 0 & 0 & 1 & -i \end{pmatrix}. \tag{A3}$$

Let us next derive Eqs. (18) and (19) for the limiting case of two uncorrelated angle variables. The resulting covariance matrix $\sigma$ of the sin/cos dPCA exhibits a block-diagonal structure with $2 \times 2$ blocks $A$ and $B$. Assuming that $(x_1, x_2)^T$ is an eigenvector of $A$ with eigenvalue $\lambda_1$, then, due to orthogonality, $(-x_2, x_1)^T$ is an eigenvector of $A$, too. Let its eigenvalue be $\lambda_2$. Analogously, let $(x_3, x_4)^T$ and $(-x_4, x_3)^T$ be the eigenvectors of $B$ with eigenvalues $\lambda_3$ and $\lambda_4$. It follows that

$$v^{(1)} = (x_1, x_2, 0, 0)^T, \quad v^{(2)} = (-x_2, x_1, 0, 0)^T, \quad v^{(3)} = (0, 0, x_3, x_4)^T, \quad v^{(4)} = (0, 0, -x_4, x_3)^T \tag{A4}$$

are eigenvectors of $\sigma$ with eigenvalues $\lambda_1, \ldots, \lambda_4$. Using Eq. (A2), it is now straightforward to verify that the eigenvectors $w^{(n)}$ of the complex dPCA can be defined as follows:

$$C w^{(1)} := C (x_1 - i x_2, 0)^T = (\lambda_1 + \lambda_2)\, w^{(1)} =: \mu_1 w^{(1)}, \qquad C w^{(2)} := C (0, x_3 - i x_4)^T = (\lambda_3 + \lambda_4)\, w^{(2)} =: \mu_2 w^{(2)}, \tag{A5}$$

which reveals the simple relation [Eq. (19)] between the eigenvalues $\lambda_k$ of the sin/cos dPCA and the eigenvalues $\mu_n$ of the complex dPCA. By comparing the principal components $W_n = w^{(n)T} z$ ($n = 1, 2$) and $V_k = v^{(k)} \cdot q$ ($k = 1, \ldots, 4$), we finally obtain the equality [Eq. (18)] of the principal components of the two formulations,

$$\mathrm{Re}\, W_1 = V_1, \quad \mathrm{Im}\, W_1 = V_2, \quad \mathrm{Re}\, W_2 = V_3, \quad \mathrm{Im}\, W_2 = V_4. \tag{A6}$$

We note that the above definition of the principal components $W_n$ is not equivalent to the projection $w^{(n)} \cdot z$ given by a Hermitian inner product. However, the appealingly simple relation [Eq. (18)] between the principal components of the two dPCA methods only holds when the $W_n$ are defined that way.

While a $2 \times 2$ block-diagonal structure of the sin/cos covariance matrix represents a sufficient condition, it is certainly not a necessary requirement to yield relations (18) and (19). In the case of trialanine, where the latter equations were satisfied to high accuracy (see Fig. 3), the covariance matrix was indeed approximately block diagonal. On the other hand, our second example, Ala10, also satisfied the equalities quite well (see Fig. 7), although its covariance matrix revealed only little block-diagonal structure. Finally, we found cases where the correspondence holds for covariance matrices that are not block diagonal at all.
For example, it can be shown that two completely correlated angle variables (say, $\varphi_1$ and $\varphi_2 = \varphi_1 + \mathrm{const}$) result in dPCA covariance matrices that satisfy Eqs. (18) and (19).

1 W. F. van Gunsteren, D. Bakowies, R. Baron et al., Angew. Chem., Int. Ed. 45, 4064 (2007).
2 J. N. Onuchic, Z. L. Schulten, and P. G. Wolynes, Annu. Rev. Phys. Chem. 48, 545 (1997).
3 K. A. Dill and H. S. Chan, Nat. Struct. Biol. 4, 10 (1997).
4 D. J. Wales, Energy Landscapes (Cambridge University Press, Cambridge, 2003).
5 I. T. Jolliffe, Principal Component Analysis (Springer, New York, 2002).
6 T. Ichiye and M. Karplus, Proteins 11, 205 (1991).
7 A. E. Garcia, Phys. Rev. Lett. 68, 2696 (1992).
8 A. Amadei, A. B. M. Linssen, and H. J. C. Berendsen, Proteins 17, 412 (1993).
9 S. Hayward, A. Kitao, F. Hirata, and N. Go, J. Mol. Biol. 234, 1207 (1993).
10 O. M. Becker, Proteins 27, 213 (1997).
11 O. F. Lange and H. Grubmüller, J. Phys. Chem. B 110, 22842 (2006).
12 F. Noe, D. Krachtus, J. C. Smith, and S. Fischer, J. Chem. Theory Comput. 2, 840 (2006).
13 R. Abseher and M. Nilges, J. Mol. Biol. 279, 911 (1998).
14 D. M. D. van Aalten, B. L. de Groot, J. B. C. Finday, H. J. C. Berendsen, and A. Amadei, J. Comput. Chem. 18, 169 (1997).
15 N. Elmaci and R. S. Berry, J. Chem. Phys. 110, 10606 (1999).
16 T. H. Reijmers, R. Wehrens, and L. M. C. Buydens, Chemom. Intell. Lab. Syst. 56, 61 (2001).
17 Y. Mu, P. H. Nguyen, and G. Stock, Proteins 58, 45 (2005).
18 G. E. Sims, I.-G. Choi, and S.-H. Kim, Proc. Natl. Acad. Sci. U.S.A. 102, 618 (2005).
19 J. Wang and R. Brüschweiler, J. Chem. Theory Comput. 2, 18 (2006).
20 B. Alakent, P. Doruker, and M. C. Camurdan, J. Chem. Phys. 121, 4756 (2004).
21 V. Schultheis, T. Hirschberger, H. Carstens, and P. Tavan, J. Chem. Theory Comput. 1, 515 (2005).
22 A. Ma and A. R. Dinner, J. Phys. Chem. B 109, 6769 (2005).
23 E. Meerbach, E. Dittmer, I. Horenko, and C. Schütte, Lect. Notes Phys. 703, 475 (2006).
24 J. D. Chodera, W. C. Swope, J. W. Pitera, and K. A. Dill, Multiscale Model. Simul. 5, 1214 (2006).
25 P. Das, M. Moll, H. Stamati, L. E. Kavraki, and C. Clementi, Proc. Natl. Acad. Sci. U.S.A. 103, 9885 (2006).
26 O. F. Lange and H. Grubmüller, Proteins 62, 1052 (2006).
27 P. H. Nguyen, Proteins 65, 898 (2006).
28 R. Hegger, A. Altis, P. H. Nguyen, and G. Stock, Phys. Rev. Lett. 98, 028102 (2007).
29 K. Hinsen, Proteins 64, 795 (2006).
30 Y. Mu, P. H. Nguyen, and G. Stock, Proteins 64, 798 (2006).
31 N. I. Fisher, Statistical Analysis of Circular Data (Cambridge University Press, Cambridge, 1996).
32 S. Woutersen and P. Hamm, J. Phys. Chem. B 104, 11316 (2000).
33 S. Woutersen, R. Pfister, P. Hamm, Y. Mu, D. Kosov, and G. Stock, J. Chem. Phys. 117, 6833 (2002).
34 R. Schweitzer-Stenner, F. Eker, Q. Huang, and K. Griebenow, J. Am. Chem. Soc. 123, 9628 (2001).
35 J. Graf, P. H. Nguyen, G. Stock, and H. Schwalbe, J. Am. Chem. Soc. 129, 1179 (2007).
36 Y. Mu and G. Stock, J. Phys. Chem. B 106, 5294 (2002).
37 Y. Mu, D. S. Kosov, and G. Stock, J. Phys. Chem. B 107, 5064 (2003).
38 S. Gnanakaran and A. E. Garcia, J. Phys. Chem. B 107, 12555 (2003).
39 H. J. C. Berendsen, D. van der Spoel, and R. van Drunen, Comput. Phys. Commun. 91, 43 (1995).
40 D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark, and H. J. C. Berendsen, J. Comput. Chem. 26, 1701 (2005).
41 W. F. van Gunsteren, S. R. Billeter, A. A. Eising, P. H. Hünenberger, P. Krüger, A. E. Mark, W. R. P. Scott, and I. G. Tironi, Biomolecular Simulation: The GROMOS96 Manual and User Guide (Vdf Hochschulverlag AG an der ETH Zürich, Zürich, 1996).
42 H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, and J. Hermans, in Intermolecular Forces, edited by B. Pullman (Reidel, Dordrecht, 1981), pp. 331–342.
43 T. Darden, D. York, and L. Petersen, J. Chem. Phys. 98, 10089 (1993).
44 A direct back-calculation of the dihedral angles is not possible. But since the time indices of the original trajectory and the principal components are identical, we can use these indices to identify corresponding dihedral angles.
45 Details of the identification of the metastable conformational states and their transition matrices are given in Ref. 17.
46 As the complete analysis is performed in the space of dihedral angle principal components, there is no need to invoke the Jacobian transformation between these coordinates and the atomic Cartesian coordinates (Ref. 50).
47 O. F. Lange and H. Grubmüller, J. Chem. Phys. 124, 214903 (2006).
48 S. Yang, J. N. Onuchic, and H. Levine, J. Chem. Phys. 125, 054910 (2006).
49 S. V. Krivov and M. Karplus, Proc. Natl. Acad. Sci. U.S.A. 101, 14766 (2004).
50 S. He and H. A. Scheraga, J. Chem. Phys. 108, 271 (1998).