Clustering of molecular dynamics trajectories via peak-picking in multidimensional PCA-derived distributions

Abstract

We describe a robust, fast, and memory-efficient procedure that can cluster millions of structures derived from molecular dynamics simulations. At its core, the method applies a peak-picking algorithm to three- and five-dimensional distributions of the principal components derived from the trajectory, and it automatically supports both Cartesian and dihedral PCA-based clustering. The density threshold required for identifying isolated peaks (which correspond to discrete clusters) is determined by applying a variance-explained criterion, which allows for a completely automated clustering procedure with no user intervention. In this communication we describe the algorithm and present some of the results obtained by applying the method as implemented in the molecular dynamics analysis programs carma, grcarma, and cluster5D. We conclude with a discussion of the limitations and possible pitfalls of this method.
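
The peak-picking idea summarized above can be illustrated with a minimal sketch. It is not the implementation found in carma, grcarma, or cluster5D. It assumes the principal-component coordinates of each frame are already available as tuples, and it replaces the variance-explained criterion with a fixed, user-supplied density cutoff. The function name `pick_peaks` and the parameters `bins` and `threshold_fraction` are hypothetical and chosen only for illustration.

```python
from collections import Counter, deque
from itertools import product


def pick_peaks(points, bins=10, threshold_fraction=0.2):
    """Assign a cluster label to every point (0 means unassigned).

    points: sequence of equal-length tuples, e.g. the first three
            principal components of each trajectory frame.
    bins, threshold_fraction: illustrative knobs (assumptions), not
            values or criteria taken from the paper.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        # Map a point to the index of the grid cell that contains it.
        idx = []
        for d in range(dims):
            span = (highs[d] - lows[d]) or 1.0
            i = int((p[d] - lows[d]) / span * bins)
            idx.append(min(i, bins - 1))
        return tuple(idx)

    # Build the multidimensional histogram (the "distribution").
    cells = [cell_of(p) for p in points]
    density = Counter(cells)

    # Keep only cells whose population reaches the density cutoff.
    cutoff = threshold_fraction * max(density.values())
    dense = {c for c, n in density.items() if n >= cutoff}

    # Flood-fill connected groups of dense cells; each isolated group
    # is treated as one peak, i.e. one cluster.
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    label_of = {}
    next_label = 0
    for seed in sorted(dense, key=lambda c: (-density[c], c)):
        if seed in label_of:
            continue
        next_label += 1
        label_of[seed] = next_label
        queue = deque([seed])
        while queue:
            current = queue.popleft()
            for o in offsets:
                neighbor = tuple(a + b for a, b in zip(current, o))
                if neighbor in dense and neighbor not in label_of:
                    label_of[neighbor] = next_label
                    queue.append(neighbor)

    # Frames that fall in sub-threshold cells stay unassigned (label 0).
    return [label_of.get(c, 0) for c in cells]
```

Because only the occupied grid cells are stored, memory use in this sketch scales with the number of populated cells rather than with the number of frames squared, which is the general reason a histogram-based approach can handle very long trajectories. Raising `threshold_fraction` separates peaks more aggressively at the cost of leaving more frames unassigned.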
