Unsupervised particle sorting for cryo-EM using probabilistic PCA
Abstract
Single-particle cryo-electron microscopy (cryo-EM) is a leading technology for resolving the structure of molecules. Early in the process, the user detects potential particle images in the raw data. Typically, there are many false detections as a result of high levels of noise and contamination. Currently, removing the false detections requires human intervention to sort hundreds of thousands of images. We propose a statistically established unsupervised algorithm to remove non-particle images. We model the particle images as lying in a union of low-dimensional subspaces, and assume that non-particle images are arbitrarily scattered in the high-dimensional space. The algorithm is based on an extension of the probabilistic PCA framework that robustly learns a non-linear union-of-subspaces model. This provides a flexible model for cryo-EM data and allows automatic removal of images that correspond to pure noise and contamination. Numerical experiments corroborate the effectiveness of the sorting algorithm.
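The modeling idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it fits a single subspace with the standard closed-form maximum-likelihood probabilistic PCA solution (no robust estimation, no union of subspaces) and scores each image by its negative log-likelihood under that model. The synthetic data, the function names `ppca_fit` and `ppca_neg_log_likelihood`, the latent dimension `q`, and the assumption that the contamination fraction is known are all hypothetical choices made for illustration.

```python
import numpy as np

def ppca_fit(X, q):
    """Closed-form maximum-likelihood PPCA fit with q latent dimensions."""
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    # Eigenvalues of the sample covariance from the SVD of the centered data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = np.zeros(d)
    eigvals[: len(s)] = s**2 / n
    # Noise variance: average of the discarded eigenvalues.
    sigma2 = eigvals[q:].mean()
    # Loading matrix (d x q): principal directions scaled by excess variance.
    W = Vt[:q].T * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def ppca_neg_log_likelihood(X, mu, W, sigma2):
    """Per-sample negative log-likelihood under N(mu, W W^T + sigma2 I)."""
    d = X.shape[1]
    C = W @ W.T + sigma2 * np.eye(d)
    Xc = X - mu
    _, logdet = np.linalg.slogdet(C)
    maha = np.einsum("ij,ij->i", Xc @ np.linalg.inv(C), Xc)
    return 0.5 * (d * np.log(2 * np.pi) + logdet + maha)

# Synthetic stand-in for flattened images (assumption: not real cryo-EM data).
rng = np.random.default_rng(0)
d, q, n_in, n_out = 50, 3, 500, 50
basis, _ = np.linalg.qr(rng.standard_normal((d, q)))  # orthonormal subspace basis
inliers = 5.0 * rng.standard_normal((n_in, q)) @ basis.T
inliers += 0.1 * rng.standard_normal((n_in, d))        # small additive noise
outliers = 3.0 * rng.standard_normal((n_out, d))       # scattered in all directions
X = np.vstack([inliers, outliers])
truth = np.r_[np.zeros(n_in, dtype=bool), np.ones(n_out, dtype=bool)]

mu, W, sigma2 = ppca_fit(X, q)
scores = ppca_neg_log_likelihood(X, mu, W, sigma2)

# Assumption: the contamination fraction is known, so flag the top n_out scores.
threshold = np.quantile(scores, 1.0 - n_out / len(X))
flagged = scores > threshold
```

In this toy setting the scattered points receive much larger scores than the points near the subspace, so a simple threshold separates them. Real data would not come with a known contamination fraction, and a non-robust fit like this one can be distorted by heavy contamination, which is the gap the robust, multi-subspace extension described in the abstract is meant to address.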