Model Selection with Regression and Representational Similarity Analysis for Linear and Nonlinear Data
Abstract
In cognitive psychology and neuroscience, adjudicating between competing theoretical models is a common methodological challenge. Researchers often rely on either first-order direct mapping approaches (e.g., linear regression) or second-order abstraction methods (e.g., Representational Similarity Analysis [RSA]). However, it remains unclear whether or how the nature of the underlying data and feature characteristics affect the performance of these methods. Here, we systematically evaluated regression, RSA, and Pattern Component Modeling (PCM) across distinct data-generating schemes, including first-order linear mappings and geometry-to-first-order transformations with either linear or nonlinear sigmoid readouts, using both univariate behavioral and multivariate fMRI spatial-pattern simulations. Our results suggest that the relative performance of these methods depends on the underlying generative mechanism. Under linear generative assumptions, regression and PCM showed higher model-selection accuracy than RSA. Under nonlinear but order-preserving transformations, rank-based RSA showed an advantage over regression and PCM. We also found that feature multicollinearity affected these methods differently across generative schemes, and that orthogonalizing the predictor space via principal component analysis (PCA) reduced several collinearity-related differences. Finally, analyses of empirical datasets were consistent with the simulation results under approximately linear conditions, with regression showing clearer model discrimination than RSA. Overall, these findings suggest that the relative performance of regression, RSA, and PCM depends on the form of the mapping between features and responses, as well as on the structure of the feature space.
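The contrast between a first-order fit and a rank-based second-order comparison can be illustrated with a small toy simulation. The sketch below is not the paper's simulation pipeline: the item count, feature count, weights, noise level, and sigmoid gain are all assumed values, and the helper names (`r_squared`, `rdm`, `ranks`, `spearman`) are hypothetical. It shows only two things: that an ordinary least-squares fit favors the generating feature model under a linear mapping, and that a Spearman (rank) comparison of dissimilarities is unchanged by an order-preserving sigmoid, whereas a linear (Pearson) comparison is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_feat = 60, 4  # assumed sizes, chosen only for illustration

# Two hypothetical candidate feature models; model_a generates the data.
model_a = rng.normal(size=(n_items, n_feat))
model_b = rng.normal(size=(n_items, n_feat))

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    return 1.0 - resid.var() / y.var()

def rdm(X):
    """Upper triangle of the pairwise Euclidean distance matrix."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(X), k=1)]

def ranks(v):
    """Ordinal ranks (no tie correction; adequate for continuous values)."""
    order = np.argsort(v, kind="stable")
    return np.argsort(order, kind="stable").astype(float)

def pearson(a, b):
    return np.corrcoef(a, b)[0, 1]

def spearman(a, b):
    """Spearman correlation as the Pearson correlation of ranks."""
    return pearson(ranks(a), ranks(b))

# Scheme 1 (assumed): linear first-order mapping from features to a
# univariate response, plus Gaussian noise.
weights = np.array([1.0, -0.5, 0.8, 0.3])  # assumed weights
y = model_a @ weights + rng.normal(scale=0.5, size=n_items)
r2_true = r_squared(model_a, y)
r2_wrong = r_squared(model_b, y)

# Scheme 2 (assumed): observed dissimilarities are a monotone sigmoid
# of the model dissimilarities, i.e. nonlinear but order-preserving.
d_model = rdm(model_a)
z = (d_model - d_model.mean()) / d_model.std()
d_data = 1.0 / (1.0 + np.exp(-3.0 * z))  # gain of 3 is an assumption
rho_rank = spearman(d_model, d_data)
r_linear = pearson(d_model, d_data)

print(f"regression R^2: true model {r2_true:.3f}, wrong model {r2_wrong:.3f}")
print(f"sigmoid readout: Spearman {rho_rank:.3f}, Pearson {r_linear:.3f}")
```

In this toy setting the rank correlation stays at its maximum because the sigmoid preserves the ordering of dissimilarities, while the linear correlation drops. That is one plausible intuition for why a rank-based RSA could hold up under order-preserving nonlinearities, but it does not reproduce the paper's generative schemes, PCM, or the collinearity analyses.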