Representations of Materials for Machine Learning
Abstract
High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by a machine learning model. Datasets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and property of interests. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of machine learning models. Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus, require further investigation.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.