Pair distribution function analysis for oxide defect identification through feature extraction and supervised learning
Abstract
Feature extraction and a neural network model are applied to predict the defect types and concentrations in experimental TiO2 samples. A dataset of TiO2 structures containing oxygen and titanium vacancies and interstitials is built, and the structures are relaxed by energy minimization. Features of the pair distribution functions (PDFs) calculated for these defective structures are extracted using linear methods (principal component analysis, non-negative matrix factorization) and non-linear methods (autoencoder, convolutional neural network). The extracted features serve as inputs to a neural network that maps the feature weights to the concentration of each defect type. The machine learning pipeline is validated by predicting defect concentrations from experimentally measured TiO2 PDFs and comparing the results to brute-force predictions. The autoencoder with a physics-based initialization achieves the highest accuracy in predicting the defect concentrations. The model combines physical interpretability with the ability to predict material properties, enabling more efficient materials characterization from scattering data.
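The two-stage pipeline described above (extract features from PDFs, then map feature weights to defect concentrations) can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions: the PDFs are synthetic toy curves produced by a hypothetical `toy_pdf` function, the peak positions and concentration ranges are illustrative numbers, feature extraction is principal component analysis computed through an SVD, and a linear least-squares fit stands in for the neural network regressor.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_pdf(r, conc):
    """Hypothetical stand-in for a calculated PDF (toy model, not real data).

    Two Gaussian peaks whose heights decrease linearly with two
    defect concentrations, conc[0] and conc[1].
    """
    peak1 = (1.0 - 2.0 * conc[0]) * np.exp(-((r - 1.95) ** 2) / 0.02)
    peak2 = (1.0 - 3.0 * conc[1]) * np.exp(-((r - 2.95) ** 2) / 0.02)
    return peak1 + peak2

# Synthetic training set: 300 toy PDFs with known defect concentrations.
r = np.linspace(1.0, 5.0, 200)
conc_train = rng.uniform(0.0, 0.1, size=(300, 2))
X_train = np.array([toy_pdf(r, c) for c in conc_train])

# Stage 1: PCA via SVD of the mean-centered PDFs.
mean_pdf = X_train.mean(axis=0)
_, _, vt = np.linalg.svd(X_train - mean_pdf, full_matrices=False)
components = vt[:2]  # keep the two leading principal components

def extract_features(pdfs):
    """Project PDFs onto the principal components (feature weights)."""
    return (pdfs - mean_pdf) @ components.T

# Stage 2: linear least-squares map from feature weights to concentrations.
# Assumption: this replaces the neural network used in the paper.
F_train = extract_features(X_train)
A_train = np.hstack([F_train, np.ones((len(F_train), 1))])  # add bias column
W, *_ = np.linalg.lstsq(A_train, conc_train, rcond=None)

def predict_concentrations(pdfs):
    """Predict defect concentrations for one or more PDFs."""
    F = extract_features(np.atleast_2d(pdfs))
    return np.hstack([F, np.ones((len(F), 1))]) @ W

# Predict on an unseen toy PDF with known concentrations.
conc_test = np.array([[0.03, 0.07]])
pred = predict_concentrations(toy_pdf(r, conc_test[0]))
print(pred)
```

Because the toy PDFs depend linearly on the concentrations, a linear map recovers them almost exactly. Real PDFs of relaxed defective structures need not respond linearly, which is one reason non-linear feature extractors and a neural network regressor are attractive for this task.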