Predicting second virial coefficients of organic and inorganic compounds using Gaussian Process Regression

Abstract

We show that intuitive and accessible molecular features suffice to predict the temperature-dependent second virial coefficient of organic and inorganic compounds using Gaussian process regression. In particular, we built a low-dimensional feature representation based on intrinsic molecular properties, topology, and physical properties relevant to the characterization of molecule-molecule interactions. With this featurization, the model predicts second virial coefficients in the interpolative regime with a relative error of 1\% and extrapolates to temperatures outside the training range of each compound in the dataset with a relative error of 2.14\%. Additionally, the model's predictive abilities were extended to organic molecules unseen in the training process, yielding a relative error of 2.66\%. Therefore, apart from being robust, the present Gaussian process regression model is extensible to a variety of organic and inorganic compounds.
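
To make the interpolative setting concrete, the following is a minimal sketch of Gaussian process regression on a temperature-dependent second virial coefficient. It is not the model described above: the only feature is a scaled temperature (the molecular features are omitted), the target is a toy van der Waals form B(T) = b - a/(RT) with illustrative constants `a` and `b` that are not fitted to any real compound, and the squared-exponential kernel, its length scale, and the jitter term are all assumptions made for illustration. All function names are hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol K)

def vdw_second_virial(T, a=0.2, b=4.0e-5):
    """Toy target: van der Waals B(T) = b - a/(R T), in cm^3/mol.
    a and b are illustrative constants, not data for a real compound."""
    return (b - a / (R * T)) * 1.0e6

def rbf(x1, x2, length_scale=1.0):
    """Squared-exponential kernel on a scalar feature (assumed kernel)."""
    return math.exp(-0.5 * (x1 - x2) ** 2 / length_scale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def cho_solve(L, rhs):
    """Solve (L L^T) x = rhs by forward and backward substitution."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Synthetic training set: 200 K to 600 K in 50 K steps, feature x = T / 100.
train_T = [200.0 + 50.0 * i for i in range(9)]
x_train = [T / 100.0 for T in train_T]
y_train = [vdw_second_virial(T) for T in train_T]

# A zero-mean GP is fitted to the mean-centred targets.
y_mean = sum(y_train) / len(y_train)
y_centred = [y - y_mean for y in y_train]

# Kernel matrix with a small jitter on the diagonal for numerical stability.
jitter = 1.0e-6
K = [[rbf(xi, xj) + (jitter if i == j else 0.0)
      for j, xj in enumerate(x_train)]
     for i, xi in enumerate(x_train)]
alpha = cho_solve(cholesky(K), y_centred)

def predict(T):
    """GP posterior mean of B(T) in cm^3/mol."""
    x_new = T / 100.0
    return y_mean + sum(a_i * rbf(x_i, x_new) for a_i, x_i in zip(alpha, x_train))

# Interpolation test at a temperature between two training points.
b_true = vdw_second_virial(375.0)
b_pred = predict(375.0)
rel_err = abs(b_pred - b_true) / abs(b_true)
print(f"B(375 K): predicted {b_pred:.3f}, reference {b_true:.3f}, relative error {rel_err:.2%}")
```

In practice one would likely rely on an established Gaussian process library and optimize the kernel hyperparameters rather than fixing them by hand as done here; the hand-rolled solver only keeps the sketch dependency-free.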
