Exact sampling of molecules in chemical space

Abstract

Many machine-learning algorithms rely on the concept of molecular similarity, assuming that molecules with similar representations also share similar properties. In this work, we propose a new way to study similarity measures in molecular graph space using a Monte Carlo approach. The method samples directly from the underlying distribution of chemical space, without numerical approximations and without complete enumeration of molecular graphs, which is intractable for practically relevant sets of graphs. The Monte Carlo method reveals several fundamental properties of chemical space, such as a linear trend of average property derivatives in chemical space with respect to the property's value at the molecule of interest. The trend was observed for both extensive and intensive properties, suggesting that it is an inherent property of chemical space.
