A Pre-Docking Filter Based on Image Recognition
Abstract
Molecular docking is a central method in the computer-based screening of compound libraries as part of the rational approach to drug design. Although the method has proved capable of predicting binding modes correctly, its inherent complexity places high demands on computational resources. Moreover, the chemical space to be screened is prohibitively large. Filtering prior to docking is therefore a promising concept. We implemented a pre-docking filter based on the tangent distance algorithm, originally conceived for optical character recognition. The challenging transfer of the method from two-dimensional to three-dimensional data was achieved by representing the molecular structure as a set of density maps extracted from different views of the compound. Additionally, our program applies a binary classification using principal component analysis. Ligand and binding pocket are aligned according to their centroidal axes, enabling size-based filtering that enriches the dataset in ligands before docking. The evaluation of our program via redocking produced RMSD values between 8 and 25, indicating that the tangent distance approach is not suited for optimizing the orientation of a ligand and binding pocket. Investigating possible explanations led to the conclusion that a likely cause of these results is the method's known inability to approximate large transformations. The principal component analysis alone performed better in validation: tests on a dataset of 170 ligands and 6,435 decoys yielded a sensitivity of 0.81 while keeping the runtime within a reasonable range (1 to 4 seconds). The dataset's enrichment increased from 2.64% to 2.82%.
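To make the size-based filtering step more concrete, the following is a minimal sketch of how principal component analysis can be used to compare the extents of a ligand and a binding pocket along their principal axes. It is not the authors' implementation: the function names `principal_extents` and `fits_pocket`, the `tolerance` parameter, and the rule "every ligand extent must fit within the matching pocket extent" are assumptions chosen for illustration only.

```python
import numpy as np


def principal_extents(coords):
    """Return the extents of a 3D point cloud along its principal axes, largest variance first."""
    coords = np.asarray(coords, dtype=float)
    # Move the centroid to the origin so the axes pass through it.
    centered = coords - coords.mean(axis=0)
    # The eigenvectors of the covariance matrix are the principal axes.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    axes = eigvecs[:, order]
    # Express the points in the principal-axis frame and measure their span.
    projected = centered @ axes
    return projected.max(axis=0) - projected.min(axis=0)


def fits_pocket(ligand_coords, pocket_coords, tolerance=1.0):
    """Hypothetical size filter: keep a ligand only if each of its principal
    extents is no larger than the matching pocket extent plus a tolerance."""
    ligand = principal_extents(ligand_coords)
    pocket = principal_extents(pocket_coords)
    return bool(np.all(ligand <= pocket + tolerance))
```

Because the extents are measured in each point cloud's own principal-axis frame, the comparison does not depend on how the ligand or pocket is initially oriented, which is what makes such a check cheap enough to run before docking. The coordinates could be atom positions for the ligand and pocket-lining atoms or grid points for the pocket; that choice is likewise an assumption here.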