Identification of galaxy shreds in large photometric catalogs using Convolutional Neural Networks
Abstract
Contamination from galaxy fragments that are catalogued as separate sources is a major issue in large photometric galaxy catalogs. In this paper, we show that this problem can be readily addressed with computer vision techniques. We use image cutouts to train a convolutional neural network (CNN) to identify catalogued sources that are in reality just star formation regions and/or shreds of larger galaxies. The CNN reaches an accuracy of ~98% on our testing datasets. We apply this CNN to galaxy catalogs from three of the largest surveys available today: the Sloan Digital Sky Survey (SDSS), the DESI Legacy Imaging Surveys, and the Panoramic Survey Telescope and Rapid Response System Survey (Pan-STARRS). We find that, even when strict selection criteria are used, all catalogs still show a ~5% level of contamination from galaxy shreds. Our CNN provides a simple yet effective way to remove these contaminants from galaxy catalogs.
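As an illustration of the final cleaning step, the sketch below shows how per-source classifier outputs could be used to flag shreds and estimate a catalog's contamination fraction. It is a minimal sketch, not the authors' pipeline: the `Source` record, the `p_shred` field, the 0.5 threshold, and the toy values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Source:
    source_id: str   # hypothetical catalog identifier
    p_shred: float   # hypothetical CNN output: probability the source is a shred


def split_catalog(sources, threshold=0.5):
    """Split sources into (clean, shreds) using an assumed probability threshold."""
    clean = [s for s in sources if s.p_shred < threshold]
    shreds = [s for s in sources if s.p_shred >= threshold]
    return clean, shreds


def contamination_fraction(sources, threshold=0.5):
    """Fraction of catalogued sources flagged as shreds (0.0 for an empty catalog)."""
    if not sources:
        return 0.0
    _, shreds = split_catalog(sources, threshold)
    return len(shreds) / len(sources)


# Toy values for illustration only; they are not taken from the paper.
catalog = [
    Source("a", 0.02),
    Source("b", 0.97),
    Source("c", 0.10),
    Source("d", 0.40),
]

clean, shreds = split_catalog(catalog)
fraction = contamination_fraction(catalog)
```

In practice the threshold would be tuned on a labelled validation set, trading completeness of the cleaned catalog against residual contamination.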