Benchmarking and Analyzing Generative Data for Visual Recognition
Abstract
Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition. This work delves into the impact of generative images, primarily comparing paradigms that harness external data (i.e., generative vs. retrieval vs. original). Our key contributions are: 1) GenBench Construction: We devise GenBench, a broad benchmark comprising 22 datasets with 2548 categories, to appraise generative data across various visual recognition tasks. 2) CLER Score: To address the insufficient correlation of existing metrics (e.g., FID, CLIP score) with downstream recognition performance, we propose CLER, a training-free metric indicating generative data's efficiency for recognition tasks prior to training. 3) New Baselines: Comparisons of generative data with retrieved data from the same external pool help to elucidate the unique traits of generative data. 4) External Knowledge Injection: By fine-tuning special token embeddings for each category via Textual Inversion, performance improves across 17 datasets, except when dealing with low-resolution reference images. Our exhaustive benchmark and analysis spotlight generative data's promise in visual recognition, while identifying key challenges for future investigation.