GarmentZoom: Generating Zoomable Images from Garment Listings
Abstract
Online product listings for garments often include an overview photo and a close-up to show garment details. However, each photo focuses on either field of view or garment detail, forcing users to alternate between views and breaking browsing continuity. We present GarmentZoom, a system that enhances the full-view photo to match the fidelity of its accompanying close-up, enabling seamless zoom-and-pan exploration. Unlike standard reference-based super-resolution, our setting involves close-up references that are spatially unaligned with the full view, and scale factors that vary substantially across garments 3-20×. Prior work typically relies on alignment to transfer details or requires per-instance fine-tuning to memorize them. Instead, we train a single model that supports a continuous range of scales across diverse garments. Our approach synthesizes details without requiring spatial alignment and matches the quality of per-instance methods with a fraction of the training cost.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.