CLIP-based Neural Neighbor Style Transfer for 3D Assets

Abstract

We present a method for transferring the style from a set of images to a 3D object. The texture appearance of an asset is optimized with a differentiable renderer in a pipeline based on losses computed with pretrained deep neural networks. More specifically, we use a nearest-neighbor feature matching loss with CLIP-ResNet50 to extract the style from images. We show that a CLIP-based style loss produces a different appearance than a VGG-based loss, focusing more on texture than on geometric shapes. Additionally, we extend the loss to support multiple style images and enable loss-based control over the color palette, combined with automatic color palette extraction from the style images.
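To make the nearest-neighbor feature matching idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that features are plain lists of floats, that the distance is cosine distance, and that multiple style images are handled by pooling their features into one set; the abstract confirms none of these details. The name `nn_feature_matching_loss` is hypothetical, and a real pipeline would compute this on CLIP-ResNet50 feature maps in a framework with automatic differentiation.

```python
import math


def _normalize(vec, eps=1e-8):
    """Scale a feature vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def nn_feature_matching_loss(rendered_feats, style_feats):
    """Mean cosine distance from each rendered feature to its nearest style feature.

    Assumption: cosine distance and a mean reduction; the abstract does not
    specify either choice.
    """
    rendered = [_normalize(v) for v in rendered_feats]
    style = [_normalize(v) for v in style_feats]
    total = 0.0
    for r in rendered:
        # Nearest neighbor = style feature with the smallest cosine distance.
        total += min(1.0 - sum(a * b for a, b in zip(r, s)) for s in style)
    return total / len(rendered)


# Toy features standing in for network activations (illustrative values only).
style_a = [[1.0, 0.0], [0.0, 1.0]]   # features from a first style image
style_b = [[1.0, 1.0]]               # features from a second style image
rendered = [[2.0, 0.0], [1.0, 1.0]]  # features from a rendered view

loss_single = nn_feature_matching_loss(rendered, style_a)
# Assumed multi-image extension: pool features from all style images.
loss_multi = nn_feature_matching_loss(rendered, style_a + style_b)
print(loss_single, loss_multi)
```

In this toy case, pooling the second style image gives every rendered feature a closer match, so the pooled loss is no larger than the single-image loss.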
