AIpparel: A Multimodal Foundation Model for Digital Garments

publication
CVPR 2025
authors
Kiyohiro Nakayama, Jan Ackermann, Timur Levent Kesdogan, Yang Zheng, Maria Korosteleva, Olga Sorkine-Hornung, Leonidas Guibas, Guandao Yang, Gordon Wetzstein
award
Selected as a Highlight at CVPR 2025

abstract

Apparel is essential to human life, offering protection, mirroring cultural identities, and showcasing personal style. Yet, the creation of garments remains a time-consuming process, largely due to the manual work involved in designing them. To simplify this process, we introduce AIpparel, a multimodal foundation model for generating and editing sewing patterns. Our model fine-tunes state-of-the-art large multimodal models (LMMs) on a custom-curated large-scale dataset of over 120,000 unique garments, each with multimodal annotations including text, images, and sewing patterns. Additionally, we propose a novel tokenization scheme that concisely encodes these complex sewing patterns so that LLMs can learn to predict them efficiently. AIpparel achieves state-of-the-art performance in single-modal tasks, including text-to-garment and image-to-garment prediction, and enables novel multimodal garment generation applications such as interactive garment editing.
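
To make the tokenization idea concrete, the sketch below shows one plausible way a sewing pattern could be flattened into a discrete token sequence that a language model can be trained to predict: panel vertex coordinates are quantized into a fixed vocabulary, and special tokens delimit panels and stitches. This is a minimal illustrative sketch under simplifying assumptions (polygonal panels only, assumed bin counts and token names); it is not AIpparel's actual tokenization scheme.

    # Illustrative sketch only: a toy tokenizer for a simplified sewing pattern.
    # All structure and names here (Panel, bin counts, special tokens) are
    # assumptions for illustration, not AIpparel's actual scheme.
    from dataclasses import dataclass
    from typing import List, Tuple

    # Special tokens delimiting structure in the flat sequence.
    PANEL_START, PANEL_END, STITCH, PATTERN_END = "<panel>", "</panel>", "<stitch>", "<eop>"
    NUM_BINS = 256          # quantization bins per coordinate axis (assumed)

    @dataclass
    class Panel:
        # Each panel is a closed 2D polyline; curved edges are omitted for brevity.
        vertices: List[Tuple[float, float]]

    def quantize(x: float) -> int:
        """Map a coordinate assumed normalized to [-1, 1] onto a bin index in [0, NUM_BINS)."""
        x = max(-1.0, min(1.0, x))
        return min(NUM_BINS - 1, int((x + 1.0) / 2.0 * NUM_BINS))

    def tokenize_pattern(panels: List[Panel],
                         stitches: List[Tuple[int, int, int, int]]) -> List[str]:
        """Flatten a sewing pattern into a token sequence.

        stitches are (panel_a, edge_a, panel_b, edge_b) index tuples.
        """
        tokens: List[str] = []
        for panel in panels:
            tokens.append(PANEL_START)
            for x, y in panel.vertices:
                tokens.extend([f"<x{quantize(x)}>", f"<y{quantize(y)}>"])
            tokens.append(PANEL_END)
        for pa, ea, pb, eb in stitches:
            tokens.extend([STITCH, f"<p{pa}>", f"<e{ea}>", f"<p{pb}>", f"<e{eb}>"])
        tokens.append(PATTERN_END)
        return tokens

    # Example: a single square panel with one edge stitched to another of its edges.
    front = Panel(vertices=[(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)])
    print(tokenize_pattern([front], stitches=[(0, 0, 0, 2)]))

Encoding the pattern this way keeps the sequence short and purely symbolic, which is what allows a pretrained language model to be fine-tuned on garment prediction without architectural changes.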

acknowledgments

The project is supported by Google, an ERC Consolidator Grant No. 101003104 (MYCLOTH), an ARL grant W911NF-21-2-0104, a Vannevar Bush Faculty Fellowship, and LVMH.