Cross-Person Virtual Try-On

Cross-Person Virtual Try-On | ECE285: Deep Generative Models | Prof. Pengtao Xie

- Built a cross-person garment-transfer dataset using the IDM-VITON preprocessing pipeline and DensePose segmentation maps, converting VITON-HD images into cloth-agnostic and garment-conditioned token streams for diffusion-transformer training
- Architected and implemented a 16-block Diffusion Transformer denoiser integrating self-attention on noise and cloth-agnostic tokens, cross-attention on garment tokens, and FiLM-based timestep modulation for high-fidelity virtual try-on
- Optimized the diffusion pipeline with noise-aware EDM parameterization, a 2nd-order ODE solver, and classifier-free guidance (CFG), achieving an FID of 27.7
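
As an illustration of the sampling side of the last bullet, here is a minimal sketch of a 2nd-order (Heun) ODE sampler combined with classifier-free guidance in the EDM style. It is not the project's code: the function names, the noise-schedule defaults, the guidance scale, and the toy denoiser that stands in for the Diffusion Transformer are all assumptions made for this example.

```python
import numpy as np

def karras_sigmas(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Decreasing noise levels from sigma_max to sigma_min, then a final 0.
    The default values are assumptions, not the project's settings."""
    ramp = np.linspace(0.0, 1.0, n_steps)
    hi, lo = sigma_max ** (1.0 / rho), sigma_min ** (1.0 / rho)
    return np.append((hi + ramp * (lo - hi)) ** rho, 0.0)

def cfg_denoise(denoiser, x, sigma, cond, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one."""
    d_uncond = denoiser(x, sigma, None)
    d_cond = denoiser(x, sigma, cond)
    return d_uncond + guidance_scale * (d_cond - d_uncond)

def heun_sample(denoiser, x, sigmas, cond, guidance_scale=2.0):
    """Integrate the probability-flow ODE dx/dsigma = (x - D(x, sigma)) / sigma
    with Heun's 2nd-order method (Euler predictor plus trapezoidal corrector)."""
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        d_cur = (x - cfg_denoise(denoiser, x, s_cur, cond, guidance_scale)) / s_cur
        x_euler = x + (s_next - s_cur) * d_cur
        if s_next > 0:
            # Corrector: average the slopes at both ends of the step.
            d_next = (x_euler - cfg_denoise(denoiser, x_euler, s_next, cond, guidance_scale)) / s_next
            x = x + (s_next - s_cur) * 0.5 * (d_cur + d_next)
        else:
            # The slope is undefined at sigma = 0, so the last step is plain Euler.
            x = x_euler
    return x

def toy_denoiser(x, sigma, cond):
    """Hypothetical stand-in for the trained network: predicts zeros when
    unconditional and the conditioning vector itself when conditional."""
    if cond is None:
        return np.zeros_like(x)
    return np.asarray(cond, dtype=float) * np.ones_like(x)
```

To try it, draw `x` as `sigmas[0]` times standard Gaussian noise and call `heun_sample(toy_denoiser, x, karras_sigmas(18), cond)`. In a real pipeline the denoiser would be the trained network, and `cond` would carry the garment and cloth-agnostic tokens rather than a plain vector.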

[Presentation] [Project Report] [GitHub]