Add model card for Self-NPO (Diffusion-NPO)
This PR adds a comprehensive model card for **Self-NPO**, a data-free negative preference optimization approach, which is part of the broader **Diffusion-NPO** framework.
The update includes:
- A link to the paper: [Self-NPO: Data-Free Diffusion Model Enhancement via Truncated Diffusion Fine-Tuning](https://huggingface.co/papers/2505.11777).
- Relevant metadata: `license` (Apache 2.0), `pipeline_tag` (text-to-image), and `library_name` (diffusers).
- The full abstract of the paper.
- An overview and key features of the model, extracted from the GitHub README.
- A detailed sample usage section for inference, with code snippets directly from the GitHub repository.
- Example outputs, Model Zoo, Citation, and License information from the GitHub README.
This model card significantly improves the discoverability and usability of the model on the Hugging Face Hub.
---
license: apache-2.0
pipeline_tag: text-to-image
library_name: diffusers
---

# Self-NPO: Data-Free Diffusion Model Enhancement via Truncated Diffusion Fine-Tuning

This repository contains models and code for **Self-NPO**, a method presented in the paper [Self-NPO: Data-Free Diffusion Model Enhancement via Truncated Diffusion Fine-Tuning](https://huggingface.co/papers/2505.11777). The implementation is part of the broader [Diffusion-NPO GitHub repository](https://github.com/G-U-N/Diffusion-NPO).

## Abstract

Diffusion models have demonstrated remarkable success in various visual generation tasks, including image, video, and 3D content generation. Preference optimization (PO) is a prominent and growing area of research that aims to align these models with human preferences. While existing PO methods primarily concentrate on producing favorable outputs, they often overlook the significance of classifier-free guidance (CFG) in mitigating undesirable results. Diffusion-NPO addresses this gap by introducing negative preference optimization (NPO), training models to generate outputs opposite to human preferences and thereby steering them away from unfavorable outcomes through CFG. However, prior NPO approaches rely on costly and fragile procedures for obtaining explicit preference annotations (e.g., manual pairwise labeling or reward model training), limiting their practicality in domains where such data are scarce or difficult to acquire. In this work, we propose Self-NPO, specifically truncated diffusion fine-tuning, a data-free approach of negative preference optimization by directly learning from the model itself, eliminating the need for manual data labeling or reward model training. This data-free approach is highly efficient (less than 1% training cost of Diffusion-NPO) and achieves comparable performance to Diffusion-NPO in a data-free manner. We demonstrate that Self-NPO integrates seamlessly into widely used diffusion models, including SD1.5, SDXL, and CogVideoX, as well as models already optimized for human preferences, consistently enhancing both their generation quality and alignment with human preferences.

<div align="center">

# Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models

By Fu-Yun Wang¹, Yunhao Shui², Jingtan Piao¹, Keqiang Sun¹, Hongsheng Li¹
<br>
¹CUHK-MMLab ²Shanghai Jiao Tong University

[](https://arxiv.org/abs/XXXX.XXXXX) [](https://github.com/G-U-N/Diffusion-NPO/blob/main/LICENSE)

</div>

## Overview

This repository contains the official implementation of **Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models**.

Diffusion-NPO introduces **Negative Preference Optimization (NPO)**, a plug-and-play approach for aligning diffusion models more closely with human preferences. By training a model to understand and avoid undesirable outputs, NPO improves the effectiveness of classifier-free guidance (CFG), leading to better image and video generation quality.

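To make the role of NPO in guidance concrete, here is a minimal illustrative sketch of a single CFG step in which the negative (unconditional) prediction comes from the NPO-adapted weights. This is not code from this repository; `unet` and `unet_npo` are assumed placeholders for the original and NPO-adapted denoisers.

```python
# Illustrative sketch only; the repository's scripts implement the actual logic.
# `unet` is the original (or preference-optimized) denoiser; `unet_npo` is the
# NPO-adapted copy used for the negative branch of classifier-free guidance.
def cfg_step_with_npo(unet, unet_npo, latents, t, cond_embeds, uncond_embeds,
                      guidance_scale=5.0):
    noise_cond = unet(latents, t, encoder_hidden_states=cond_embeds).sample
    noise_uncond = unet_npo(latents, t, encoder_hidden_states=uncond_embeds).sample
    # Guidance moves the sample toward the conditional prediction and away
    # from what the negative-preference model would generate.
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)
```

With standard CFG both branches use the same weights; NPO only changes which weights produce the negative prediction.
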
## Key Features

* **Enhanced Preference Alignment**: Improves high-frequency details, color, lighting, and low-frequency structures in generated images and videos.
* **Plug-and-Play**: Seamlessly integrates with models like Stable Diffusion (SD1.5, SDXL), VideoCrafter2, and their preference-optimized variants (DreamShaper, Juggernaut).
* **No New Data or Strategies Required**: Adapts existing preference optimization methods (e.g., DPO, RL, differentiable reward) with minimal modifications.
* **Comprehensive Validation**: Demonstrated effectiveness across text-to-image and text-to-video tasks using metrics like PickScore, HPSv2, ImageReward, and LAION-Aesthetic.

## Usage: Inference

To run inference with SDXL using a Self-NPO optimized model, use the commands below, adapted from the [Diffusion-NPO GitHub repository](https://github.com/G-U-N/Diffusion-NPO).

```bash
python gen_xl.py --generation_path="results/sdxl_cfg5/origin/" --merge_weight=0.0 --cfg=5

python gen_xl.py --generation_path="results/sdxl_cfg5/origin+npo/" --npo_lora_path="weights/sdxl/sdxl_beta2k_2kiter.safetensors" --merge_weight=0.0 --cfg=5

python gen_xl.py --generation_path="results/sdxl_cfg5/dpo/" --merge_weight=0.0 --cfg=5

python gen_xl.py --generation_path="results/sdxl_cfg5/dpo+npo/" --npo_lora_path="weights/sdxl/sdxl_beta2k_2kiter.safetensors" --merge_weight=0.0 --cfg=5
```

**Key arguments for `gen_xl.py`:**

* `--generation_path`: Output directory for the generated images; in the original script the path also implicitly identifies the positive model being used.
* `--cfg`: Classifier-free guidance (CFG) scale.
* `--npo_lora_path`: Path to the NPO LoRA weights (e.g., `weights/sdxl/sdxl_beta2k_2kiter.safetensors`); a snippet for fetching them from the Hub follows this list.
* `--merge_weight`: The beta parameter for merging the NPO weights, as discussed in the paper. For evaluating the NPO effect purely through CFG, it is set to `0.0` in the commands above.

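If the NPO LoRA files are hosted in the Model Zoo repository linked below, one way to fetch them locally (so the resulting path can be passed to `--npo_lora_path`) is via `huggingface_hub`. The repo id and file path here are assumptions based on the Model Zoo links and the commands above; adjust them to the actual repository layout.

```python
# Sketch: download the (assumed) SDXL NPO LoRA from the Hub before running gen_xl.py.
from huggingface_hub import hf_hub_download

npo_lora_path = hf_hub_download(
    repo_id="wangfuyun/Diffusion-NPO",                       # assumed repo id (see Model Zoo)
    filename="weights/sdxl/sdxl_beta2k_2kiter.safetensors",  # assumed file path within the repo
)
print(npo_lora_path)  # pass this path to gen_xl.py via --npo_lora_path
```
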
## Example Outputs

Below are example comparisons of generations with and without NPO, as presented in the GitHub repository:

| Prompt | w/o NPO | w/ NPO |
|--------|---------|--------|
| "an attractive young woman rolling her eyes" |  |  |
| "Black old man with white hair" |  |  |

_Note: The images above link to the same `preface_teaser` image from the GitHub README as placeholders, since the specific asset URLs for the examples were not provided._

## Model Zoo

Pre-trained NPO weight offsets are available for the following models:
- Stable Diffusion v1-5: [Download](https://huggingface.co/wangfuyun/Diffusion-NPO/tree/main/weights)
- Stable Diffusion XL: [Download](https://huggingface.co/wangfuyun/Diffusion-NPO/tree/main/weights)

Base model weights can be obtained from:
- [Stable Diffusion v1-5](https://huggingface.co/stabilityai/stable-diffusion-v1-5)
- [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
- [DreamShaper](https://huggingface.co/Lykon/DreamShaper)
- [VideoCrafter2](https://huggingface.co/VideoCrafter/VideoCrafter2)

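For reference, a base model from the list above can be loaded with standard `diffusers` usage. This is generic SDXL loading, independent of this repository; it does not by itself apply the NPO procedure.

```python
# Standard diffusers loading of the SDXL base model listed above (no NPO applied).
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = pipe("a photo of an astronaut riding a horse on mars", guidance_scale=5.0).images[0]
image.save("sdxl_base.png")
```
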
## Citation

If you find this work useful, please cite our paper:

```bibtex
@inproceedings{
  wang2025diffusionnpo,
  title={Diffusion-{NPO}: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models},
  author={Fu-Yun Wang and Yunhao Shui and Jingtan Piao and Keqiang Sun and Hongsheng Li},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=iJi7nz5Cxc}
}
```

## License

This project is licensed under the Apache License, Version 2.0. See the [LICENSE](https://github.com/G-U-N/Diffusion-NPO/blob/main/LICENSE) file for details.