For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11
Feel free to request other models for compression as well (for the diffusers library, ComfyUI, or anything else), although models built on architectures I am unfamiliar with may be more difficult.
How to Use
diffusers
import torch
from diffusers import ZImagePipeline, ZImageTransformer2DModel
from dfloat11 import DFloat11Model
try:
    from transformers.initialization import no_init_weights  # for transformers>=5.0.0
except ImportError:
    from transformers.modeling_utils import no_init_weights  # for transformers<5.0.0
pattern_dict = {
    r"noise_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"context_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
    ),
    r"layers\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"cap_embedder": (
        "1",
    )
}
text_encoder = DFloat11Model.from_pretrained("DFloat11/Qwen3-4B-DF11", device="cpu")
with no_init_weights():
    transformer = ZImageTransformer2DModel.from_config(
        ZImageTransformer2DModel.load_config(
            "Tongyi-MAI/Z-Image-Turbo", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16
    ).to(torch.bfloat16)
# Make sure to download the file first, and edit the filepath accordingly
DFloat11Model.from_single_file(
    r".\RedZFUN-v6-ZIB-Distilled-AGILE-8steps-BF16-ComfyUI-DF11.safetensors",
    device='cpu',
    bfloat16_model=transformer,
    pattern_dict=pattern_dict
)
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    text_encoder=text_encoder,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")
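The snippet above stops once the pipeline is on the GPU. As a minimal sketch of the remaining step, the hypothetical helper below (`generate` is not part of diffusers or dfloat11) calls the pipeline once and returns the first image. The default of 8 steps is only an assumption read off the "8steps" part of the checkpoint filename, so adjust it if the checkpoint's author recommends something else.

```python
def generate(pipe, prompt, steps=8, **call_kwargs):
    """Run a text-to-image pipeline once and return the first image.

    `steps=8` is an assumption taken from "8steps" in the checkpoint
    filename; it is not a documented recommendation.
    """
    result = pipe(prompt=prompt, num_inference_steps=steps, **call_kwargs)
    return result.images[0]


# Usage with the `pipe` built above (needs a CUDA GPU and the downloaded file):
# image = generate(pipe, "Astronaut in a jungle, cold color palette, muted colors")
# image.save("output.png")
```

Any extra keyword arguments (for example `height`, `width`, or `generator`) are passed straight through to the pipeline call.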
ComfyUI
Refer to this model instead.
Compression details
This is the pattern_dict for compression:
pattern_dict = {
    r"noise_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"context_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
    ),
    r"layers\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"cap_embedder": (
        "1",
    )
}
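To make the pattern_dict easier to read: each key is a regular expression for a block name, and each tuple lists the sub-layers inside a matching block that are compressed. The sketch below shows which full layer names a trimmed copy of the dictionary would select. It assumes each pattern has to match a whole module name (`re.fullmatch`), which is my reading of how DFloat11 applies the patterns and not something stated here. The helper `compressed_layer_names` and the sample module names are hypothetical.

```python
import re

# Trimmed copy of the pattern_dict above, enough to show the matching.
pattern_dict = {
    r"layers\.\d+": ("attention.to_q", "feed_forward.w1", "adaLN_modulation.0"),
    r"cap_embedder": ("1",),
}


def compressed_layer_names(module_names, pattern_dict):
    """Return the full names of the layers that the patterns select.

    Assumption: a pattern must match the entire module name.
    """
    selected = []
    for name in module_names:
        for pattern, sub_paths in pattern_dict.items():
            if re.fullmatch(pattern, name):
                selected.extend(f"{name}.{sub}" for sub in sub_paths)
    return selected


# Hypothetical module names, in the style of model.named_modules().
names = ["layers.0", "layers.0.attention", "cap_embedder", "final_layer"]
print(compressed_layer_names(names, pattern_dict))
```

Under that assumption, "layers.0" and "cap_embedder" are selected, while "layers.0.attention" and "final_layer" are not, because neither matches a pattern in full.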