What is it?

#1 opened by qpqpqpqpqpqp

Pruned Chroma.1?

Is chroma based on Z-image possible?

"Is chroma based on Z-image possible?" Oops, it seems it's another model. Also, do you remember the Lumina2 "Chroma"? lodestone(s) stopped updating it.

Z-image?

No? It says:
RuntimeError: Error(s) in loading state_dict for NextDiT:
size mismatch for x_embedder.weight: copying a param with shape torch.Size([3840, 128]) from checkpoint, the shape in current model is torch.Size([3840, 64]).
(I tried to load it in fp8.)
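
For what it's worth, that mismatch is exactly what the VAE swap would produce: NextDiT's x_embedder is a linear patch embedding whose input width is latent_channels × patch_size², so 64 → 128 is consistent with the latent channel count doubling (e.g. 16 → 32 with a 2×2 patch; those counts are inferred from the error, not confirmed configs). A minimal PyTorch sketch under that assumption:

```python
# Sketch of the x_embedder mismatch. Channel counts (16 vs 32) are
# assumptions inferred from the error, not confirmed model configs.
import torch.nn as nn

patch_size = 2
hidden_dim = 3840

old_embedder = nn.Linear(16 * patch_size**2, hidden_dim)  # weight: [3840, 64]
new_embedder = nn.Linear(32 * patch_size**2, hidden_dim)  # weight: [3840, 128]

# Loading the new checkpoint's weights into the stock layer reproduces
# the reported RuntimeError (size mismatch [3840, 128] vs [3840, 64]).
try:
    old_embedder.load_state_dict(new_embedder.state_dict())
except RuntimeError as e:
    print(e)
```

So fp8 isn't the problem; the checkpoint simply isn't the stock Z-Image arch.
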
Newbie 0.1?

It's a modified Z-Image with the Flux 2 VAE, some slight arch changes, and a custom loss.
This model is not ready yet; it literally just started training yesterday.

Am I dreaming? A new year, a new surprise?

I have a question: why not use the Z-Image-De-Turbo model as the base model?

This is exciting, but @lodestones, will LoRAs made for Chroma1-HD still work on this, or will I have to retrain all of my LoRAs? I'm pretty sure my Chroma LoRAs don't work with Z-Image.

"I'm pretty sure my chroma loras don't work with z-image" Yes, don't work, of course

@lodestones, by the way, why did you decide not to wait for the base version of Z-Image? It seemed to me that the distilled version is less flexible for fine-tuning.

@Yndear this is not fine-tuning, this is closer to pretraining, and instead of a random init I'm using Z-Image as the "initial seed".

The arch is different: it uses a DeCo head and the Flux 2 VAE. The loss is different too: it's an fm-x0 loss instead of fm-velocity.

This arch + loss combo has the huge benefit of ridiculously fast convergence, but even with that it's still costly to pretrain one from scratch.
It's better to have some residual knowledge as the initial seed than to start from a blank slate.
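
For context on the loss terminology: both are flow-matching objectives over the same linear data-to-noise path and differ only in what the network regresses. A hedged sketch (the interpolation convention and the absence of loss weighting are assumptions, not lodestones' actual training code):

```python
# Contrast of fm-velocity vs fm-x0 under a linear interpolation path.
# A real trainer picks one parameterization; both are shown for comparison.
import torch
import torch.nn.functional as F

def fm_losses(model, x0, t):
    """x0: clean latents [B, ...]; t: timesteps in (0, 1), shape [B]."""
    noise = torch.randn_like(x0)
    t_ = t.view(-1, *([1] * (x0.dim() - 1)))
    xt = (1 - t_) * x0 + t_ * noise  # t=0 is data, t=1 is noise

    pred = model(xt, t)

    # fm-velocity: regress the path's constant velocity, v = noise - x0.
    loss_velocity = F.mse_loss(pred, noise - x0)

    # fm-x0: regress the clean sample directly; at sampling time the
    # velocity is recovered from the prediction, v = (xt - pred) / t.
    loss_x0 = F.mse_loss(pred, x0)

    return loss_velocity, loss_x0
```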

Is the text encoder still Qwen3-4B, or will it be replaced with a larger one?

Still Qwen.
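
For reference, conditioning on Qwen3-4B typically means feeding the DiT the LLM's per-token hidden states rather than a pooled vector. A minimal sketch with transformers (the layer choice and the lack of a prompt template are assumptions, not this repo's exact pipeline):

```python
# Extract per-token text embeddings from Qwen3-4B for conditioning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")
enc = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B", torch_dtype=torch.bfloat16, output_hidden_states=True
)

with torch.no_grad():
    ids = tok("a photo of a red fox in the snow", return_tensors="pt")
    out = enc(**ids)
    # Last-layer hidden states, [1, seq_len, hidden_dim]; these feed
    # the DiT's text-conditioning path.
    text_embeds = out.hidden_states[-1]
```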

Is there a ComfyUI workflow that works with this new model? Also, the same question for the WIP Radiance model: its ComfyUI workflow is very old, and I'm wondering if there's a newer one that works better.
