[bug]: SDXL based V-pred models are treated as epsilon prediction - result noisy images #7495

Open
1 task done
bWm-nubby opened this issue Dec 24, 2024 · 0 comments · May be fixed by #7504
Labels
bug Something isn't working

Is there an existing issue for this problem?

  • I have searched the existing issues

Operating system

Windows

GPU vendor

Nvidia (CUDA)

GPU model

RTX 3080 12GB

GPU VRAM

12GB

Version number

5.5

Browser

Invoke Community Edition v5.5 launcher / MS Edge 131.0.2903.99 (Official build) (64-bit)

Python dependencies

accelerate==1.0.1

compel==2.0.2

cuda==12.4

diffusers==0.31.0

numpy==1.26.3

opencv==4.9.0.80

onnx==1.16.1

pillow==10.2.0

python==3.11.11

torch==2.4.1+cu124

torchvision==0.19.1+cu124

transformers==4.46.3

xformers==Not Installed

What happened

SDXL-based v_prediction models are automatically assigned the epsilon prediction type, even when the vpred and zsnr state_dict keys are present in the model. Manually changing the prediction type to v_prediction is not respected either. As a result, these models produce unusably noisy outputs.
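
To illustrate why a mislabeled prediction type gives noisy results, here is a minimal scalar sketch. It assumes the standard variance-preserving parameterization (`x_t = alpha * x0 + sigma * eps`, `v = alpha * eps - sigma * x0`, with `alpha**2 + sigma**2 == 1`). The function names and toy values are hypothetical and are not taken from Invoke's code.

```python
def x0_from_v(x_t, v, alpha, sigma):
    """Recover the clean sample when the model output is a v-prediction."""
    return alpha * x_t - sigma * v


def x0_from_eps(x_t, eps, alpha, sigma):
    """Recover the clean sample when the model output is an epsilon prediction."""
    return (x_t - sigma * eps) / alpha


# Toy values (assumptions for illustration), chosen so alpha^2 + sigma^2 = 1.
alpha, sigma = 0.6, 0.8
x0, eps = 1.0, -0.5

x_t = alpha * x0 + sigma * eps   # noised sample seen by the model
v = alpha * eps - sigma * x0     # output of an ideal v-prediction model

correct = x0_from_v(x_t, v, alpha, sigma)   # output interpreted as v
wrong = x0_from_eps(x_t, v, alpha, sigma)   # same output misread as epsilon

print(f"true x0={x0}, read as v={correct:.3f}, read as epsilon={wrong:.3f}")
```

Reading the same model output with the epsilon formula no longer recovers `x0`, which is consistent with the degraded images described here.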

What you expected to happen

v_prediction-based models should have the correct prediction type detected from the state_dict keys within the model metadata. If detection fails, because of missing keys or for any other reason, the prediction type the user selects manually under model settings should be respected, resulting in normal-quality outputs.
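
The expected behavior could be sketched roughly as follows. This is not Invoke's implementation: the marker key names `v_pred` and `ztsnr`, the function `resolve_prediction_type`, and its return shape are all assumptions made for illustration. The actual key names depend on how the checkpoint was saved.

```python
from typing import Iterable, Optional, Tuple

# Assumed marker key names; verify against the actual checkpoint before relying on them.
V_PRED_KEY = "v_pred"
ZSNR_KEY = "ztsnr"

VALID_TYPES = ("epsilon", "v_prediction")


def resolve_prediction_type(
    state_dict_keys: Iterable[str],
    user_override: Optional[str] = None,
) -> Tuple[str, bool]:
    """Return (prediction_type, zsnr_enabled) for a checkpoint.

    A manual setting always wins; otherwise the marker keys decide;
    otherwise fall back to epsilon prediction.
    """
    keys = set(state_dict_keys)
    zsnr = ZSNR_KEY in keys

    if user_override is not None:
        if user_override not in VALID_TYPES:
            raise ValueError(f"unknown prediction type: {user_override!r}")
        return user_override, zsnr

    if V_PRED_KEY in keys:
        return "v_prediction", zsnr

    return "epsilon", zsnr


# Usage with hypothetical key sets:
detected = resolve_prediction_type(["v_pred", "ztsnr", "model.diffusion_model.x"])
manual = resolve_prediction_type(["model.diffusion_model.x"], user_override="v_prediction")
fallback = resolve_prediction_type(["model.diffusion_model.x"])
```

Giving the manual setting precedence means a model with missing or unusual marker keys can still be corrected by the user.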

How to reproduce the problem

  1. Download the V-Pred-1.0-Version of this model: noobai-xl-nai-xl
  2. Add the model through Invoke-AI's model management UI
  3. Model will be detected as an epsilon prediction model
  4. Change prediction type manually to v_prediction
  5. Generate image with the downloaded model
  6. The output will be extremely noisy and low quality

Additional context

I also tried converting the model to Diffusers format, both before and after manually setting the prediction type, with no change in results. Additionally, an option to enable zsnr does not seem to exist in Invoke-AI, though that appears to be a missing feature rather than a bug.
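
For context on zsnr: the term usually refers to rescaling the noise schedule so that the final timestep has zero signal-to-noise ratio. Below is a minimal pure-Python sketch of that rescaling, following the commonly published procedure (shift and scale the square root of the cumulative alphas so the last value becomes zero). The function name and the example schedule values are assumptions for illustration and do not describe Invoke's code.

```python
import math
from typing import List


def rescale_zero_terminal_snr(betas: List[float]) -> List[float]:
    """Rescale a beta schedule so the last step has zero terminal SNR."""
    # sqrt of the cumulative product of alphas = (1 - beta)
    alphas_bar_sqrt = []
    running = 1.0
    for b in betas:
        running *= 1.0 - b
        alphas_bar_sqrt.append(math.sqrt(running))

    first, last = alphas_bar_sqrt[0], alphas_bar_sqrt[-1]

    # Shift so the last value is zero, then scale so the first value is unchanged.
    scaled = [(a - last) * first / (first - last) for a in alphas_bar_sqrt]

    # Convert back from cumulative values to per-step betas.
    alphas_bar = [a * a for a in scaled]
    alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - a for a in alphas]


# Example: a scaled-linear schedule with assumed start/end values and 1000 steps.
n = 1000
start, end = math.sqrt(0.00085), math.sqrt(0.012)
betas = [(start + (end - start) * i / (n - 1)) ** 2 for i in range(n)]
new_betas = rescale_zero_terminal_snr(betas)
```

After rescaling, the last beta is 1, so the final timestep carries no signal. This is why zsnr is normally paired with v_prediction: epsilon prediction cannot recover the clean sample when no signal is left.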

Discord username

bwm_nubby

bWm-nubby added the bug label on Dec 24, 2024