Description
Git commit
Operating System & Version
Windows 10 22H2
GGML backends
CUDA
Command-line arguments used
.\sd-cli.exe --diffusion-model "W:\Z-Image-Turbo\z_image_turbo-Q6_K.gguf" --vae "W:\Z-Image-Turbo\ae.safetensors" --llm "W:\Z-Image-Turbo\Qwen3-4B-Instruct-2507-Q6_K.gguf" -H 1280 -W 960 --cfg-scale 1.0 --steps 10 --diffusion-fa --offload-to-cpu -p "fantasy forest" -o "./o2.png"
Steps to reproduce
[DEBUG] main.cpp:516 - System Info:
SSE3 = 1 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | VSX = 0 |
What you expected to happen
Picture generation with CPU model offloading
What actually happened
The program crashed with exception code 0xc000001d (STATUS_ILLEGAL_INSTRUCTION, i.e. 'Illegal Instruction').
Logs / error messages / stack trace
[DEBUG] stable-diffusion.cpp:173 - Using CUDA backend
[INFO ] ggml_extend.hpp:78 - ggml_cuda_init: found 1 CUDA devices:
[INFO ] ggml_extend.hpp:78 - Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
[INFO ] stable-diffusion.cpp:267 - loading diffusion model from 'W:\Z-Image-Turbo\z_image_turbo-Q6_K.gguf'
[INFO ] model.cpp:366 - load W:\Z-Image-Turbo\z_image_turbo-Q6_K.gguf using gguf format
[DEBUG] model.cpp:412 - init from 'W:\Z-Image-Turbo\z_image_turbo-Q6_K.gguf'
[INFO ] stable-diffusion.cpp:314 - loading llm from 'W:\Z-Image-Turbo\Qwen3-4B-Instruct-2507-Q6_K.gguf'
[INFO ] model.cpp:366 - load W:\Z-Image-Turbo\Qwen3-4B-Instruct-2507-Q6_K.gguf using gguf format
[DEBUG] model.cpp:412 - init from 'W:\Z-Image-Turbo\Qwen3-4B-Instruct-2507-Q6_K.gguf'
[INFO ] stable-diffusion.cpp:328 - loading vae from 'W:\Z-Image-Turbo\ae.safetensors'
[INFO ] model.cpp:369 - load W:\Z-Image-Turbo\ae.safetensors using safetensors format
[DEBUG] model.cpp:503 - init from 'W:\Z-Image-Turbo\ae.safetensors', prefix = 'vae.'
[INFO ] stable-diffusion.cpp:345 - Version: Z-Image
[INFO ] stable-diffusion.cpp:373 - Weight type stat: f32: 634 | q6_K: 433 | bf16: 28
[INFO ] stable-diffusion.cpp:374 - Conditioner weight type stat: f32: 145 | q6_K: 253
[INFO ] stable-diffusion.cpp:375 - Diffusion model weight type stat: f32: 245 | q6_K: 180 | bf16: 28
[INFO ] stable-diffusion.cpp:376 - VAE weight type stat: f32: 244
[DEBUG] stable-diffusion.cpp:378 - ggml tensor size = 400 bytes
[DEBUG] llm.hpp:286 - merges size 151387
[DEBUG] llm.hpp:318 - vocab size: 151669
PS W:\sd-master-d6dd6d7-bin-win-cuda12-x64>
Additional context / environment details
The same crash happens with any CPU-related parameter, e.g. --clip-on-cpu.
My CPU is a Xeon E5-2666 v3, which supports AVX and AVX2 but not AVX-512. Note that the System Info output above reports AVX512 = 1.
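To make the suspected mismatch explicit, here is a minimal Python sketch that parses the System Info line from the log and lists every feature reported as enabled that the CPU does not have. It rests on two assumptions that this report does not confirm: that the System Info line reflects the instruction sets the binary was built to use, and that the hand-written `CPU_FEATURES` set (AVX and AVX2 as stated above, plus SSE3, FMA and F16C, which this CPU generation also provides) is accurate. The helper names `parse_system_info` and `unsupported_features` are hypothetical and not part of sd-cli.

```python
def parse_system_info(line: str) -> dict[str, bool]:
    """Turn 'NAME = 0|1 | NAME = 0|1 | ...' into a {name: enabled} mapping."""
    flags = {}
    for field in line.split("|"):
        name, sep, value = field.partition("=")
        if not sep:
            continue  # skip the empty piece after the trailing '|'
        flags[name.strip()] = value.strip() == "1"
    return flags


def unsupported_features(build_flags: dict[str, bool], cpu_features: set[str]) -> list[str]:
    """Features reported as enabled that are missing from the CPU's feature set."""
    return sorted(
        name for name, enabled in build_flags.items()
        if enabled and name not in cpu_features
    )


# System Info line copied from the log in this report.
SYSTEM_INFO = (
    "SSE3 = 1 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | "
    "AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | "
    "FP16_VA = 0 | WASM_SIMD = 0 | VSX = 0 |"
)

# Assumption: feature set of the Xeon E5-2666 v3, written by hand for illustration.
CPU_FEATURES = {"SSE3", "AVX", "AVX2", "FMA", "F16C"}

if __name__ == "__main__":
    print(unsupported_features(parse_system_info(SYSTEM_INFO), CPU_FEATURES))
    # prints ['AVX512']
```

If the first assumption holds, AVX512 is the only reported feature this CPU lacks, which would be consistent with an 'Illegal Instruction' crash as soon as any work runs on the CPU.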