UPSTREAM PR #1357: Inpaint imporvements by loci-dev · Pull Request #90 · auroralabs-loci/stable-diffusion.cpp

loci-dev · 2026-03-21T04:56:08Z

Note

Source pull request: leejet/stable-diffusion.cpp#1357

For all models in inpaint mode: improve mask downsampling to latent size by taking the maximum over each 8x8 patch instead of a single sample from the patch corner.
For inpaint models: use masked diffusion for inpaint models too, to reduce color shift. It was previously disabled because of artifacts near the edges of the mask; this is fixed by inflating the mask by 1 latent pixel for diffusion.
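As a rough illustration of the first change, the max-over-patch downsampling could look like the sketch below. This is not the code from the PR: the helper name `downsample_mask_max`, the row-major `std::vector<float>` layout, and the convention that 1 marks pixels to repaint are all assumptions made for the example. Image dimensions are assumed to be multiples of the downscale factor.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the project's actual function).
// mask: row-major, width * height values in [0, 1]; 1 is assumed to mean "repaint".
// Returns a (width / factor) x (height / factor) mask in which each value is the
// maximum over the corresponding factor x factor patch of the input.
std::vector<float> downsample_mask_max(const std::vector<float>& mask,
                                       int width, int height, int factor = 8) {
    const int lw = width / factor;   // latent width
    const int lh = height / factor;  // latent height
    std::vector<float> out(static_cast<std::size_t>(lw) * lh, 0.0f);
    for (int ly = 0; ly < lh; ++ly) {
        for (int lx = 0; lx < lw; ++lx) {
            float m = 0.0f;
            // Scan the whole patch instead of reading only its corner pixel.
            for (int dy = 0; dy < factor; ++dy) {
                for (int dx = 0; dx < factor; ++dx) {
                    const int x = lx * factor + dx;
                    const int y = ly * factor + dy;
                    m = std::max(m, mask[static_cast<std::size_t>(y) * width + x]);
                }
            }
            out[static_cast<std::size_t>(ly) * lw + lx] = m;
        }
    }
    return out;
}
```

With corner sampling, a masked pixel that does not sit exactly on a patch corner is lost at latent resolution; taking the maximum keeps the latent pixel masked whenever any pixel in its patch is masked.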

Example:

.\build\bin\sd-cli.exe --model ..\ComfyUI\models\checkpoints\sdxl\dreamshaperXL_lightningInpaint.safetensors -p "a dog sitting on a bench" --color --steps 16 --cfg-scale 1 --sampling-method euler_a --preview proj --preview-noisy --img-cfg-scale 1 -i .\bench.png --mask .\bench_mask.png --strength 1

[Images: original | mask]

[Images: master | PR]

[Images: PR | just enabling masked diffusion without inflating the mask]

(The difference is not very noticeable in this specific example, but if you look closely there are some noticeable "floaters" around the masked areas in the image on the right.)
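For readers unfamiliar with the term, masked diffusion as discussed in this description generally means that, after each denoising step, latent values outside the mask are reset to the original image's latent. The sketch below shows that generic blend only; it is not taken from the PR, and the function name `blend_masked`, the channel-major layout, and the mask convention (1 = repaint) are assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating the generic masked-diffusion blend.
// denoised, init_latent: channels * plane values, channel-major (assumed layout).
// mask: plane values in [0, 1] at latent resolution; 1 is assumed to mean "repaint".
void blend_masked(std::vector<float>& denoised,
                  const std::vector<float>& init_latent,
                  const std::vector<float>& mask,
                  int channels) {
    const std::size_t plane = mask.size();
    for (int c = 0; c < channels; ++c) {
        for (std::size_t i = 0; i < plane; ++i) {
            const float m = mask[i];
            const std::size_t idx = static_cast<std::size_t>(c) * plane + i;
            // Keep the denoised value inside the mask, restore the original outside it.
            denoised[idx] = m * denoised[idx] + (1.0f - m) * init_latent[idx];
        }
    }
}
```

Because the blend follows the mask exactly, anything the model changes just outside the mask boundary is overwritten, which is one plausible reason a slightly inflated mask behaves better at the edges.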

loci-review · 2026-03-21T05:59:30Z

Target version:

Target version adds ggml_ext_dup_and_cpy_tensor (7.6% of execution time) for mask processing. Core operations (get_pmid_conditon 45.5%, apply 21.2%, sample 18.8%) remain unchanged, confirming performance impact is isolated to new mask preprocessing functionality.

Additional Findings

GPU/ML Operations: All changes are CPU-side mask preprocessing; GPU inference pipeline (text encoding, denoising, VAE operations) completely unaffected. The 70 µs CPU overhead does not impact GPU utilization or inference performance.

Commits: Two commits implement inpainting quality improvements: mask inflation via 3×3 max-pooling (50974ff) and max-pooling downsampling to prevent single-pixel sampling artifacts (f2fb03b). Performance cost is justified by significant visual quality improvements at mask boundaries.
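The 3×3 max-pooling mentioned for commit 50974ff amounts to dilating the latent-resolution mask by one pixel in every direction. A minimal sketch of that operation follows, assuming a row-major float mask; the name `inflate_mask_3x3` is hypothetical and the code is not taken from the commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: 3x3 max-pooling with stride 1, which grows the
// masked region by one latent pixel. Out-of-bounds neighbours are skipped.
std::vector<float> inflate_mask_3x3(const std::vector<float>& mask,
                                    int width, int height) {
    std::vector<float> out(mask.size(), 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float m = 0.0f;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) {
                        continue;
                    }
                    m = std::max(m, mask[static_cast<std::size_t>(ny) * width + nx]);
                }
            }
            out[static_cast<std::size_t>(y) * width + x] = m;
        }
    }
    return out;
}
```

The result is written to a separate buffer so that already-inflated values are never read back, which would otherwise let the mask grow by more than one pixel.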

🔎 Full breakdown: Loci Inspector
💬 Questions? Tag @loci-dev

stduhpf added 2 commits March 19, 2026 12:11

inpaint: get max pixel max instead of single sample

f2fb03b

inpaint: masked diffusion for inpainting models with inflated mask

50974ff

loci-dev temporarily deployed to stable-diffusion-cpp-prod March 21, 2026 04:56 — with GitHub Actions Inactive

loci-dev wants to merge 2 commits into main from loci/pr-1357-inpaint-imporvements
