SDXL VAE

The decoder shares the KV weights of the encoder's attention block.

WF-VAE operates in the wavelet domain rather than the pixel domain: the input image is first decomposed by a multi-level Haar wavelet transform, and the resulting sub-bands are what the encoder processes.
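To make the wavelet step concrete, here is a minimal NumPy sketch of a multi-level 2D Haar decomposition. It is an illustration only, not the WF-VAE implementation: the function names `haar_dwt2` and `haar_multilevel`, the orthonormal 1/2 scaling, and the choice to recurse only on the low-frequency band are all assumptions, and how the sub-bands are actually fed to the encoder is described in the referenced paper, not here.

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2D Haar transform on an array of shape (..., H, W).

    H and W must be even. Uses orthonormal scaling (assumption), so the
    total energy of the four sub-bands equals the energy of the input.
    """
    a = x[..., 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0       # low-frequency approximation
    diff_cols = (a - b + c - d) / 2.0  # difference between left and right columns
    diff_rows = (a + b - c - d) / 2.0  # difference between top and bottom rows
    diff_diag = (a - b - c + d) / 2.0  # diagonal difference
    return low, (diff_cols, diff_rows, diff_diag)

def haar_multilevel(x, levels):
    """Apply haar_dwt2 repeatedly to the low-frequency band (hypothetical helper)."""
    low = np.asarray(x, dtype=np.float64)
    details = []
    for _ in range(levels):
        low, bands = haar_dwt2(low)
        details.append(bands)  # details[0] is the finest level
    return low, details

# Usage: a two-level decomposition of an 8x8 test image.
image = np.arange(64, dtype=np.float64).reshape(8, 8)
low, details = haar_multilevel(image, levels=2)
```

Each level halves the spatial resolution, so two levels turn the 8x8 input into a 2x2 low-frequency band plus detail bands at 4x4 and 2x2.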

References

  • 2411.17459
  • 2510.22852 (Figure 5)

Format: Safetensors
Model size: 83.1M params
Tensor type: F32