azula.denoise

Denoisers, parametrizations and training objectives.

For a distribution \(p(X)\) over \(\mathbb{R}^D\) and a perturbation kernel

\[p(X_t \mid X) = \mathcal{N}(X_t \mid \alpha_t X, \sigma_t^2 I) \, ,\]

the goal of a denoiser is to predict \(X\) given \(X_t\), that is, to infer the posterior distribution

\[p(X \mid X_t) = \frac{p(X) \, p(X_t \mid X)}{p(X_t)} \, .\]

A denoiser is therefore an approximation \(q_\phi(X \mid X_t)\) of the posterior \(p(X \mid X_t)\), and its objective is to minimize the expected Kullback-Leibler (KL) divergence over noisy samples \(x_t\)

\[\begin{split}& \arg \min_\phi \mathbb{E}_{x_t \,\sim\, p(X_t)} \big[ \mathrm{KL}( p(X \mid x_t) \parallel q_\phi(X \mid x_t) ) \big] \\ = \, & \arg \min_\phi \mathbb{E}_{x, x_t \,\sim\, p(X, X_t)} \big[ -\log q_\phi(x \mid x_t) \big] \, .\end{split}\]
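
The equality holds because the KL divergence splits into a cross-entropy term and an entropy term that does not depend on \(\phi\). As a short derivation in the same notation:

\[\begin{split}\mathbb{E}_{x_t \,\sim\, p(X_t)} \big[ \mathrm{KL}( p(X \mid x_t) \parallel q_\phi(X \mid x_t) ) \big] & = \mathbb{E}_{x_t \,\sim\, p(X_t)} \, \mathbb{E}_{x \,\sim\, p(X \mid x_t)} \big[ \log p(x \mid x_t) - \log q_\phi(x \mid x_t) \big] \\ & = \mathbb{E}_{x, x_t \,\sim\, p(X, X_t)} \big[ -\log q_\phi(x \mid x_t) \big] + \mathbb{E}_{x, x_t \,\sim\, p(X, X_t)} \big[ \log p(x \mid x_t) \big] \, .\end{split}\]

The last expectation is constant with respect to \(\phi\), so both objectives share the same minimizers. In practice this means the denoiser can be trained from pairs \((x, x_t)\) alone, without access to the true posterior.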

Classes

  • Gaussian – Creates a Gaussian distribution \(\mathcal{N}(\mu, \sigma^2)\).

  • GaussianDenoiser – Abstract Gaussian denoiser module.

  • PreconditionedDenoiser – Creates a Gaussian denoiser with EDM-style preconditioning.

Descriptions

class azula.denoise.Gaussian(mean, var)

Creates a Gaussian distribution \(\mathcal{N}(\mu, \sigma^2)\).

Parameters:
  • mean (Tensor) – The mean \(\mu\), with shape \((*)\).

  • var (Tensor) – The variance \(\sigma^2\), with shape \((*)\).

log_prob(x)
Parameters:
  • x (Tensor) – A tensor \(x\), with shape \((*)\).

Returns:

The log-density \(\log \mathcal{N}(x \mid \mu, \sigma^2)\), with shape \((*)\).

Return type:

Tensor
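
As a sanity check on the formula behind log_prob, the log-density can be evaluated by hand for scalars. This is a minimal sketch using only the Python standard library. It does not call azula, and `gaussian_log_prob` is a hypothetical helper introduced for illustration. It assumes the density is evaluated elementwise, as the matching input and output shapes suggest.

```python
import math


def gaussian_log_prob(x: float, mean: float, var: float) -> float:
    """Log-density of N(x | mean, var) for scalars (hypothetical helper)."""
    # log N(x | mu, s^2) = -((x - mu)^2 / s^2 + log(2 * pi * s^2)) / 2
    return -0.5 * ((x - mean) ** 2 / var + math.log(2.0 * math.pi * var))


# At the mean with unit variance, the log-density is -0.5 * log(2 * pi).
log_p = gaussian_log_prob(0.0, mean=0.0, var=1.0)
```

Note that the second argument is the variance \(\sigma^2\), not the standard deviation, which matches the `var` parameter documented above.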

class azula.denoise.GaussianDenoiser(*args, **kwargs)

Abstract Gaussian denoiser module.

\[q_\phi(X \mid X_t) = \mathcal{N}(X \mid \mu_\phi(X_t), \Sigma_\phi(X_t))\]

The optimal Gaussian denoiser estimates the mean \(\mathbb{E}[X \mid X_t]\) and covariance \(\mathbb{V}[X \mid X_t]\) of the posterior \(p(X \mid X_t)\).

abstract forward(x_t, t, **kwargs)
Parameters:
  • x_t (Tensor) – A noisy tensor \(x_t\), with shape \((*, S)\).

  • t (Tensor) – The time \(t\), with shape \((*)\).

  • kwargs – Optional keyword arguments.

Returns:

The Gaussian \(\mathcal{N}(X \mid \mu_\phi(x_t), \Sigma_\phi(x_t))\).

Return type:

Gaussian

class azula.denoise.PreconditionedDenoiser(backbone, schedule, var_x=1.0)

Creates a Gaussian denoiser with EDM-style preconditioning.

\[\begin{split}\mu_\phi(x_t) & = c_\mathrm{skip}(t) \, x_t + c_\mathrm{out}(t) \, b_\phi(c_\mathrm{in}(t) \, x_t, c_\mathrm{noise}(t)) \\ \sigma^2_\phi(x_t) & = \frac{\sigma_x^2 \, \sigma_t^2}{\sigma_x^2 \, \alpha_t^2 + \sigma_t^2}\end{split}\]

The EDM preconditioning coefficients are generalized to take the signal scale \(\alpha_t\) into account:

\[\begin{split}c_\mathrm{in}(t) & = \frac{1}{\sqrt{\alpha_t^2 + \sigma_t^2}} \\ c_\mathrm{out}(t) & = \frac{\sigma_t}{\sqrt{\alpha_t^2 + \sigma_t^2}} \\ c_\mathrm{skip}(t) & = \frac{\alpha_t}{\alpha_t^2 + \sigma_t^2} \\ c_\mathrm{noise}(t) & = \log \frac{\sigma_t}{\alpha_t}\end{split}\]
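
To make the coefficients concrete, the following sketch evaluates them for scalar \(\alpha_t\) and \(\sigma_t\), together with the variance \(\sigma^2_\phi(x_t)\) given above. It uses only the Python standard library. The function names `preconditioning` and `posterior_var` are hypothetical and are not part of azula.

```python
import math


def preconditioning(alpha_t: float, sigma_t: float) -> dict:
    """Evaluate c_in, c_out, c_skip and c_noise for scalar alpha_t, sigma_t."""
    total = alpha_t**2 + sigma_t**2
    return {
        "c_in": 1.0 / math.sqrt(total),
        "c_out": sigma_t / math.sqrt(total),
        "c_skip": alpha_t / total,
        "c_noise": math.log(sigma_t / alpha_t),
    }


def posterior_var(alpha_t: float, sigma_t: float, var_x: float = 1.0) -> float:
    """Evaluate sigma_x^2 sigma_t^2 / (sigma_x^2 alpha_t^2 + sigma_t^2)."""
    return var_x * sigma_t**2 / (var_x * alpha_t**2 + sigma_t**2)


# Equal signal scale and noise level: the skip connection keeps half of x_t.
coeffs = preconditioning(alpha_t=1.0, sigma_t=1.0)
var = posterior_var(alpha_t=1.0, sigma_t=1.0)
```

With \(\alpha_t = 1\), the expressions for \(c_\mathrm{in}\), \(c_\mathrm{out}\) and \(c_\mathrm{skip}\) reduce to the EDM coefficients for unit data variance. With \(\sigma_x^2 = 1\), the variance equals \(c_\mathrm{out}(t)^2\).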

References

Elucidating the Design Space of Diffusion-Based Generative Models (Karras et al., 2022)

Parameters:
  • backbone (Module) – A noise-conditional network \(b_\phi(x_t, \sigma_t)\).

  • schedule (Schedule) – A noise schedule.

  • var_x (Tensor) – The signal variance \(\sigma_x^2\).

loss(x, t, **kwargs)
Parameters:
  • x (Tensor) – A clean tensor \(x\), with shape \((*, S)\).

  • t (Tensor) – The time \(t\), with shape \((*)\).

  • kwargs – Optional keyword arguments.

Returns:

The weighted loss

\[\frac{1}{\sigma^2_\phi(x_t)} \| \mu_\phi(x_t) - x \|^2\]

where \(x_t \sim p(X_t \mid x)\), with shape \((*, S)\).

Return type:

Tensor
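
The steps behind the weighted loss can be traced with a scalar sketch: sample \(x_t\) from the perturbation kernel, form \(\mu_\phi(x_t)\) with the preconditioning coefficients, and divide the squared error by \(\sigma^2_\phi(x_t)\). The code below uses only the Python standard library and is not azula's implementation. `weighted_loss`, its `eps` argument and the zero-output `backbone` are assumptions made for illustration.

```python
import math
import random


def weighted_loss(x, alpha_t, sigma_t, backbone, var_x=1.0, eps=None):
    """Scalar sketch of the weighted loss; `backbone` is a hypothetical callable."""
    if eps is None:
        eps = random.gauss(0.0, 1.0)
    # Sample x_t ~ N(alpha_t * x, sigma_t^2) from the perturbation kernel.
    x_t = alpha_t * x + sigma_t * eps

    # Preconditioning coefficients, as defined for PreconditionedDenoiser.
    total = alpha_t**2 + sigma_t**2
    c_in = 1.0 / math.sqrt(total)
    c_out = sigma_t / math.sqrt(total)
    c_skip = alpha_t / total
    c_noise = math.log(sigma_t / alpha_t)

    # Denoiser mean and variance, then the variance-weighted squared error.
    mean = c_skip * x_t + c_out * backbone(c_in * x_t, c_noise)
    var = var_x * sigma_t**2 / (var_x * alpha_t**2 + sigma_t**2)
    return (mean - x) ** 2 / var


# Placeholder backbone that always outputs zero; noise fixed for reproducibility.
loss = weighted_loss(2.0, alpha_t=1.0, sigma_t=1.0, backbone=lambda z, c: 0.0, eps=0.0)
```

Up to a term that does not depend on \(\mu_\phi\), this weighted error is twice the negative log-likelihood of \(x\) under the Gaussian \(\mathcal{N}(\mu_\phi(x_t), \sigma^2_\phi(x_t))\).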