azula.denoise

Denoisers, parametrizations and training objectives.

For a distribution \(p(X)\) over \(\mathbb{R}^D\) and a perturbation kernel

\[p(X_t \mid X) = \mathcal{N}(X_t \mid \alpha_t X, \sigma_t^2 I) \, ,\]

the goal of a denoiser is to predict \(X\) given \(X_t\), that is, to infer the posterior distribution

\[p(X \mid X_t) = \frac{p(X) \, p(X_t \mid X)}{p(X_t)} \, .\]

A denoiser is therefore an approximation \(q_\phi(X \mid X_t)\) of the posterior \(p(X \mid X_t)\), and its objective is to minimize the expected Kullback-Leibler (KL) divergence, or equivalently the expected negative log-likelihood,

\[\begin{split}& \arg \min_\phi \mathbb{E}_{x_t \,\sim\, p(X_t)} \big[ \mathrm{KL}( p(X \mid x_t) \parallel q_\phi(X \mid x_t) ) \big] \\ = \, & \arg \min_\phi \mathbb{E}_{x, x_t \,\sim\, p(X, X_t)} \big[ -\log q_\phi(x \mid x_t) \big] \, .\end{split}\]
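The equality above follows from expanding the KL divergence: the term that involves only the true posterior does not depend on \(\phi\) and drops out of the \(\arg \min\). As a short worked step,

\[\begin{split}\mathbb{E}_{x_t \,\sim\, p(X_t)} \big[ \mathrm{KL}( p(X \mid x_t) \parallel q_\phi(X \mid x_t) ) \big] & = \mathbb{E}_{x_t \,\sim\, p(X_t)} \, \mathbb{E}_{x \,\sim\, p(X \mid x_t)} \big[ \log p(x \mid x_t) - \log q_\phi(x \mid x_t) \big] \\ & = \underbrace{\mathbb{E}_{x, x_t \,\sim\, p(X, X_t)} \big[ \log p(x \mid x_t) \big]}_{\text{constant in } \phi} + \mathbb{E}_{x, x_t \,\sim\, p(X, X_t)} \big[ -\log q_\phi(x \mid x_t) \big] \, .\end{split}\]

Since \(p(X, X_t) = p(X) \, p(X_t \mid X)\), the remaining expectation can be estimated by drawing \(x \sim p(X)\) and then \(x_t \sim p(X_t \mid x)\), without ever evaluating the true posterior.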

For most use cases, it is enough to estimate the mean \(\mathbb{E}[X \mid x_t]\) of the posterior \(p(X \mid x_t)\), in which case \(q_\phi(X \mid x_t)\) should have mean \(\mu_\phi(x_t) \approx \mathbb{E}[X \mid x_t]\).
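To make the posterior mean concrete, the sketch below evaluates \(\mathbb{E}[X \mid x_t]\) with Bayes' rule for a toy scalar prior supported on a finite set of atoms. The function name `posterior_mean` and the two-atom prior are assumptions made for this illustration; they are not part of azula.

```python
import math


def posterior_mean(atoms, probs, x_t, alpha_t, sigma_t):
    """E[X | x_t] for a discrete scalar prior p(X), computed with Bayes' rule."""
    # Unnormalized log-posterior: log p(x) + log N(x_t | alpha_t * x, sigma_t^2),
    # dropping the Gaussian normalization constant shared by all atoms.
    logits = [
        math.log(p) - 0.5 * ((x_t - alpha_t * x) / sigma_t) ** 2
        for x, p in zip(atoms, probs)
    ]
    # Normalize in a numerically stable way (this is the division by p(x_t)).
    shift = max(logits)
    weights = [math.exp(logit - shift) for logit in logits]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, atoms)) / total


# Toy prior: X is -1 or +1 with equal probability.
mean = posterior_mean([-1.0, 1.0], [0.5, 0.5], x_t=0.3, alpha_t=0.8, sigma_t=0.5)
```

For this symmetric two-atom prior the posterior mean has the closed form \(\tanh(\alpha_t x_t / \sigma_t^2)\), which is the value a well-trained \(\mu_\phi(x_t)\) would approximate.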

Classes

  • Posterior – Abstract posterior \(q_\phi(X \mid x_t)\).

  • DiracPosterior – Creates a Dirac delta posterior \(\delta(X - \mu)\).

  • GaussianPosterior – Creates a Gaussian posterior \(\mathcal{N}(X \mid \mu, \sigma^2)\).

  • Denoiser – Abstract denoiser module.

  • GaussianDenoiser – Creates an analytical Gaussian denoiser.

  • SimpleDenoiser – Creates a denoiser with simple preconditioning.

  • KarrasDenoiser – Creates a Gaussian denoiser with EDM-style preconditioning.

Descriptions

class azula.denoise.Posterior

Abstract posterior \(q_\phi(X \mid x_t)\).

class azula.denoise.DiracPosterior(mean)

Creates a Dirac delta posterior \(\delta(X - \mu)\).

Parameters:

mean (Tensor) – The mean \(\mu\), with shape \((*)\).

class azula.denoise.GaussianPosterior(mean, var)

Creates a Gaussian posterior \(\mathcal{N}(X \mid \mu, \sigma^2)\).

Parameters:
  • mean (Tensor) – The mean \(\mu\), with shape \((*)\).

  • var (Tensor) – The variance \(\sigma^2\), with shape \((*)\).

log_prob(x)
Parameters:

x (Tensor) – A tensor \(x\), with shape \((*)\).

Returns:

The log-density \(\log \mathcal{N}(x \mid \mu, \sigma^2)\), with shape \((*)\).

Return type:

Tensor
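As a reference for what `log_prob` returns, here is a minimal element-wise sketch of the Gaussian log-density for scalars. The helper `gaussian_log_prob` is hypothetical and written with the standard library only; it is not azula's implementation, which operates on tensors.

```python
import math


def gaussian_log_prob(x, mean, var):
    """log N(x | mean, var) for scalars, with var the variance sigma^2."""
    return -0.5 * (x - mean) ** 2 / var - 0.5 * math.log(2.0 * math.pi * var)


# Log-density of a standard normal at its mode.
at_mode = gaussian_log_prob(0.0, mean=0.0, var=1.0)

# Moving x away from the mean lowers the log-density by (x - mean)^2 / (2 var).
one_sigma = gaussian_log_prob(2.0, mean=0.0, var=4.0)
```

For a fixed variance, \(-\log \mathcal{N}(x \mid \mu, \sigma^2)\) equals the squared error \((x - \mu)^2 / (2 \sigma^2)\) up to an additive constant, which is why minimizing the negative log-likelihood of a Gaussian posterior amounts to a weighted regression on the mean.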

class azula.denoise.Denoiser(*args, **kwargs)

Abstract denoiser module.

abstract forward(x_t, t, **kwargs)
Parameters:
  • x_t (Tensor) – A noisy tensor \(x_t\), with shape \((B, *)\).

  • t (Tensor) – The time \(t\), with shape \(()\) or \((B)\).

  • kwargs – Optional keyword arguments.

Returns:

The posterior \(q_\phi(X \mid x_t)\).

Return type:

Posterior

class azula.denoise.GaussianDenoiser(mean, cov, schedule)

Creates an analytical Gaussian denoiser.

Let \(X \sim \mathcal{N}(\mu_x, \Sigma_x)\) and \(X_t \mid X \sim \mathcal{N}(X, \Sigma_t)\), then

\[X \mid X_t \sim \mathcal{N} \left( \bar{\mu}, \bar{\Sigma} \right)\]

with

\[\begin{split}\bar{\mu} & = \mu_x + \Sigma_x \left( \Sigma_x + \Sigma_t \right)^{-1} (X_t - \mu_x) \\ \bar{\Sigma} & = \left( \Sigma_x^{-1} + \Sigma_t^{-1} \right)^{-1}\end{split}\]

References

Bayesian Filtering and Smoothing (Särkkä, 2013)
Parameters:
  • mean (Tensor) – The mean vector \(\mu_x\), with shape \((N_1, \dots, N_d)\).

  • cov (Covariance) – The covariance matrix \(\Sigma_x\), with shape \((N_1, \dots, N_d, N_1, \dots, N_d)\).

  • schedule (Schedule) – A noise schedule.

forward(x_t, t, **kwargs)
Parameters:
  • x_t (Tensor) – A noisy tensor \(x_t\), with shape \((B, *)\).

  • t (Tensor) – The time \(t\), with shape \(()\).

  • kwargs – Optional keyword arguments.

Returns:

The Dirac delta \(\delta(X - \bar{\mu})\).

Return type:

DiracPosterior
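The posterior formulas for \(\bar{\mu}\) and \(\bar{\Sigma}\) can be checked by hand in the scalar case, where the matrix inverses reduce to divisions. The sketch below is an illustration under that assumption; the function `gaussian_posterior` is hypothetical, and it takes the noise variance \(\Sigma_t\) as given rather than deriving it from a schedule.

```python
def gaussian_posterior(x_t, mean_x, var_x, var_t):
    """Scalar posterior of X given X_t, for X ~ N(mean_x, var_x) and X_t | X ~ N(X, var_t)."""
    # Gain Sigma_x (Sigma_x + Sigma_t)^{-1}: how much of the observation to trust.
    gain = var_x / (var_x + var_t)
    post_mean = mean_x + gain * (x_t - mean_x)
    # Posterior precision is the sum of the prior and noise precisions.
    post_var = 1.0 / (1.0 / var_x + 1.0 / var_t)
    return post_mean, post_var


# Prior N(1, 4), noise variance 1, observation x_t = 3.
post_mean, post_var = gaussian_posterior(3.0, mean_x=1.0, var_x=4.0, var_t=1.0)
```

With these numbers the gain is 0.8, so the posterior mean moves 80% of the way from the prior mean towards the observation, and the posterior variance is smaller than both the prior and the noise variances.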

class azula.denoise.SimpleDenoiser(backbone, schedule)

Creates a denoiser with simple preconditioning.

\[\mu_\phi(x_t) = b_\phi(c_\mathrm{in}(t) \, x_t, c_\mathrm{time}(t))\]

The preconditioning coefficients, defined below, make the backbone independent of the noise schedule, so the schedule can be replaced at any time.

\[\begin{split}c_\mathrm{in}(t) & = \frac{1}{\sqrt{\alpha_t^2 + \sigma_t^2}} \\ c_\mathrm{time}(t) & = \log \frac{\sigma_t}{\alpha_t}\end{split}\]
Parameters:
  • backbone (Module) – A noise/time conditional network \(b_\phi(x_t, t)\).

  • schedule (Schedule) – A noise schedule.
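One way to read the schedule-independence claim is that two schedules with the same ratio \(\sigma_t / \alpha_t\) feed the backbone exactly the same inputs. The sketch below checks this for a scalar sample under two hypothetical schedules, one with \(\alpha_t = 1\) and one with \(\alpha_t^2 + \sigma_t^2 = 1\). The helper names are assumptions for this illustration and not part of azula.

```python
import math


def simple_precond(alpha_t, sigma_t):
    """Coefficients c_in(t) and c_time(t) of the simple preconditioning."""
    c_in = 1.0 / math.sqrt(alpha_t**2 + sigma_t**2)
    c_time = math.log(sigma_t / alpha_t)
    return c_in, c_time


def backbone_inputs(x, eps, alpha_t, sigma_t):
    """Inputs passed to the backbone for x_t = alpha_t * x + sigma_t * eps."""
    x_t = alpha_t * x + sigma_t * eps
    c_in, c_time = simple_precond(alpha_t, sigma_t)
    return c_in * x_t, c_time


x, eps = 0.7, -1.3  # one clean value and one noise draw, fixed for the comparison
ratio = 2.0  # sigma_t / alpha_t shared by both schedules

# Schedule A: alpha_t = 1, sigma_t = ratio.
inputs_a = backbone_inputs(x, eps, 1.0, ratio)

# Schedule B: same ratio, rescaled so that alpha_t^2 + sigma_t^2 = 1.
norm = math.sqrt(1.0 + ratio**2)
inputs_b = backbone_inputs(x, eps, 1.0 / norm, ratio / norm)
```

Both calls return the same rescaled input and the same \(c_\mathrm{time}(t) = \log 2\), even though the raw \(x_t\) differs between the two schedules.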

forward(x_t, t, **kwargs)
Parameters:
  • x_t (Tensor) – A noisy tensor \(x_t\), with shape \((B, *)\).

  • t (Tensor) – The time \(t\), with shape \(()\) or \((B)\).

  • kwargs – Optional keyword arguments.

Returns:

The Dirac delta \(\delta(X - \mu_\phi(x_t))\).

Return type:

DiracPosterior

loss(x, t, max_weight=10000.0, **kwargs)
Parameters:
  • x (Tensor) – A clean tensor \(x\), with shape \((B, *)\).

  • t (Tensor) – The time \(t\), with shape \((B)\).

  • kwargs – Optional keyword arguments.

Returns:

The weighted loss

\[\frac{\alpha_t^2 + \sigma_t^2}{\sigma_t^2} || \mu_\phi(x_t) - x ||^2\]

where \(x_t \sim p(X_t \mid x)\), with shape \((B, *)\).

Return type:

Tensor
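The weighted loss can be sketched element-wise with plain Python lists. This is an illustration of the formula only: `loss_weight`, `weighted_loss` and the stand-in estimate `mu` are hypothetical, and the `max_weight` argument of the signature is not modeled because its effect is not described in this section.

```python
import random


def loss_weight(alpha_t, sigma_t):
    """Weight (alpha_t^2 + sigma_t^2) / sigma_t^2 of the squared error."""
    return (alpha_t**2 + sigma_t**2) / sigma_t**2


def weighted_loss(mu, x, alpha_t, sigma_t):
    """Element-wise weighted squared error between an estimate mu and the clean x."""
    weight = loss_weight(alpha_t, sigma_t)
    return [weight * (m - xi) ** 2 for m, xi in zip(mu, x)]


rng = random.Random(0)
x = [0.5, -1.0, 2.0]
alpha_t, sigma_t = 0.6, 0.8

# Draw x_t ~ p(X_t | x) = N(alpha_t * x, sigma_t^2).
x_t = [alpha_t * xi + sigma_t * rng.gauss(0.0, 1.0) for xi in x]

# Stand-in for mu_phi(x_t): simply undo the scaling, ignoring the noise.
mu = [xi_t / alpha_t for xi_t in x_t]

loss = weighted_loss(mu, x, alpha_t, sigma_t)
perfect = weighted_loss(x, x, alpha_t, sigma_t)  # zero when mu equals x
```

The weight equals \((\alpha_t / \sigma_t)^2 + 1\), so it approaches 1 at high noise and grows without bound as \(\sigma_t \to 0\).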

class azula.denoise.KarrasDenoiser(backbone, schedule)

Creates a Gaussian denoiser with EDM-style preconditioning.

\[\mu_\phi(x_t) = c_\mathrm{skip}(t) \, x_t + c_\mathrm{out}(t) \, b_\phi(c_\mathrm{in}(t) \, x_t, c_\mathrm{time}(t))\]

The preconditioning coefficients are generalized to take the scale \(\alpha_t\) into account.

\[\begin{split}c_\mathrm{in}(t) & = \frac{1}{\sqrt{\alpha_t^2 + \sigma_t^2}} \\ c_\mathrm{out}(t) & = \frac{\sigma_t}{\sqrt{\alpha_t^2 + \sigma_t^2}} \\ c_\mathrm{skip}(t) & = \frac{\alpha_t}{\alpha_t^2 + \sigma_t^2} \\ c_\mathrm{time}(t) & = \log \frac{\sigma_t}{\alpha_t}\end{split}\]

References

Elucidating the Design Space of Diffusion-Based Generative Models (Karras et al., 2022)
Parameters:
  • backbone (Module) – A noise/time conditional network \(b_\phi(x_t, t)\).

  • schedule (Schedule) – A noise schedule.
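The EDM-style parametrization can be written out for scalars as follows. This is a minimal sketch of the formulas above, not azula's implementation: `karras_precond`, `karras_mean` and the trivial `zero_backbone` are hypothetical names, and the backbone is any callable taking the rescaled input and \(c_\mathrm{time}(t)\).

```python
import math


def karras_precond(alpha_t, sigma_t):
    """Coefficients (c_in, c_out, c_skip, c_time) of the EDM-style preconditioning."""
    total = alpha_t**2 + sigma_t**2
    c_in = 1.0 / math.sqrt(total)
    c_out = sigma_t / math.sqrt(total)
    c_skip = alpha_t / total
    c_time = math.log(sigma_t / alpha_t)
    return c_in, c_out, c_skip, c_time


def karras_mean(backbone, x_t, alpha_t, sigma_t):
    """mu(x_t) = c_skip * x_t + c_out * backbone(c_in * x_t, c_time)."""
    c_in, c_out, c_skip, c_time = karras_precond(alpha_t, sigma_t)
    return c_skip * x_t + c_out * backbone(c_in * x_t, c_time)


def zero_backbone(z, c_time):
    """Placeholder backbone that always predicts zero."""
    return 0.0


# Low noise: the skip connection passes x_t through almost unchanged.
mu_low_noise = karras_mean(zero_backbone, 1.0, alpha_t=1.0, sigma_t=1e-3)

# High noise: the skip connection vanishes and the backbone output dominates.
mu_high_noise = karras_mean(zero_backbone, 50.0, alpha_t=1.0, sigma_t=50.0)
```

With \(\alpha_t = 1\), \(c_\mathrm{skip}(t) \to 1\) and \(c_\mathrm{out}(t) \to 0\) as \(\sigma_t \to 0\), whereas \(c_\mathrm{skip}(t) \to 0\) and \(c_\mathrm{out}(t) \to 1\) as \(\sigma_t \to \infty\), so the backbone only has to predict a correction whose scale is bounded across noise levels.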

forward(x_t, t, **kwargs)
Parameters:
  • x_t (Tensor) – A noisy tensor \(x_t\), with shape \((B, *)\).

  • t (Tensor) – The time \(t\), with shape \(()\) or \((B)\).

  • kwargs – Optional keyword arguments.

Returns:

The Dirac delta \(\delta(X - \mu_\phi(x_t))\).

Return type:

DiracPosterior

loss(x, t, **kwargs)
Parameters:
  • x (Tensor) – A clean tensor \(x\), with shape \((B, *)\).

  • t (Tensor) – The time \(t\), with shape \((B)\).

  • kwargs – Optional keyword arguments.

Returns:

The weighted loss

\[\frac{\alpha_t^2 + \sigma_t^2}{\sigma_t^2} || \mu_\phi(x_t) - x ||^2\]

where \(x_t \sim p(X_t \mid x)\), with shape \((B, *)\).

Return type:

Tensor
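The weight in this loss is exactly \(1 / c_\mathrm{out}(t)^2\). Substituting the parametrization of \(\mu_\phi(x_t)\) shows that the loss is an unweighted regression of the backbone output onto a rescaled target:

\[\frac{\alpha_t^2 + \sigma_t^2}{\sigma_t^2} || \mu_\phi(x_t) - x ||^2 = \left|\left| b_\phi(c_\mathrm{in}(t) \, x_t, c_\mathrm{time}(t)) - \frac{x - c_\mathrm{skip}(t) \, x_t}{c_\mathrm{out}(t)} \right|\right|^2\]

As a worked check, assume (this is an assumption of the example, not a statement about the library) that each component of \(X\) has zero mean and unit variance and that \(x_t = \alpha_t x + \sigma_t \epsilon\) with \(\epsilon \sim \mathcal{N}(0, I)\) independent of \(x\). Then

\[\begin{split}x - c_\mathrm{skip}(t) \, x_t & = \frac{\sigma_t^2}{\alpha_t^2 + \sigma_t^2} \, x - \frac{\alpha_t \sigma_t}{\alpha_t^2 + \sigma_t^2} \, \epsilon \\ \mathbb{V}[x - c_\mathrm{skip}(t) \, x_t] & = \frac{\sigma_t^4 + \alpha_t^2 \sigma_t^2}{(\alpha_t^2 + \sigma_t^2)^2} = \frac{\sigma_t^2}{\alpha_t^2 + \sigma_t^2} = c_\mathrm{out}(t)^2\end{split}\]

so under this assumption the regression target has unit variance per component at every time \(t\).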