5 minute read

This post introduces the core ideas of our paper “A Unified Bayesian Inference Framework for Generalized Linear Models” (IEEE Signal Processing Letters, 2018, arXiv: https://arxiv.org/abs/1712.10288) and shows how the framework recovers GAMP and extends to GVAMP, Gr-SBL, and related algorithms through a clean EP-based decoupling.

Approximate Message Passing (AMP) is widely used for Bayesian inference in standard linear models (SLMs), and its generalized form GAMP extends to generalized linear models (GLMs). But classical derivations of GAMP rely on loopy belief propagation and a sequence of approximations that obscure the algorithm’s structure and generality. This article presents a unified, decoupled framework for GLM inference that is academically rigorous yet simple: each GLM iteration splits into (i) an EP site update on the nonlinear channel and (ii) an SLM solver on the linear transform—communicating solely via extrinsic (site) Gaussian messages. With AMP as the SLM solver, the framework exactly reproduces GAMP; substituting VAMP or SBL yields GLM-VAMP (GVAMP) or Gr-SBL immediately. The perspective clarifies the role of Onsager terms/state evolution, explains robustness trade-offs across matrix ensembles, and suggests principled implementation choices for numerical stability.


From SLM to GLM: Problem Setup

Standard Linear Model (SLM)

We observe

\[\mathbf y=\mathbf A\mathbf x+\mathbf w,\qquad \mathbf w\sim\mathcal N(\mathbf 0,\sigma^2\mathbf I),\]

with prior $p(\mathbf x)$. AMP iterates (schematically)

\[\mathbf r^{t}=\mathbf y-\mathbf A\mathbf x^{t}+\tfrac{N}{M}\,\mathbf r^{t-1}\!\left\langle \eta'(\cdot)\right\rangle,\qquad \mathbf x^{t+1}=\eta\!\left(\mathbf A^\top \mathbf r^{t}+\mathbf x^{t}\right),\]

where $\mathbf A\in\mathbb R^{M\times N}$, $\eta$ is the MMSE/MAP denoiser induced by $p(\mathbf x)$, $\langle\cdot\rangle$ denotes an empirical average, and the Onsager correction enforces the scalar state evolution (SE) under i.i.d. Gaussian $\mathbf A$. Strengths: simplicity and speed. Limitations: fragility under non-i.i.d. or ill-conditioned $\mathbf A$.
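To make the schematic iteration concrete, here is a minimal NumPy sketch assuming a sparsity-promoting prior handled by a soft-thresholding denoiser and an i.i.d. Gaussian $\mathbf A$; the names (amp_slm, lam, n_iter) are illustrative choices rather than notation from the paper, and the fixed threshold is a simplification (in practice it is adapted per iteration).

import numpy as np

def soft_threshold(u, lam):
    # Elementwise soft-thresholding denoiser: eta(u) = sign(u) * max(|u| - lam, 0)
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def amp_slm(A, y, lam=0.1, n_iter=50):
    # Schematic AMP for y = A x + w; A is M x N with (roughly) i.i.d. entries of variance 1/M.
    M, N = A.shape
    x = np.zeros(N)
    r = np.zeros(M)
    eta_prime_mean = 0.0
    for _ in range(n_iter):
        r = y - A @ x + (N / M) * r * eta_prime_mean   # residual with Onsager correction
        u = A.T @ r + x                                # pseudo-data fed to the denoiser
        x = soft_threshold(u, lam)                     # denoise
        eta_prime_mean = np.mean(np.abs(u) > lam)      # <eta'> used in the next Onsager term
    return x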

Generalized Linear Model (GLM)

In many applications the observation is nonlinear:

\[\mathbf z=\mathbf A\mathbf x,\qquad p(\mathbf y\mid \mathbf z)=\prod_{a=1}^M p(y_a\mid z_a),\]

e.g., 1-bit CS $y_a=\text{sign}(z_a)$, logistic regression $y_a\sim\mathrm{Bernoulli}(\sigma(z_a))$, Poisson regression $y_a\sim\mathrm{Poisson}(e^{z_a})$. The nonlinearity destroys Gaussian closure, complicating exact Bayesian inference.
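As a running example, here is a toy data-generation sketch for the 1-bit (sign) channel; the dimensions, sparsity level, and noise standard deviation are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
M, N, K = 256, 512, 32                                   # measurements, unknowns, nonzeros (illustrative)
A = rng.standard_normal((M, N)) / np.sqrt(M)             # i.i.d. Gaussian sensing matrix
x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)   # K-sparse signal
z = A @ x_true                                           # latent linear output
y = np.sign(z + 0.1 * rng.standard_normal(M))            # 1-bit observations with pre-quantization noise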


Why GAMP’s Classical Derivation Feels Opaque

Rangan’s GAMP is obtained by loopy BP on the GLM factor graph with Gaussian approximations, entangling the linear transform and nonlinear channel and requiring several expansions/approximations. The result is powerful but hard to parse structurally.


The Core Idea: A Decoupled EP View

We decouple each GLM iteration into two modules A and B that pass Gaussian extrinsic messages for the latent linear output $\mathbf z$.

Module B (Channel EP site update)

Given a cavity (extrinsic) Gaussian for each $z_a$:

\[z_a \sim \mathcal N(p_a,\tau_{p,a}),\]

form the tilted distribution

\[\tilde p(z_a)\ \propto\ p(y_a\mid z_a)\,\mathcal N(z_a;p_a,\tau_{p,a}),\]

compute moment matches

\[m_{B,a}=\mathbb E_{\tilde p}[z_a],\qquad v_{B,a}=\text{Var}_{\tilde p}[z_a],\]

and obtain the site (extrinsic) parameters via Gaussian division:

\[\tau_{\mathrm{ext},a}=(\frac{1}{v_{B,a}}-\frac{1}{\tau_{p,a}})^{-1},\qquad p_{\mathrm{ext},a}=\tau_{\mathrm{ext},a}(\frac{m_{B,a}}{v_{B,a}}-\frac{p_a}{\tau_{p,a}}).\]

This is exactly an EP site update on the channel factor.
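For the probit channel $p(y\mid z)=\Phi(yz/\sigma)$ the tilted moments are available in closed form, so Module B reduces to a few vectorized lines. The sketch below is one possible implementation under that assumption; the function name (ep_site_probit) and the log-domain ratio computation are my own choices, and the variance clipping one would add in practice is omitted for brevity.

import numpy as np
from scipy.stats import norm

def ep_site_probit(y, p, tau_p, sigma=1.0):
    # Module B for p(y|z) = Phi(y z / sigma), y in {-1, +1};
    # p, tau_p are the cavity mean/variance of z (length-M arrays).
    s2 = sigma**2 + tau_p
    c = y * p / np.sqrt(s2)
    ratio = np.exp(norm.logpdf(c) - norm.logcdf(c))      # N(c)/Phi(c), computed in the log domain
    m_B = p + y * tau_p / np.sqrt(s2) * ratio             # tilted mean
    v_B = tau_p - tau_p**2 / s2 * ratio * (c + ratio)     # tilted variance
    tau_ext = 1.0 / (1.0 / v_B - 1.0 / tau_p)             # Gaussian division: site variance
    p_ext = tau_ext * (m_B / v_B - p / tau_p)             # Gaussian division: site mean
    return p_ext, tau_ext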

Module A (SLM global update under Gaussian pseudo-observations)

Treat $(\mathbf p_{\mathrm{ext}},\boldsymbol\tau_{\mathrm{ext}})$ as Gaussian pseudo-measurements of $\mathbf z$, i.e., set $\tilde{\mathbf z}=\mathbf p_{\mathrm{ext}}$ in

\[\tilde{\mathbf{z}} = \mathbf{A} \mathbf{x} + \tilde{\mathbf{w}}, \quad \tilde{\mathbf{w}} \sim \mathcal{N}(0, \text{diag}(\boldsymbol{\tau}_{\mathrm{ext}})).\]

Now run any SLM inference engine (e.g., AMP, VAMP, SBL) to produce a Gaussian approximation for $\mathbf z$, say $(m_A,v_A)$. Convert it back to a new cavity for Module B via Gaussian division:

\[\tau_p=(\frac{1}{v_A}-\frac{1}{\tau_{\mathrm{ext}}})^{-1},\qquad p=\tau_p(\frac{m_A}{v_A}-\frac{p_{\mathrm{ext}}}{\tau_{\mathrm{ext}}}).\]

Iterate A↔B until convergence. With AMP in Module A, this reduces exactly to GAMP; with VAMP (robust under right-orthogonally invariant matrices) you obtain GVAMP; plugging in SBL yields Gr-SBL.
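Both divisions (B→A and A→B) are the same operation in opposite directions. A small helper makes this explicit; the clipping bounds are my own pragmatic choice to keep the extrinsic precision positive and finite.

import numpy as np

def gaussian_division(m_post, v_post, m_cav, v_cav, prec_min=1e-12, prec_max=1e12):
    # Divide N(m_post, v_post) by N(m_cav, v_cav) elementwise, returning the extrinsic Gaussian.
    prec_ext = 1.0 / v_post - 1.0 / v_cav                 # extrinsic precision
    prec_ext = np.clip(prec_ext, prec_min, prec_max)      # guard against negative/huge precisions
    v_ext = 1.0 / prec_ext
    m_ext = v_ext * (m_post / v_post - m_cav / v_cav)
    return m_ext, v_ext

# Module B -> A:  p_ext, tau_ext = gaussian_division(m_B, v_B, p, tau_p)
# Module A -> B:  p,     tau_p   = gaussian_division(m_A, v_A, p_ext, tau_ext)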

Takeaway. GAMP = (EP site update + Gaussian division) + AMP on an SLM. The EP lens reveals the structure and portability of the algorithm family.


Algorithmic Pseudocode (One Sweep)

Input: A, y, prior p(x), channel p(y|z)
Initialize: p (mean of z), τ_p (var of z), and any x-initialization for the SLM solver

repeat
  # Module B: Channel EP site updates (elementwise)
  for a = 1..M:
    1) Form tilted   $\tilde{p}(z_a) \propto p(y_a | z_a) \cdot \mathcal{N}(z_a; p_a, \tau_{p,a})$
    2) Moment match  $m_{B,a} = \mathbb{E}[z_a | \tilde{p}]$,  $v_{B,a} = \operatorname{Var}[z_a | \tilde{p}]$
    3) Site (extrinsic) Gaussian via division:
         $\tau_{\mathrm{ext},a} = (1/v_{B,a} - 1/\tau_{p,a})^{-1}$
         $p_{\mathrm{ext},a} = \tau_{\mathrm{ext},a} \cdot (m_{B,a}/v_{B,a} - p_a/\tau_{p,a})$

  # Module A: SLM update with Gaussian pseudo-observations on z
  Treat $\mathbf{p}_{\mathrm{ext}}$ as pseudo-observations of $\mathbf{z}=\mathbf{A}\mathbf{x}$:  $\mathbf{p}_{\mathrm{ext}} = \mathbf{A}\mathbf{x} + \tilde{\mathbf{w}}$, with $\tilde{\mathbf{w}} \sim \mathcal{N}(\mathbf{0}, \operatorname{diag}(\boldsymbol{\tau}_{\mathrm{ext}}))$
  Run an SLM solver of choice:
    - AMP  → reproduces classical GAMP (fast for i.i.d. Gaussian A)
    - VAMP → robust for right-orthogonally invariant A (GVAMP)
    - SBL  → Gr-SBL
  Obtain Gaussian approx for z: $(m_A, v_A)$

  # Build next cavity for B via Gaussian division
  $\boldsymbol{\tau}_p = (1/\mathbf{v}_A - 1/\boldsymbol{\tau}_{\mathrm{ext}})^{-1}$
  $\mathbf{p}   = \boldsymbol{\tau}_p \cdot (\mathbf{m}_A/\mathbf{v}_A - \mathbf{p}_{\mathrm{ext}}/\boldsymbol{\tau}_{\mathrm{ext}})$

until convergence

Return: Approximate posterior(s) for x (and z) from the SLM block.
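Putting the pieces together, here is one possible end-to-end realization of this sweep in NumPy. The channel_site and slm_solver callbacks stand in for Module B and Module A respectively (e.g., ep_site_probit above and any SLM engine returning a Gaussian approximation for $\mathbf z$); the initialization, iteration count, and damping factor are illustrative choices rather than prescriptions from the paper.

import numpy as np

def glm_ep(A, y, channel_site, slm_solver, n_iter=30, damping=0.7):
    # channel_site(y, p, tau_p)      -> (p_ext, tau_ext)   [Module B]
    # slm_solver(A, p_ext, tau_ext)  -> (m_A, v_A)         [Module A: posterior mean/var of z]
    M = A.shape[0]
    p, tau_p = np.zeros(M), 1e2 * np.ones(M)               # broad initial cavity for z
    p_ext, tau_ext = np.zeros(M), 1e2 * np.ones(M)
    for _ in range(n_iter):
        # Module B: elementwise EP site update on the channel
        p_ext_new, tau_ext_new = channel_site(y, p, tau_p)
        p_ext = damping * p_ext_new + (1 - damping) * p_ext         # damping helps at finite size
        tau_ext = damping * tau_ext_new + (1 - damping) * tau_ext
        # Module A: SLM inference with Gaussian pseudo-observations on z
        m_A, v_A = slm_solver(A, p_ext, tau_ext)
        # Next cavity for Module B via Gaussian division (clipped for robustness)
        prec_p = np.clip(1.0 / v_A - 1.0 / tau_ext, 1e-12, 1e12)
        tau_p = 1.0 / prec_p
        p = tau_p * (m_A / v_A - p_ext / tau_ext)
    return p, tau_p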

Why This Simplifies GAMP

The decoupled view localizes the nonlinearity to elementwise EP on the channel (moment matching + Gaussian division) and delegates the rest to a familiar SLM engine. It makes clear that the Onsager/SE machinery sits entirely in the SLM solver (AMP/VAMP), while the channel side is a clean EP site. This equivalence is shown formally in [1], which proves that the Gr-AMP instance coincides with GAMP.


Practical Notes & Numerical Stability

  • Channel integrals. For probit/erf-like likelihoods (e.g., 1-bit quantization), compute moments with log-CDF/erfcx tricks to avoid catastrophic cancellation (see the sketch after this list).
  • Matrix ensembles.
    • If $\mathbf A$ is i.i.d. sub-Gaussian, AMP (hence GAMP) is fast and accurately tracked by SE.
    • If $\mathbf A$ is ill-conditioned or correlated, prefer VAMP/GVAMP (right-orthogonally invariant guarantees).
  • Damping & scheduling. Damping or swept updates can help in finite dimensions; several convergent GAMP variants (e.g., ADMM-GAMP) are available when robustness is paramount.

Connections & Context

  • Expectation Propagation (EP). The framework is a textbook application of EP: site (Module B) ↔ global Gaussian approximation (Module A on the SLM) with extrinsic messaging to prevent double-counting.
  • GVAMP and beyond. Replacing AMP with VAMP produces GLM-VAMP (GVAMP) and inherits its robustness; EM-GVAMP and other extensions tune unknown likelihood/prior parameters.

Worked GLM Examples (Likelihoods)

For each likelihood, Module B reduces to scalar moment computations under the tilted density $\tilde p(z)\propto p(y\mid z)\,\mathcal N(z;p,\tau_p)$:

  • 1-bit CS (probit): $y\in\{\pm1\},\ p(y\mid z)=\Phi(yz/\sigma)$.
  • Logistic regression: $p(y\mid z)=\sigma(z)^y[1-\sigma(z)]^{1-y}$.
  • Poisson regression: $p(y\mid z)=\exp(yz-e^{z})/y!$.
These plug-and-play moments are then converted to Gaussian sites; the linear block stays unchanged (a quadrature sketch for likelihoods without closed-form moments follows below).
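For channels without closed-form tilted moments (e.g., logistic or Poisson), the scalar mean and variance can be approximated by one-dimensional Gauss–Hermite quadrature. The sketch below is a generic helper under that approach; the node count and function names are my own choices.

import numpy as np

def tilted_moments_gh(log_lik, p, tau_p, n_nodes=41):
    # Mean/variance of p_tilde(z) ∝ exp(log_lik(z)) * N(z; p, tau_p), for scalar p and tau_p.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)   # nodes/weights for ∫ exp(-x^2) f(x) dx
    z = p + np.sqrt(2.0 * tau_p) * x                  # change of variables to the cavity Gaussian
    lw = np.log(w) + log_lik(z)                       # unnormalized log-weights of the tilted density
    lw -= lw.max()                                    # stabilize before exponentiating
    q = np.exp(lw)
    q /= q.sum()                                      # normalized tilted weights at the nodes
    m = np.dot(q, z)
    v = np.dot(q, (z - m) ** 2)
    return m, v

# Example: logistic channel for a single observation y_a in {0, 1}
# log_lik = lambda z: y_a * z - np.logaddexp(0.0, z)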

When to Use Which Solver in Module A

Property of $\mathbf A$ → recommended SLM solver:

  • i.i.d. sub-Gaussian, large dimensions → AMP (→ GAMP): fastest; SE-tracked.
  • Right-orthogonally invariant / ill-conditioned → VAMP (→ GVAMP): robust; also SE-tracked under broader ensembles.
  • Emphasis on sparsity hyper-parameters → SBL (→ Gr-SBL): EM/ARD-style evidence updates; slower but flexible.

References

[1] X. Meng, S. Wu, J. Zhu, “A Unified Bayesian Inference Framework for Generalized Linear Models,” IEEE Signal Processing Letters, 25(3):398–402, 2018. DOI: 10.1109/LSP.2017.2789163; arXiv:1712.10288.

[2] T. P. Minka, “Expectation Propagation for Approximate Bayesian Inference,” UAI, 2001; also Tech. Rep., Microsoft Research, 2001.

[3] D. L. Donoho, A. Maleki, A. Montanari, “Message Passing Algorithms for Compressed Sensing,” PNAS, 106(45):18914–18919, 2009. DOI: 10.1073/pnas.0909892106.

[4] S. Rangan, “Generalized Approximate Message Passing for Estimation with Random Linear Mixing,” Proc. IEEE ISIT, 2011; arXiv:1010.5141.

[5] P. Schniter, S. Rangan, A. K. Fletcher, “Vector Approximate Message Passing for the Generalized Linear Model,” Proc. Asilomar Conf. on Signals, Systems, and Computers, pp. 1525–1529, 2016. DOI: 10.1109/ACSSC.2016.7869633; arXiv:1612.01186.

[6] S. Rangan, P. Schniter, A. K. Fletcher, “Vector Approximate Message Passing,” arXiv:1610.03082, 2016.

[7] S. Rangan, A. K. Fletcher, P. Schniter, U. Kamilov, “Inference for Generalized Linear Models via Alternating Directions and Bethe Free Energy Minimization (ADMM-GAMP),” Proc. IEEE ISIT, 2015.