Nikola Janjušević

ADMM for a 2-set "best approximation problem"

The following comes from my own derivations, but the results can be seen in [1], equation (7).

Suppose we have a vector $a$ and want to project it onto the intersection of two sets, $C_1 \cap C_2$. This projection may not have a closed form, but if projection onto each set separately is cheap/closed-form, we're in luck. We begin by converting our starting formulation into something ADMM-able via indicator functions,

$$
\begin{array}{rcl}
&\underset{x \,\in\, C_1 \cap C_2}{\mathrm{minimize}} & \frac{1}{2}\lVert x - a \rVert_2^2 \\
&\Leftrightarrow & \\
&\underset{x,\, z}{\mathrm{minimize}} & \frac{1}{2}\lVert x - a \rVert_2^2 + \iota_{C_1}(x) + \iota_{C_2}(z) \\
&\text{subject to} & x - z = 0
\end{array}
$$
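As a concrete illustration (the sets here are my own choice, not from the derivation above), take $C_1$ to be the nonnegative orthant and $C_2$ the unit $\ell_2$ ball; both projections are cheap and closed-form:

```python
import numpy as np

def proj_nonneg(v):
    """Projection onto C1 = {x : x >= 0}: clip negative entries to zero."""
    return np.maximum(v, 0.0)

def proj_ball(v, r=1.0):
    """Projection onto C2 = {x : ||x||_2 <= r}: rescale if outside the ball."""
    n = np.linalg.norm(v)
    return v if n <= r else (r / n) * v

a = np.array([2.0, -1.0, 0.5])
print(proj_nonneg(a))  # componentwise clip: [2.0, 0.0, 0.5]
print(proj_ball(a))    # rescaled to unit norm
```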

Our scaled-form augmented Lagrangian is then,

$$
L_{\rho}(x,z,u) = \frac{1}{2}\lVert x - a \rVert_2^2 + \iota_{C_1}(x) + \iota_{C_2}(z) + \frac{\rho}{2}\lVert x - z + u \rVert_2^2 - \frac{\rho}{2}\lVert u \rVert_2^2
$$
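For reference, the last two terms come from rewriting the usual linear-plus-quadratic penalty of the augmented Lagrangian in terms of the scaled dual $u = y/\rho$,

$$
y^T(x - z) + \frac{\rho}{2}\lVert x - z \rVert_2^2 = \frac{\rho}{2}\lVert x - z + u \rVert_2^2 - \frac{\rho}{2}\lVert u \rVert_2^2, \qquad u = y/\rho,
$$

which follows by expanding the square on the right-hand side.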

with scaled dual variable $u$. From the Lagrangian we can see that the $z$-update is straightforward[2], but the $x$-update requires some manipulation. If we complete the square w.r.t. $x$,

$$
\begin{array}{rcl}
L_{\rho}(x,z,u) &=& \frac{1}{2}\left(x^Tx - 2x^Ta\right) + \iota_{C_1}(x) + \frac{\rho}{2}\left( x^Tx - 2x^T(z-u)\right) + \mathrm{const} \\
&=& \frac{1+\rho}{2}\left( x^Tx - \frac{2}{1+\rho}\, x^T(a + \rho(z-u)) + \frac{1}{(1+\rho)^2}\lVert a + \rho(z-u) \rVert_2^2\right) + \iota_{C_1}(x) + \mathrm{const} \\
&=& \iota_{C_1}(x) + \frac{1+\rho}{2}\left\lVert x - \frac{a + \rho(z-u)}{1+\rho} \right\rVert_2^2 + \mathrm{const}
\end{array}
$$

we see that the $x$-update is also a projection, with its input scaled by the step-size $\rho$ (note that $\mathrm{const}$ is constant only w.r.t. $x$).
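The completing-the-square step is easy to sanity-check numerically: the smooth part of $L_\rho$ and the completed-square form should differ only by a constant in $x$ (dimensions and values below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a, z, u = rng.normal(size=(3, 4))  # arbitrary vectors in R^4
rho = 2.5

def smooth_part(x):
    """Smooth terms of L_rho as a function of x (indicators and u-term dropped)."""
    return 0.5 * np.sum((x - a) ** 2) + 0.5 * rho * np.sum((x - (z - u)) ** 2)

def completed_square(x):
    """Completed-square form; should differ from smooth_part by a constant."""
    v = (a + rho * (z - u)) / (1 + rho)
    return 0.5 * (1 + rho) * np.sum((x - v) ** 2)

# The difference is independent of x (up to floating point):
diffs = [smooth_part(x) - completed_square(x) for x in rng.normal(size=(5, 4))]
print(np.ptp(diffs))  # spread of the differences, ~0
```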

The resulting ADMM algorithm for the 2-set best approximation problem is as follows, with $z^0 = \mathrm{proj}_{C_2}(a)$ and $u^0 = 0$,

$$
\begin{array}{rcl}
x^{k+1} &\coloneqq& \mathrm{proj}_{C_1}\left(\dfrac{a + \rho(z^k - u^k)}{1+\rho}\right) \\
z^{k+1} &\coloneqq& \mathrm{proj}_{C_2}\left(x^{k+1} + u^k\right) \\
u^{k+1} &\coloneqq& u^k + x^{k+1} - z^{k+1}
\end{array}
$$
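The iteration above translates directly into a few lines of NumPy. Here is a minimal sketch (function name and example sets are my own choices): $C_1$ is the nonnegative orthant and $C_2$ the unit $\ell_2$ ball, so each step is a cheap closed-form projection.

```python
import numpy as np

def admm_best_approx(a, proj_C1, proj_C2, rho=1.0, iters=100):
    """ADMM for projecting a onto C1 ∩ C2, given the two projection operators."""
    z = proj_C2(a)            # z^0 = proj_C2(a)
    u = np.zeros_like(a)      # u^0 = 0
    for _ in range(iters):
        x = proj_C1((a + rho * (z - u)) / (1 + rho))  # x-update: scaled projection
        z = proj_C2(x + u)                            # z-update: projection
        u = u + x - z                                 # scaled dual update
    return x

# Example sets: C1 = nonnegative orthant, C2 = unit l2 ball.
proj_C1 = lambda v: np.maximum(v, 0.0)
proj_C2 = lambda v: v / max(1.0, np.linalg.norm(v))

a = np.array([3.0, -2.0, 1.0])
x = admm_best_approx(a, proj_C1, proj_C2)
print(x)  # a point (approximately) in C1 ∩ C2 closest to a
```

For this particular pair of sets the exact answer is known (clip, then rescale onto the ball), which makes a convenient correctness check for the iteration.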

This is reminiscent of Dykstra's projection algorithm and POCS, but unlike Dykstra's algorithm we have a step-size $\rho$ to tune for accelerated convergence.

[1] See "Dykstra's Algorithm, ADMM, and Coordinate Descent: Connections, Insights, and Extensions" for further insights and extensions, notably the $d$-set best approximation problem.
[2] The proximal operator of the indicator function of a set $C$ is orthogonal projection onto $C$.