is a lower bound on the primal problem, so we want to maximize this lower bound
For a given λ and μ, we can solve the relaxation for an optimal x
For this x ∈ X, the lower bound takes the value
$$
L(\lambda,\mu) = (g(x)-b)\,\lambda + (h(x)-d)\,\mu + f(x)
$$
It is a non-differentiable, concave function of (λ, μ)
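To see why (a short derivation, assuming a minimization primal as in (5), so the relaxation takes a minimum over x ∈ X): for each fixed x, the expression above is affine in (λ, μ), and the dual function is the pointwise minimum of these affine pieces:

$$
L(\lambda,\mu) \;=\; \min_{x\in X}\;\big[\,f(x) + (g(x)-b)\,\lambda + (h(x)-d)\,\mu\,\big]
$$

A pointwise minimum of affine functions is concave and piecewise linear, hence non-differentiable at the kinks where the minimizing x changes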
It is impossible to enumerate all x ∈ X, so we cannot see the overall shape of L(λ,μ); to find its "peak", the subgradient method is a good way to improve the lower bound
Then we update λ and μ (see the update rule below) and re-optimize (5) to update x
Loop until a stopping criterion is met
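Concretely, at iteration k the constraint violations of the current minimizer x^k form a subgradient of L at (λ^k, μ^k). A typical projected-subgradient ascent step (a sketch; it assumes λ multiplies an inequality constraint g(x) ≤ b and is therefore kept non-negative, while μ multiplies an equality constraint h(x) = d and is left free) is:

$$
\lambda^{k+1} = \max\!\big(0,\ \lambda^{k} + t_k\,(g(x^{k})-b)\big), \qquad
\mu^{k+1} = \mu^{k} + t_k\,(h(x^{k})-d)
$$

where t_k > 0 is the step size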
❗It is crucial to select a good step size (learning rate); a bad step size can lead to divergence
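As a minimal runnable sketch of the whole loop, the toy knapsack-style problem below (all data and names are hypothetical, chosen only for illustration) relaxes a single inequality constraint a·x ≤ b with multiplier λ and uses a diminishing step size t_k = t_0/(k+1), one standard way to guard against the divergence warned about above:

```python
import numpy as np

# Toy primal (hypothetical data):  min c @ x  s.t.  a @ x <= b,  x in {0,1}^n
# Relaxing the constraint with multiplier lam >= 0 gives the lower bound
#   L(lam) = min_{x in {0,1}^n}  c @ x + lam * (a @ x - b)
c = np.array([-4.0, -3.0, -5.0, -2.0])  # objective coefficients
a = np.array([2.0, 1.0, 3.0, 1.0])      # constraint coefficients
b = 4.0                                  # right-hand side

def solve_relaxation(lam):
    """Minimize the Lagrangian over x in {0,1}^n (separable: decide each item alone)."""
    x = (c + lam * a < 0).astype(float)  # take item i iff its "reduced cost" is negative
    return x, c @ x + lam * (a @ x - b)  # optimal x and the lower bound L(lam)

lam, t0, best_bound = 0.0, 1.0, -np.inf
for k in range(100):
    x, L = solve_relaxation(lam)     # inner minimization, cf. re-optimizing (5)
    best_bound = max(best_bound, L)  # keep the best lower bound seen so far
    g = a @ x - b                    # constraint violation = subgradient of L at lam
    t = t0 / (k + 1)                 # diminishing step size guards against divergence
    lam = max(0.0, lam + t * g)      # projected subgradient ascent (lam >= 0)

print(f"best lower bound: {best_bound:.3f}, final lam: {lam:.3f}")
```

Here the inner minimization has a closed form because the Lagrangian separates per item; in a real application that step is whatever solver handles problem (5)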