Authors: Mengdi Wang, Ethan X. Fang, Han Liu
DOI: 10.1007/s10107-016-1017-3
Keywords:
Abstract: Classical stochastic gradient methods are well suited for minimizing expected-value objective functions. However, they do not apply to the minimization of a nonlinear function involving expected values, i.e., a composition of two expected-value functions: the problem $\min_x \mathbf{E}_v\left[ f_v\big(\mathbf{E}_w[g_w(x)]\big)\right]$. In order to solve this problem, we propose a class of stochastic compositional gradient descent (SCGD) algorithms that can be viewed as stochastic versions of the quasi-gradient method. SCGD update the solutions based on noisy sample gradients of $f_v, g_w$ and use an auxiliary variable to track the unknown quantity $\mathbf{E}_w\left[g_w(x)\right]$. We prove that the SCGD converge almost surely to an optimal solution for convex optimization problems, as long as such a solution exists. The convergence involves the interplay of two iterations with different time scales. For nonsmooth convex problems, the SCGD achieves a convergence rate of $\mathcal{O}(k^{-1/4})$ in the general case and $\mathcal{O}(k^{-2/3})$ in the strongly convex case, after taking $k$ samples. For smooth convex problems, the SCGD can be accelerated to converge at a rate of $\mathcal{O}(k^{-2/7})$ in the general case and $\mathcal{O}(k^{-4/5})$ in the strongly convex case. For nonconvex problems, we prove that any limit point generated by SCGD is a stationary point, for which we also provide the convergence rate analysis. Indeed, the stochastic setting where one wants to optimize compositions of expected-value functions is very common in practice. The proposed SCGD methods find wide applications in learning, estimation, dynamic programming, etc.
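To make the two-time-scale update concrete, here is a minimal Python sketch of the basic SCGD iteration on a toy instance. The quadratic outer function, the noisy-identity inner function, and the helper names `g_sample`, `g_jacobian`, and `f_grad` are illustrative assumptions, not taken from the paper; the step-size pair $\alpha_k = k^{-3/4}$, $\beta_k = k^{-1/2}$ is of the kind analyzed for the nonsmooth convex case. The auxiliary variable `y` tracks $\mathbf{E}_w[g_w(x)]$ on the slow time scale, while `x` takes quasi-gradient steps on the fast one.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5

# Toy instance (assumed for illustration): inner g_w(x) = x + w with E[w] = 0,
# outer f(y) = 0.5 * ||y||^2, so E_v[f_v(E_w[g_w(x)])] = 0.5 * ||x||^2,
# which is minimized at x = 0.
def g_sample(x):
    """One noisy sample of the inner function g_w(x)."""
    return x + rng.normal(scale=0.5, size=d)

def g_jacobian(x):
    """Jacobian of g_w at x (the identity for this toy instance)."""
    return np.eye(d)

def f_grad(y):
    """Gradient of the outer function f at y."""
    return y

x = rng.normal(size=d)   # decision variable
y = g_sample(x)          # auxiliary variable tracking E_w[g_w(x)]

for k in range(1, 20001):
    alpha = k ** -0.75   # fast time scale: step size for x
    beta = k ** -0.5     # slow time scale: step size for y
    # Track the inner expectation with a fading average of fresh samples.
    y = (1.0 - beta) * y + beta * g_sample(x)
    # Quasi-gradient step: chain rule with y standing in for E_w[g_w(x)].
    x = x - alpha * g_jacobian(x).T @ f_grad(y)

print("||x|| after 20000 iterations:", np.linalg.norm(x))  # typically near 0
```

The design point the sketch illustrates is that `y`, not a fresh sample of $g_w(x)$, enters the chain rule: a single sample gives a biased estimate of $\nabla f\big(\mathbf{E}_w[g_w(x)]\big)$ because the expectation sits inside the nonlinear outer function, which is exactly why classical stochastic gradient methods do not apply here.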