Citation: BI Shujun, YIN Zhen. Study on A New Nonconvex Surrogate of Zero-Norm Function and Its Application[J]. Journal of South China Normal University (Natural Science Edition), 2023, 55(6): 98-108. DOI: 10.6054/j.jscnun.2023083
Based on the SCAD (Smoothly Clipped Absolute Deviation) function and the elastic net function, a new nonconvex surrogate of the zero-norm function (the EN-SCAD function) is proposed. It is the difference between the elastic net function and a continuously differentiable convex function, and is therefore a DC (difference of convex) function. The EN-SCAD function is then applied to the sparse linear regression problem: an EN-SCAD nonconvex surrogate model is established, and a statistical error bound between any stationary point of the model and the true sparse vector is derived under a suitable restricted strong convexity condition. Next, a multi-stage convex relaxation algorithm is designed for the EN-SCAD nonconvex surrogate model, and statistical error bounds between the iterates generated by the algorithm and the true sparse vector are obtained. Finally, the numerical performance of the algorithm based on the EN-SCAD nonconvex surrogate model is compared with that of a convex relaxation method for the adaptive elastic net. Numerical results show that when the columns of the sampling matrix are strongly correlated, the estimation error produced by the EN-SCAD-based algorithm is smaller than that produced by the convex relaxation method for the adaptive elastic net.
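The abstract does not give the explicit form of the EN-SCAD function. A minimal reconstruction, assuming the standard DC decomposition of the SCAD penalty of Fan and Li, in which the subtracted part q_{λ,a} is convex and continuously differentiable, would be:

```latex
% Hedged reconstruction: the exact EN-SCAD formula is not stated in the abstract.
% The SCAD penalty admits the DC split  p_{\lambda,a}(|t|) = \lambda|t| - q_{\lambda,a}(t),
% so an "elastic net minus smooth convex" surrogate takes the form
\Phi_{\lambda_1,\lambda_2,a}(x)
  = \underbrace{\lambda_1\|x\|_1 + \lambda_2\|x\|_2^2}_{\text{elastic net}}
  - \underbrace{\textstyle\sum_{i=1}^{p} q_{\lambda_1,a}(x_i)}_{\text{convex and } C^1},
\qquad
q_{\lambda,a}(t) =
\begin{cases}
  0, & |t|\le\lambda,\\[2pt]
  \dfrac{(|t|-\lambda)^2}{2(a-1)}, & \lambda<|t|\le a\lambda,\\[2pt]
  \lambda|t| - \dfrac{(a+1)\lambda^2}{2}, & |t|> a\lambda,
\end{cases}
\quad a>2.
```

A multi-stage convex relaxation for such a model can be sketched as follows: each stage linearizes the smooth subtracted part at the current iterate, which amounts to re-solving a weighted elastic net whose l1 weights equal the SCAD derivative at the previous iterate, in the spirit of Zhang's multi-stage convex relaxation. The sketch below is illustrative, not the paper's algorithm; the names multistage_en_scad, weighted_en_prox_grad, and scad_grad, the subproblem solver (ISTA), and the default a = 3.7 are assumptions.

```python
import numpy as np

def scad_grad(t, lam, a=3.7):
    """Derivative p'_{lam,a}(|t|) of the SCAD penalty (Fan and Li, 2001)."""
    t = np.abs(t)
    return np.where(t <= lam, lam,
                    np.where(t <= a * lam, (a * lam - t) / (a - 1.0), 0.0))

def weighted_en_prox_grad(A, b, w, lam2, x0, n_iter=500):
    """Proximal gradient (ISTA) for the weighted elastic net subproblem
    min_x ||Ax - b||^2/(2n) + lam2*||x||_2^2 + sum_i w_i*|x_i|."""
    n = A.shape[0]
    L = np.linalg.norm(A, 2) ** 2 / n + 2.0 * lam2  # Lipschitz constant of the smooth part
    x = x0.copy()
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b) / n + 2.0 * lam2 * x
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - w / L, 0.0)  # coordinatewise soft-threshold
    return x

def multistage_en_scad(A, b, lam1, lam2, a=3.7, n_stages=5):
    """Multi-stage convex relaxation: reweight the l1 term by the SCAD
    derivative at the previous stage's iterate; stage 0 (x = 0) has all
    weights equal to lam1, i.e. a plain elastic net."""
    x = np.zeros(A.shape[1])
    for _ in range(n_stages):
        w = scad_grad(x, lam1, a)
        x = weighted_en_prox_grad(A, b, w, lam2, x)
    return x
```

Because scad_grad vanishes on coordinates whose magnitude already exceeds aλ1, later stages stop penalizing large coefficients, which is the mechanism by which such schemes reduce the estimation bias of the convex first-stage relaxation.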