Abstract
We consider a high-dimensional monotone single-index model (hdSIM), a semiparametric extension of the high-dimensional generalized linear model (hdGLM) in which the link function is unknown but constrained to be monotone non-decreasing. We develop a scalable projection-based iterative approach, the “Sparse Orthogonal Descent Single-Index Model” (SOD-SIM), which alternates between sparse-thresholded orthogonalized “gradient-like” steps and isotonic regression steps to recover the coefficient vector. Our main contribution is a set of finite-sample estimation bounds for both the coefficient vector and the link function in high-dimensional settings, under very mild assumptions on the design matrix X, the error term ε, and their dependence. The convergence rate for the link function matches the low-dimensional isotonic regression minimax rate, n^(-1/3), up to poly-logarithmic terms. The convergence rate for the coefficients is also n^(-1/3) up to poly-logarithmic terms. The method applies to many real-data problems, including GLMs with a mis-specified link, classification with mislabeled data, and classification with positive-unlabeled (PU) data. We study its performance through numerical studies and an application to a PU data example.
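The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' SOD-SIM algorithm: it is a simplified Isotron-style loop that alternates an isotonic-regression fit of the link with a hard-thresholded gradient-like update of the coefficients, and it omits the orthogonalization step. The function names, the sparsity level `s`, the step size, the initialization, and the unit-norm rescaling are all assumptions made for illustration.

```python
import numpy as np


def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to the sequence y."""
    means, counts = [], []
    for v in y:
        means.append(float(v))
        counts.append(1)
        # Merge the last two blocks while they violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            m2, c2 = means.pop(), counts.pop()
            m1, c1 = means.pop(), counts.pop()
            means.append((m1 * c1 + m2 * c2) / (c1 + c2))
            counts.append(c1 + c2)
    return np.repeat(means, counts)


def hard_threshold(v, s):
    """Keep the s entries of v with largest magnitude and zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out


def normalize(v):
    """Rescale v to unit Euclidean norm (left unchanged if it is all zeros)."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def fit_sparse_monotone_sim(X, y, s, step=1.0, n_iter=50):
    """Illustrative alternation (an assumption, not the published SOD-SIM procedure)."""
    n = X.shape[0]
    # Hypothetical initialization: thresholded correlation of X with centered y.
    beta = normalize(hard_threshold(X.T @ (y - y.mean()) / n, s))
    for _ in range(n_iter):
        z = X @ beta
        # Isotonic step: monotone fit of y against the current index z
        # (ties in z are not pooled in this simplified version).
        order = np.argsort(z, kind="stable")
        fitted = np.empty(n)
        fitted[order] = pava(y[order])
        # Gradient-like step followed by sparsification and rescaling.
        beta = normalize(hard_threshold(beta + step * X.T @ (y - fitted) / n, s))
    return beta
```

Calling `fit_sparse_monotone_sim(X, y, s)` on an `n`-by-`p` design returns a unit-norm coefficient estimate with at most `s` nonzero entries; the link estimate at the final iterate is the isotonic fit of `y` against `X @ beta`.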
Original language | English (US) |
---|---|
Pages (from-to) | 4449-4496 |
Number of pages | 48 |
Journal | Electronic Journal of Statistics |
Volume | 16 |
Issue number | 2 |
State | Published - 2022 |
Keywords
- high-dimensional
- isotonic regression
- scalable algorithm
- single-index model
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty