Compute quantile lasso solutions.
quantile_lasso(
x,
y,
tau,
lambda,
weights = NULL,
no_pen_vars = c(),
intercept = TRUE,
standardize = TRUE,
lb = -Inf,
ub = Inf,
noncross = FALSE,
x0 = NULL,
lp_solver = c("glpk", "gurobi"),
time_limit = NULL,
warm_starts = TRUE,
params = list(),
transform = NULL,
inv_trans = NULL,
jitter = NULL,
verbose = FALSE
)
x: Matrix of predictors. If sparse, then passing it as an appropriate sparse Matrix class can greatly help optimization.
y: Vector of responses.
tau, lambda: Vectors of quantile levels and tuning parameter values. If these are not of the same length, the shorter of the two is recycled so that they become the same length. Then, for each i, we solve a separate quantile lasso problem at quantile level tau[i] and tuning parameter value lambda[i]. The most common use cases are: specifying one tau value and a sequence of lambda values; or specifying a sequence of tau values and one lambda value (see the sketch after this argument list).
weights: Vector of observation weights (to be used in the loss function). Default is NULL, which is interpreted as a weight of 1 for each observation.
no_pen_vars: Indices of the variables that should be excluded from the lasso penalty. Default is c(), which means that no variables are excluded.
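As a quick illustration of the recycling behavior described for tau and lambda, here is a minimal sketch on simulated data (the data, lambda values, and seed are made up for illustration; the package defining quantile_lasso is assumed to be attached):

set.seed(1)
n <- 100; p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

# One quantile level, a sequence of tuning parameter values
fit1 <- quantile_lasso(x, y, tau = 0.9, lambda = c(1, 2, 4, 8))

# A sequence of quantile levels, one tuning parameter value (recycled)
fit2 <- quantile_lasso(x, y, tau = c(0.1, 0.5, 0.9), lambda = 2)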
A list with the following components:

beta: Matrix of lasso coefficients, of dimension (number of features + 1) x (number of quantile levels) assuming intercept=TRUE, else (number of features) x (number of quantile levels). Note: these coefficients are always returned on the scale of the original features, even if standardize=TRUE.

status: Vector of status flags returned by Gurobi's or GLPK's LP solver, of length (number of quantile levels).

tau, lambda: Vectors of tau and lambda values used.

The remaining components record the values of the other arguments used in the function call.
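Continuing the hypothetical fit2 above, the returned components can be inspected as follows (component names as listed above):

dim(fit2$beta)   # (p + 1) x 3, since intercept = TRUE by default
fit2$status      # one solver status flag per quantile level
fit2$tau         # quantile levels used (here 0.1, 0.5, 0.9)
fit2$lambda      # lambda recycled to length 3 (here 2, 2, 2)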
This function solves the quantile lasso problem, for each pair of quantile level \(\tau\) and tuning parameter \(\lambda\): $$\mathop{\mathrm{minimize}}_{\beta_0,\beta} \; \sum_{i=1}^n w_i \psi_\tau(y_i-\beta_0-x_i^T\beta) + \lambda \|\beta\|_1$$ for a response vector \(y\) with components \(y_i\), and predictor matrix \(X\) with rows \(x_i\). Here \(\psi_\tau(v) = \max\{\tau v, (\tau-1) v\}\) is the "pinball" or "tilted \(\ell_1\)" loss.

When noncrossing constraints are applied, we instead solve one big joint optimization, over all quantile levels and tuning parameter values: $$\mathop{\mathrm{minimize}}_{\beta_{0k}, \beta_k, k=1,\ldots,r} \; \sum_{k=1}^r \bigg(\sum_{i=1}^n w_i \psi_{\tau_k}(y_i-\beta_{0k}- x_i^T\beta_k) + \lambda_k \|\beta_k\|_1\bigg)$$ $$\mathrm{subject \; to} \;\; \beta_{0k}+x^T\beta_k \leq \beta_{0,k+1}+x^T\beta_{k+1}, \;\; k=1,\ldots,r-1, \; x \in \mathcal{X}$$ where the quantile levels \(\tau_k, k=1,\ldots,r\) are assumed to be in increasing order, and \(\mathcal{X}\) is a collection of points over which to enforce the noncrossing constraints.
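The pinball loss and the (unconstrained) objective are simple to write down directly; the following is an illustrative sketch, not part of the package:

# Pinball ("tilted l1") loss: psi_tau(v) = max(tau * v, (tau - 1) * v)
psi <- function(v, tau) pmax(tau * v, (tau - 1) * v)

# Weighted quantile lasso objective for a given intercept beta0 and
# coefficient vector beta (illustration only)
quantile_lasso_objective <- function(x, y, tau, lambda, beta0, beta,
                                     weights = rep(1, length(y))) {
  res <- y - beta0 - as.numeric(x %*% beta)  # residuals
  sum(weights * psi(res, tau)) + lambda * sum(abs(beta))
}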
Either problem is readily converted into a linear program (LP), and solved using either Gurobi (which is free for academic use, and generally fast) or GLPK (which is free for everyone, but slower).
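The solver is selected via lp_solver; "glpk" is the default (the first element of the lp_solver argument). A hedged sketch, on the assumption that time_limit is passed through to the chosen solver (check the solver documentation for the exact units):

fit <- quantile_lasso(x, y, tau = 0.5, lambda = 1,
                      lp_solver = "gurobi", time_limit = 60)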
All arguments not described above are as in the quantile_genlasso function. The associated coef and predict functions are just those for the quantile_genlasso class.
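A minimal sketch of using those functions on the fits above; the newx argument is assumed to follow the quantile_genlasso interface:

beta_hat <- coef(fit2)            # coefficient matrix, one column per tau
pred <- predict(fit2, newx = x)   # predicted quantiles at the given points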