LASSO

Taking a look at the diabetes data

library(lars)
## Error in library(lars): there is no package called 'lars'
data(diabetes)
## Warning in data(diabetes): data set 'diabetes' not found
summary(diabetes$x)
## Error in summary(diabetes$x): object 'diabetes' not found

The standardization that has occurred here is that each explanatory variable “has been standardized to have unit L2 norm in each column and zero mean”. So

colSums(diabetes$x^2)
## Error in is.data.frame(x): object 'diabetes' not found

Let’s look at the least squares estimates just to get an idea of the magnitude of effects we are expecting to see.

m = lm(y~x, diabetes)
## Error in is.data.frame(data): object 'diabetes' not found
m
## Error in eval(expr, envir, enclos): object 'm' not found

Now we can look at the LASSO trace.

plot(l <- lars(diabetes$x, diabetes$y))
## Error in lars(diabetes$x, diabetes$y): could not find function "lars"

Here the x-axis represents a re-scaled penalty parameter and the y-axis is the estimated coefficient with that penalty. The vertical lines indicate the times when a variable comes in or goes out of the model. Since there are only 10 covariates and 12 vertical lines, one variables comes in and out of the model. This can be verified by printing the lars object (below) where we can see hdl comes into the model at step 4, out at step 11, and back in at step 12.

l
## Error in eval(expr, envir, enclos): object 'l' not found

Bayesian LASSO

Now we take a look at the Bayesian LASSO as implemented in the package monomvn.

library(monomvn)
## Error in library(monomvn): there is no package called 'monomvn'
attach(diabetes)
## Error in attach(diabetes): object 'diabetes' not found
## Ordinary Least Squares regression
reg.ols <- regress(x, y)
## Error in regress(x, y): could not find function "regress"
## Lasso regression with the penalty choosen by leave-one-out cross validation
reg.las <- regress(x, y, method="lasso", validation="LOO")
## Error in regress(x, y, method = "lasso", validation = "LOO"): could not find function "regress"
# Fully Bayesian LASSO
reg.blas <- blasso(x, y, RJ=FALSE)
## Error in blasso(x, y, RJ = FALSE): could not find function "blasso"
## summarize the beta (regression coefficients) estimates
plot(reg.blas, burnin=200)
## Error in plot(reg.blas, burnin = 200): object 'reg.blas' not found
points(drop(reg.las$b), col=2, pch=20)
## Error in drop(reg.las$b): object 'reg.las' not found
points(drop(reg.ols$b), col=3, pch=18)
## Error in drop(reg.ols$b): object 'reg.ols' not found
legend("topleft", c("blasso-map", "lasso", "lsr"),
       col=c(2,2,3), pch=c(21,20,18))
## Error in strwidth(legend, units = "user", cex = cex, font = text.font): plot.new has not been called yet

This plot provides a comparison betweenthe fully Bayesian answer (provided via boxplots) and the LASSO (with the penalty chosen using LOO x-validation) vs OLS estimates.



blog comments powered by Disqus

Published

12 September 2013

Tags