Chapter 10 Regularization Methods

Regularization methods are also known as shrinkage methods. They constrain, or regularize, the coefficient estimates: by imposing a penalty on the size of the coefficients, they shrink the estimates towards zero. Depending on the penalty, regularization can also conduct feature selection intrinsically, which makes it naturally resistant to non-informative predictors. It may not be obvious why shrinking the coefficients improves model performance, but it turns out to be a very effective modeling technique. In this chapter, we will introduce the two best-known regularization methods: ridge regression and lasso. The elastic net combines the ridge and lasso penalties and includes both as special cases.
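To make the idea of a penalty concrete, the three objectives can be sketched for a linear model as below. The notation is an assumption made here for illustration: $y_{i}$ is the response, $x_{ij}$ is the $j$-th predictor for observation $i$, $\beta_{0}, \dots, \beta_{p}$ are the coefficients, and $\lambda, \lambda_{1}, \lambda_{2} \ge 0$ are tuning parameters. Each method minimizes the residual sum of squares plus a penalty term:

$$
\begin{aligned}
\text{ridge:} \quad & \sum_{i=1}^{n}\Big(y_{i}-\beta_{0}-\sum_{j=1}^{p}\beta_{j}x_{ij}\Big)^{2}+\lambda\sum_{j=1}^{p}\beta_{j}^{2} \\
\text{lasso:} \quad & \sum_{i=1}^{n}\Big(y_{i}-\beta_{0}-\sum_{j=1}^{p}\beta_{j}x_{ij}\Big)^{2}+\lambda\sum_{j=1}^{p}|\beta_{j}| \\
\text{elastic net:} \quad & \sum_{i=1}^{n}\Big(y_{i}-\beta_{0}-\sum_{j=1}^{p}\beta_{j}x_{ij}\Big)^{2}+\lambda_{1}\sum_{j=1}^{p}|\beta_{j}|+\lambda_{2}\sum_{j=1}^{p}\beta_{j}^{2}
\end{aligned}
$$

With all tuning parameters set to zero, each objective reduces to ordinary least squares, and larger values shrink the coefficients more strongly. In this form, the elastic net with $\lambda_{1}=0$ is ridge regression and with $\lambda_{2}=0$ is lasso. Software packages may parameterize the elastic net penalty differently.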

We talked about the bias-variance trade-off in section 7.1. The variance of a learning model is the amount by which $\hat{f}$ would change if we estimated it using a different training data set. In general, model variance increases as flexibility increases. Regularization decreases model flexibility by shrinking the coefficients, and hence can significantly reduce the model variance, usually at the cost of some additional bias.
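The variance reduction described above can be checked with a small simulation. The following is a minimal sketch in base R, not the approach used later in the chapter: it refits a linear model on many simulated training sets and measures how much the coefficient estimates vary, with and without a ridge penalty. The sample sizes, the true coefficients, the values of `lambda`, and the helper functions `ridge_fit`, `sim_coefs` and `total_var` are all assumptions made up for this illustration.

```r
# Sketch: how much do coefficient estimates change across training sets?
# lambda = 0 is ordinary least squares; lambda > 0 adds a ridge penalty.

n <- 30                                 # observations per training set (assumed)
p <- 10                                 # number of predictors (assumed)
n_sim <- 200                            # number of simulated training sets (assumed)
beta_true <- c(2, -1, rep(0, p - 2))    # hypothetical true coefficients

# closed-form ridge estimate for a model without intercept
ridge_fit <- function(x, y, lambda) {
    solve(crossprod(x) + lambda * diag(ncol(x)), crossprod(x, y))
}

# refit on n_sim training sets; the seed is reset so that every
# lambda is evaluated on the same simulated data
sim_coefs <- function(lambda) {
    set.seed(100)
    replicate(n_sim, {
        x <- matrix(rnorm(n * p), n, p)
        y <- x %*% beta_true + rnorm(n, sd = 3)
        as.vector(ridge_fit(x, y, lambda))
    })
}

# variance of each coefficient across training sets, summed over coefficients
total_var <- function(coefs) sum(apply(coefs, 1, var))

lambdas <- c(0, 5, 50)
var_by_lambda <- sapply(lambdas, function(l) total_var(sim_coefs(l)))
names(var_by_lambda) <- paste0("lambda=", lambdas)
print(round(var_by_lambda, 3))
```

In this setup the total variance should drop as `lambda` grows. The sketch only looks at variance; a large `lambda` also biases the estimates towards zero, which is why the penalty has to be tuned.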

Load the R packages:

# install any missing packages from CRAN
p_needed <- c('caret', 'elasticnet', 'glmnet', 'devtools',
              'MASS', 'grplasso')

packages <- rownames(installed.packages())
p_to_install <- p_needed[!(p_needed %in% packages)]
if (length(p_to_install) > 0) {
    install.packages(p_to_install)
}

# load the packages; require() returns FALSE with a warning if one fails to load
lapply(p_needed, require, character.only = TRUE)

# install packages from GitHub
p_needed_gh <- c('NetlifyDS')

if (! p_needed_gh %in% packages) {
    devtools::install_github("netlify/NetlifyDS")
}

library(NetlifyDS)