
Variance Inflation Factor

calvlaw92 edited this page Feb 21, 2018 · 1 revision

According to Wikipedia:
In statistics, the variance inflation factor (VIF) is the ratio of variance in a model with multiple terms, divided by the variance of a model with one term alone.[1] It quantifies the severity of multicollinearity in an ordinary least squares regression analysis. It provides an index that measures how much the variance (the square of the estimate's standard deviation) of an estimated regression coefficient is increased because of collinearity.
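For a single predictor, the index in the quote above is usually computed from an auxiliary regression. The notation here is not from the quote: R_j^2 is assumed to mean the R-squared obtained when predictor j is regressed on all the other predictors. A short worked calculation shows where the usual cutoffs of 5 and 10 come from:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}

R_j^2 = 0    \;\Rightarrow\; \mathrm{VIF}_j = \frac{1}{1 - 0}   = 1
R_j^2 = 0.8  \;\Rightarrow\; \mathrm{VIF}_j = \frac{1}{1 - 0.8} = 5
R_j^2 = 0.9  \;\Rightarrow\; \mathrm{VIF}_j = \frac{1}{1 - 0.9} = 10
```

Because R_j^2 lies between 0 and 1, the VIF can never fall below 1.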

What the numbers mean:
if VIF is equal to 1: no multicollinearity among factors (a VIF cannot be less than 1)
if VIF is more than 1: factors may have some correlation
if VIF is more than 5: factors show high correlation
if VIF is more than 10: factors show high correlation that is likely to be problematic; assume that the regression coefficients are poorly estimated due to multicollinearity. Remove highly correlated factors or use principal component analysis.
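As a rough sketch, the rule-of-thumb bands above can be applied to a vector of VIF values in base R. The function name classify_vif and the band labels are made up for illustration and are not part of any package; the cutoffs are the ones listed above.

```r
# Hypothetical helper: map VIF values onto the rule-of-thumb bands.
# Intervals are closed on the right, so exactly 5 counts as "some"
# and exactly 10 counts as "high" ("more than 5", "more than 10").
classify_vif <- function(vif) {
  stopifnot(is.numeric(vif), all(vif >= 1))  # a VIF below 1 is not possible
  bands <- cut(vif,
               breaks = c(-Inf, 1, 5, 10, Inf),
               labels = c("none", "some", "high", "problematic"))
  as.character(bands)
}

print(classify_vif(c(1, 1.8, 6.3, 12.5)))
```

The cutoffs are conventions rather than strict tests, so treat the labels as a prompt to look closer, not as a verdict.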

To do a multicollinearity check:

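  1. install the package called usdm: install.packages("usdm")
  2. load it: library(usdm)
  3. run vif(yourdataset), where yourdataset is a data frame that holds only the numeric predictor variables (leave out the response)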
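To see what the check reports, here is a minimal sketch on simulated data. The variable names (x1, x2, x3, predictors) and the helper manual_vif are invented for this example. The helper computes each VIF in base R as 1 / (1 - R^2) from regressing one predictor on the others. The usdm call at the end runs only if the package is installed, and it is expected to report values close to the manual ones.

```r
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)  # nearly a copy of x1, so strongly collinear
x3 <- rnorm(n)                 # unrelated to the other two
predictors <- data.frame(x1, x2, x3)

# Hypothetical helper: VIF of each column from an auxiliary regression
# of that column on all the other columns.
manual_vif <- function(df) {
  sapply(names(df), function(v) {
    others <- setdiff(names(df), v)
    fit <- lm(reformulate(others, response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

vifs <- manual_vif(predictors)
print(round(vifs, 2))

# Same check with the usdm package, if it is available.
if (requireNamespace("usdm", quietly = TRUE)) {
  print(usdm::vif(predictors))
}
```

In this setup x1 and x2 should come out far above 10 while x3 stays near 1, which matches the interpretation of the numbers given earlier.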

To understand it better:
http://blog.minitab.com/blog/understanding-statistics/handling-multicollinearity-in-regression-analysis
