e-TA 4: Model Selection and Information Criteria
Welcome to e-Tutorial, your on-line help to Econ508. This issue provides an introduction to model selection in Econometrics, focusing on Akaike (AIC) and Schwarz (SIC) Information Criteria.1
Data Set
The data set used in this tutorial was borrowed from Johnston and DiNardo’s Econometric Methods (1997, 4th ed), but slightly adjusted for your needs. It is called AUTO2. You can download the data by visiting the Econ 508 web site (Data). As you will see, this adapted data set contains five series.
auto<-read.table("http://www.econ.uiuc.edu/~econ508/data/AUTO2.txt",header=T)
head(auto)
quarter gas price income miles
1 1959 -8.015 4.676 -4.505 2.648
2 1959 -8.011 4.691 -4.493 2.648
3 1959 -8.020 4.689 -4.499 2.648
4 1959 -8.013 4.722 -4.492 2.648
5 1960 -8.017 4.707 -4.490 2.647
6 1960 -7.976 4.699 -4.489 2.647
As we did before we need to transform the data in “time series” first:
library(dyn)
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
gas<-ts(auto$gas,start=1959,frequency=4)
price<-ts(auto$price,start=1959,frequency=4)
income<-ts(auto$income,start=1959,frequency=4)
miles<-ts(auto$miles,start=1959,frequency=4)
Running a Generic Dynamic Models
In the PS2, question 1, for that specific data set (which is different than the one used here) you are asked to run a simple dynamic model in the following autorregressive distributed lag form:
\[gas = a_{0} + a_{1} gas_{t-1} + a_{2} \Delta gas_{t-1} + a_{3} price + a_{4} \Delta price + a_{5} \Delta price_{t-1} + a_{6} income + a_{7} \Delta income + a_{8} \Delta income_{t-1} + \epsilon\]
f<-dyn$lm(gas~lag(gas,-1)+lag(diff(gas),-1)+price+diff(price)+lag(diff(price),-1)+income+diff(income)+lag(diff(income),-1))
summary(f)
Call:
lm(formula = dyn(gas ~ lag(gas, -1) + lag(diff(gas), -1) + price +
diff(price) + lag(diff(price), -1) + income + diff(income) +
lag(diff(income), -1)))
Residuals:
Min 1Q Median 3Q Max
-0.07405 -0.00755 0.00048 0.00775 0.04586
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.09297 0.12155 -0.76 0.446
lag(gas, -1) 0.97219 0.02540 38.27 < 2e-16 ***
lag(diff(gas), -1) -0.17881 0.09073 -1.97 0.051 .
price -0.01830 0.00962 -1.90 0.060 .
diff(price) -0.23593 0.03734 -6.32 4.9e-09 ***
lag(diff(price), -1) 0.05841 0.04459 1.31 0.193
income 0.00828 0.02052 0.40 0.687
diff(income) 0.27223 0.15497 1.76 0.082 .
lag(diff(income), -1) 0.04469 0.15529 0.29 0.774
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.0156 on 117 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-squared: 0.983, Adjusted R-squared: 0.982
F-statistic: 863 on 8 and 117 DF, p-value: <2e-16
The model above is your benchmark. You should now start your model selection process.
Even when there exist commands to calculate the Akaike or the Schwarz criterion, in Econ 508 it is recommended that you compute them by hand, as taught in class, using the formulae given in Prof. Koenker’s Lecture Note 4:
\[AIC=log(\hat{\sigma_j}^2)+\frac{p_i}{n}*2\] \[SIC=log(\hat{\sigma_j}^2)+\frac{p_i}{n}*log(n)\]
In order to calculate the sample size, one could sum the degrees of freedom and the number of estimated parameters, so we have the following:
sample.size<-f$df + length(f$coeff)
Finally:
aic<-log(sum(resid(f)^2)/sample.size)+(length(f$coeff)/sample.size)*2
aic
[1] -8.253
bic<-log(sum(resid(f)^2)/sample.size)+(length(f$coeff)/sample.size)*log(sample.size)
bic
[1] -8.051
Next, we just need to compute variations of this model and the respective aic, sic, or both.
To do so it may be best to build your own AIC and BIC function:
aic<-function(model){
n<-model$df + length(model$coeff)
log(sum(resid(model)^2)/n)+(length(model$coeff)/n)*2
}
aic(f)
[1] -8.253
bic<-function(model){
n<-model$df + length(model$coeff)
log(sum(resid(model)^2)/n)+(length(model$coeff)/n)*log(n)
}
bic(f)
[1] -8.051
Next we specify the models we want to assess
m1 <- gas ~ lag(gas,-1)
m2 <- gas ~ lag(gas,-1) + price
m3 <- gas ~ lag(gas,-1) + price + diff(gas)
m4 <- gas ~ lag(gas,-1) + price + diff(gas) + lag(diff(gas),-1)
And then we build a loop that calculates and saves the AIC and BIC results in a matrix
M <- c(m1,m2,m3,m4)
A <- array(0,c(4,2))
for(i in 1:4){
g <- dyn$lm(M[[i]])
A[i,1] <- aic(g)
A[i,2] <- bic(g)
}
then we check which model minimizes the criterion functions. The which.min()
function “determines the location, i.e., index of the (first) minimum or maximum of a numeric (or logical) vector.”(see help for which.min
function)
A
[,1] [,2]
[1,] -7.959 -7.914
[2,] -7.975 -7.908
[3,] -68.422 -68.332
[4,] -69.066 -68.954
aic_min<-which.min(A[,1])
bic_min<-which.min(A[,2])
model_aic<-dyn$lm(eval(parse(text=paste("m",aic_min,sep=""))))
summary(model_aic)
Warning: essentially perfect fit: summary may be unreliable
Call:
lm(formula = dyn(eval(parse(text = paste("m", aic_min, sep = "")))))
Residuals:
Min 1Q Median 3Q Max
-1.06e-14 -5.50e-17 4.00e-18 2.01e-16 5.91e-16
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.14e-14 6.32e-15 1.80e+00 0.074 .
lag(gas, -1) 1.00e+00 7.57e-16 1.32e+15 <2e-16 ***
price 2.27e-16 5.53e-16 4.10e-01 0.683
diff(gas) 1.00e+00 4.87e-15 2.05e+14 <2e-16 ***
lag(diff(gas), -1) 7.05e-16 4.84e-15 1.50e-01 0.884
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 9.86e-16 on 121 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 4.39e+29 on 4 and 121 DF, p-value: <2e-16
model_bic<-dyn$lm(eval(parse(text=paste("m",bic_min,sep=""))))
summary(model_bic)
Warning: essentially perfect fit: summary may be unreliable
Call:
lm(formula = dyn(eval(parse(text = paste("m", bic_min, sep = "")))))
Residuals:
Min 1Q Median 3Q Max
-1.06e-14 -5.50e-17 4.00e-18 2.01e-16 5.91e-16
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.14e-14 6.32e-15 1.80e+00 0.074 .
lag(gas, -1) 1.00e+00 7.57e-16 1.32e+15 <2e-16 ***
price 2.27e-16 5.53e-16 4.10e-01 0.683
diff(gas) 1.00e+00 4.87e-15 2.05e+14 <2e-16 ***
lag(diff(gas), -1) 7.05e-16 4.84e-15 1.50e-01 0.884
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 9.86e-16 on 121 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 4.39e+29 on 4 and 121 DF, p-value: <2e-16
The last lines run the model for which the AIC and BIC are minimized.
As we previously mentioned most of functions are already programmed in R, that is the case with AIC and BIC. To obtain them you need to estimate a model and then call the function
n <- length(fitted(f))
AIC(f,k=2)
[1] -680.3
BIC(f)
[1] -652
Note that the number given by the AIC/BIC built in function is different than the programmed by ourselves, to check why you should check how the built in function was created. You can do that by typing ?AIC
.
Please send comments to bottan2@illinois.edu or srmntbr2@illinois.edu↩