Model Evaluation makes a model more reliable

Part I: Model Estimation

Part II: Model Diagnostics

Part III: Model Improvement

Part IV: Model Evaluation

In the previous post, we have estimated a model, and then diagnosed the problems of the model, and finally obtained the improved the model. In this post, we will investigate the performances of the model through model evaluation. Model evaluation denotes the general process to assess the performance of a model, whether it is good or bad to present the dateset, with different evaluation indices or metrics. In general, it includes the evaluation of the model fitting, model validation and model testing. In our example, or for a classic regression model, the model validation and model testing mean similar thing, which refer to prediction performance test of the model to new dataset, i.e. testing data.

### 1. Evaluation of Model Fitting

It is very handy to use `statsmodels.regression.linear_model.OLS.predict`

to display the model fitting values and model prediction values. Our final improved model was developed by dropping several regressors from the `X_train`

dataset, i.e. `X_train_drop`

dataset.

y_fit = res.predict(X_train_drop)

The OSL regression results have already showed us numerous statistic matrices, such as *R²*, ????????????. R². We can display them separately here.

print("R-squared: ",res.rsquared)

print("Adj. R-squared: ",res.rsquared_adj)

Besides, we use some other metrices, Mean Absolute Error(*MAE*), Mean Squared Error (*MSE*), Root Mean Squared Error(*RMSE*) and Mean Absolute Percentage Error(*MAPE*) to assess the model fitting and model prediction performances.

Statsmodels has several built-in functions to calculate most of these metrics in its `statsmodels.tools.eval_measures`

module. Unfortunately, we have to calculate `MAPE` ourselves because it is not included in Statsmodels.

from statsmodels.tools.eval_measures import meanabs,mse,rmse

import numpy as np

print("mean_absolute_error(MAE): ", meanabs(y_train, y_fit))

print("mean_squared_error(MSE)",mse(y_train, y_fit))

print("root_mean_squared_error(RMSE)",rmse(y_train, y_fit))

print ("mean_absolute_percentage_error(MAPE): ",np.mean((abs(y_train-y_fit))/y_train))

From all the above results, we can conclude that the model has a good performance to fit the training data.

### 2. Evaluation of Model Prediction

Let’s evaluate the model forecasting ability on testing dataset.

First, we predict the testing target, i.e. `y_test`

.

X_test_drop = X_test.drop(['pop','prov_gd','year','fexpen','uinc'],axis=1)

X_test_drop = sm.add_constant(X_test_drop)

y_pred = res.predict(X_test_drop)

Then, we calculate MAE, MSE, RMSE and MAPE of the testing.

print("mean_absolute_error(MAE): ", meanabs(y_test,y_pred))

print("mean_squared_error(MSE): ", mse(y_test,y_pred))

print("root_mean_squared_error(RMSE): ",rmse(y_test,y_pred))

print ("mean_absolute_percentage_error(MAPE): ",np.mean((abs(y_test-y_pred))/y_test))

The testing results above shows that the model has very good prediction performance. The *MAPE* is only 8.4%.

### 3. Comparing Prediction Results

Next, let’s display the statistic metrics in Pandas’ DataFrame.

actual_pred_compare = pd.DataFrame({'Acutal values':y_test,

'Predicted values':y_pred,

'Errors':y_test-y_pred,

'MAE':meanabs(y_test,y_pred),

'MSE':mse(y_test,y_pred),

'RMSE':rmse(y_test,y_pred),

'MAPE':np.mean((abs(y_test-y_pred))/y_test)})

actual_pred_compare

### 4. Model Results Visualization

Visualizing the model results can help use easily to assess how the model performance.

#### (1) Visualize the fitting results

In the statsmodels, it provides some beautiful plots. First, let’s plot gdp observation data, fitted value and residual (error) bar.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12, 8))

fig = sm.graphics.plot_fit(res, 1, ax=ax) # or use the variable name, such as 'finv'

ax.set_ylabel('gdp')

ax.set_xlabel('finv')

ax.set_title("Linear Regression")

#### (2) Visualize the fitting and testing results

Now, we plot dependent variable vs. independent variables in terms of observations, fitting, and prediction values. Due to multiple independent variables, we can only plot one pairs per time in a 2D plotting system, suh as plotting ** gdp** vs.

**, for example as follows.**

*finv*

fig, axes = plt.subplots(1,2,figsize=(15,7))

axes[0].plot(X_train_drop.finv, y_train, "o", label="Train Data")

axes[0].plot(X_train_drop.finv, y_fit, "o", label="OLS Fitting Values")

axes[0].set_ylabel('gdp')

axes[0].set_xlabel('finv')

axes[0].legend(loc="best")

axes[1].plot(X_test_drop.finv, y_test,"go", label="Test Data")

axes[1].plot(X_test_drop.finv, y_pred,"ro", label="OLS Prediction")

axes[1].set_xlabel('finv')

axes[1].legend(loc="best")

plt.tight_layout()

In the third example, we plot ** finv** vs.

**+ Residual.**

*finv*

fig, ax = plt.subplots(figsize=(12, 8))

fig = sm.graphics.plot_ccpr(res, “finv”, ax=ax)