Data Description

Data source: https://www.kaggle.com/harlfoxem/housesalesprediction

Load the libraries

Useful Functions

Parameters

Load the Data

Pandas Profiling

Generating the profile report takes a long time, so run it from a separate script rather than inside the notebook.

Sweetviz

Generating the Sweetviz report also takes a long time, so run it once from a separate script.

Modelling: Random Forest
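
A minimal sketch of this step, assuming synthetic data in place of the house sales table. The feature matrix, the hyperparameters, and all variable names are illustrative assumptions, not the notebook's own settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the house features and the price target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the forest on the training split only.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Score the held-out rows.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
print(f"Test R^2: {r2:.3f}")
```

With the real data, `X` and `y` would come from the loaded DataFrame instead of the random generator.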

Yellowbrick Visualization

Prediction Error vs Truth

Random Forest Confidence Interval
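
One rough way to see how uncertain a forest is about a prediction is to look at how much its individual trees disagree. The sketch below does that on synthetic data. It is a simple percentile spread of per-tree predictions, not a calibrated confidence interval, and it is not necessarily the method this notebook uses; the data and variable names are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data (assumption; stands in for the house sales data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=300)

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Rows to explain; here simply the first five training rows.
X_new = X[:5]

# One row of predictions per tree: shape (n_trees, n_samples).
per_tree = np.stack([tree.predict(X_new) for tree in forest.estimators_])

# The forest's prediction is the mean over trees.
mean_pred = per_tree.mean(axis=0)

# 2.5th and 97.5th percentiles of the per-tree predictions for each row.
lower, upper = np.percentile(per_tree, [2.5, 97.5], axis=0)

for m, lo, hi in zip(mean_pred, lower, upper):
    print(f"prediction {m:.2f}, tree spread [{lo:.2f}, {hi:.2f}]")
```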

References:

Model Explanation Using LIME

Model Interpretation Using ELI5

Feature Importances
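
As a small illustration, a fitted scikit-learn forest exposes impurity-based importances through its `feature_importances_` attribute. The data and the feature names below are made up for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data where feature_0 dominates the target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)
names = ["feature_0", "feature_1", "feature_2"]  # hypothetical column names

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized to sum to 1.
importances = model.feature_importances_

# Sort from most to least important.
ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
for name, value in ranked:
    print(f"{name}: {value:.3f}")
```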

ELI5's Permutation Importance on the same features
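
Permutation importance shuffles one feature at a time and measures how much the model's score drops. The sketch below uses scikit-learn's `permutation_importance` as a stand-in, not ELI5's own implementation, and runs on synthetic data; all names are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data in which only feature 0 drives the target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 5.0 * X[:, 0] + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# Shuffle each feature 10 times on held-out data and record the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Report features from largest to smallest mean drop in score.
for i in np.argsort(result.importances_mean)[::-1]:
    mean = result.importances_mean[i]
    std = result.importances_std[i]
    print(f"feature_{i}: {mean:.3f} +/- {std:.3f}")
```

`result.importances` holds the raw per-repeat values, with one row per feature and one column per repeat.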

Feature importance as a box plot

Weights of a tree in a small forest

sklearn Random Forest plot tree using graphviz
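
A minimal sketch of exporting one tree of a forest to Graphviz DOT text with `sklearn.tree.export_graphviz`. The tiny forest, the synthetic data, and the feature names are assumptions chosen to keep the tree readable.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_graphviz

# Small synthetic problem and a shallow forest so the tree stays readable.
X, y = make_regression(n_samples=200, n_features=4, random_state=0)
forest = RandomForestRegressor(
    n_estimators=5, max_depth=3, random_state=0
).fit(X, y)

# Hypothetical column names for the plot labels.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# out_file=None makes export_graphviz return the DOT source as a string.
dot_source = export_graphviz(
    forest.estimators_[0],  # first tree of the forest
    out_file=None,
    feature_names=feature_names,
    filled=True,
    rounded=True,
)

print(dot_source[:200])
```

The returned string can be written to a `.dot` file and rendered with the Graphviz `dot` command line tool, or passed to `graphviz.Source` if the `graphviz` Python package is installed.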