Data Description

This dataset contains house sale prices for King County, Washington, which includes Seattle. It covers homes sold between May 2014 and May 2015.

Task: Predict the sale price from the given features.

Imports

Important Scripts

Parameters

Load the data

Data Processing

Log transform Target
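
A minimal sketch of this step, assuming the target is a NumPy array of sale prices. The values below are made up for illustration and are not taken from the dataset. `np.log1p` compresses the right-skewed price distribution, and `np.expm1` maps values back to the original price scale.

```python
import numpy as np

# Hypothetical sale prices, for illustration only.
prices = np.array([221900.0, 538000.0, 180000.0, 1225000.0])

# log1p computes log(1 + x); it is defined at zero and reduces right skew.
log_prices = np.log1p(prices)

# expm1 is the inverse of log1p; use it to convert predictions back to prices.
restored = np.expm1(log_prices)
```

If the model is trained on the log target, its predictions need the same `np.expm1` step before they are compared with actual prices.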

Train-target split

Train-Validation Split
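
One common way to make this split is scikit-learn's `train_test_split`. The sketch below assumes that library is used and runs on synthetic arrays; the 80/20 ratio and the seed are illustrative choices, not values from the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the feature matrix and the (log) target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.normal(size=100)

# Hold out 20% of the rows for validation; fix the seed for reproducibility.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```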

Scaling
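
A possible sketch of the scaling step, assuming scikit-learn's `StandardScaler` (the notebook's actual scaler is not shown). The scaler is fitted on the training rows only and then reused on the validation rows, so no validation statistics leak into training.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the train and validation feature matrices.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(80, 3))
X_valid = rng.normal(loc=50.0, scale=10.0, size=(20, 3))

scaler = StandardScaler()

# Learn mean and standard deviation from the training data only.
X_train_scaled = scaler.fit_transform(X_train)

# Apply the training statistics to the validation data.
X_valid_scaled = scaler.transform(X_valid)
```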

Modelling: CatBoost

class CatBoostRegressor(
    iterations=None,
    learning_rate=None,
    loss_function='RMSE',
    use_best_model=None,
    verbose=None,
    silent=None,
    logging_level=None,
    one_hot_max_size=None,
    ignored_features=None,
    train_dir=None,
    custom_metric=None,
    eval_metric=None,
    subsample=None,
    max_depth=None,
    n_estimators=None,
    num_boost_round=None,
    num_trees=None,
    colsample_bylevel=None,
    random_state=None,
    reg_lambda=None,
    objective=None,
    eta=None,
    max_bin=None,
    early_stopping_rounds=None,
    cat_features=None,
    min_child_samples=None,
    max_leaves=None,
    num_leaves=None,
    score_function=None,
)

Best model

Feature Statistics

Feature Importance

Metric Visualizer (only works in the classic Jupyter Notebook, not in JupyterLab or Colab)

Eval Metrics (needs Pool data)

Model Persistence

Remove Cache