Data Description

This dataset contains house sale prices for King County, Washington, which includes Seattle. It covers homes sold between May 2014 and May 2015.

Task: Estimate the sale price of a house from the given features.

Imports

Important Scripts

Parameters

Load the data

Data Processing

Train target split
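
The split itself is not shown here, so the following is only a minimal sketch of separating the target from the features with pandas. It assumes the target column is named `price`; the toy rows and the feature columns `bedrooms` and `sqft_living` are made-up stand-ins for the real data.

```python
import pandas as pd

# Toy rows standing in for the real dataset; the values are invented for illustration.
df = pd.DataFrame(
    {
        "price": [300000.0, 450000.0, 250000.0],
        "bedrooms": [3, 4, 2],
        "sqft_living": [1500, 2400, 900],
    }
)

target = "price"  # assumed name of the target column

# Features are every column except the target; the target is kept as a Series.
X = df.drop(columns=[target])
y = df[target]

print(X.shape, y.shape)
```

Dropping the target by name, rather than by position, keeps the split correct if the column order of the file changes.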

Modelling: Histogram Gradient Boosting

parameters
------------
early_stopping: 'auto' or bool (default='auto')
If 'auto', early stopping is enabled if the sample size is larger than 10000. If True, early stopping is enabled; otherwise early stopping is disabled.

scoring: str or callable or None, optional (default='loss')
Scoring parameter to use for early stopping. It can be a single string (see "The scoring parameter: defining model evaluation rules" in the scikit-learn user guide) or a callable (see "Defining your scoring strategy from metric functions" in the same guide). If None, the estimator's default scorer is used. If scoring='loss', early stopping is checked with respect to the loss value. Only used if early stopping is performed.

using pipeline

use early stopping

Cross Validation Results