Modelling Customer Churn using LogisticRegressionCV

References

Some estimators can be fitted over a range of values of a hyperparameter almost as efficiently as for a single value, which makes cross-validated selection of that hyperparameter cheap. These models include:

linear_model.LogisticRegressionCV
linear_model.RidgeCV
linear_model.RidgeClassifierCV
linear_model.ElasticNetCV
linear_model.LarsCV
linear_model.LassoCV
linear_model.LassoLarsCV
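
As a rough illustration of the point above, the sketch below fits LogisticRegressionCV once and lets it search several values of C internally. The synthetic data from make_classification is an assumption standing in for the churn dataset, and the settings (Cs=5, cv=5) are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

# Synthetic, imbalanced binary data standing in for the churn dataset (assumption).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)

# A single fit call evaluates every candidate C with 5-fold cross-validation
# and then refits on the full data using the best one (refit=True by default).
model = LogisticRegressionCV(Cs=5, cv=5, max_iter=1000, random_state=0)
model.fit(X, y)

candidate_Cs = np.asarray(model.Cs_)
selected_C = float(np.ravel(model.C_)[0])

print("candidate C values:", candidate_Cs)
print("selected C:", selected_C)
```

The selected value is always one of the candidates in Cs_, so inspecting both is a quick check on whether the grid is wide enough.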

Load the libraries

Colab

Useful Scripts

Load the Data

Data Processing

Oversampling: SMOTE

Scaling Numerical Features (Yeo-Johnson)
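
A minimal sketch of Yeo-Johnson scaling with scikit-learn's PowerTransformer is shown here. The skewed columns are generated at random and are purely hypothetical; the notebook's actual numerical features are not reproduced.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)

# Hypothetical right-skewed numeric columns; not the real churn features.
X_num = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 2))

# Yeo-Johnson also accepts zero and negative values (unlike Box-Cox).
# standardize=True (the default) rescales the output to zero mean, unit variance.
pt = PowerTransformer(method="yeo-johnson", standardize=True)
X_scaled = pt.fit_transform(X_num)

print("fitted lambdas:", pt.lambdas_)
print("column means:", X_scaled.mean(axis=0))
print("column stds:", X_scaled.std(axis=0))
```

In practice the transformer would be fitted on the training split only and then applied to the test split with pt.transform, so that no test information leaks into the fitted lambdas.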

Modelling: LogisticRegressionCV

LogisticRegressionCV(
    *,
    Cs                = 10, # an int n means n values on a log scale between 1e-4 and 1e4
    fit_intercept     = True,
    cv                = None,
    dual              = False,
    penalty           = 'l2',
    scoring           = None,
    solver            = 'lbfgs',
    tol               = 0.0001,
    max_iter          = 100,
    class_weight      = None,
    n_jobs            = None,
    verbose           = 0,
    refit             = True,
    intercept_scaling = 1.0,
    multi_class       = 'auto',
    random_state      = None,
    l1_ratios         = None, # only used for elasticnet with saga
)
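
To make the Cs comment concrete, the sketch below builds the grid that an integer Cs=10 corresponds to, using NumPy directly. The custom grid at the end is a hypothetical example of passing explicit values instead of an integer.

```python
import numpy as np

# Cs=10 corresponds to ten values evenly spaced on a log scale from 1e-4 to 1e4.
Cs_grid = np.logspace(-4, 4, 10)
print(Cs_grid)

# C is the inverse of the regularization strength, so the grid runs from
# very strong regularization (1e-4) to almost none (1e4).

# Hypothetical alternative: Cs also accepts an explicit list of values,
# e.g. a narrower grid around a region of interest.
custom_Cs = [0.01, 0.1, 1.0, 10.0]
```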

solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, default='lbfgs'
Algorithm to use in the optimization problem.

For small datasets, liblinear is a good choice, whereas sag and saga are faster for large ones.

For multiclass problems, only newton-cg, sag, saga and lbfgs handle multinomial loss; liblinear is limited to one-versus-rest schemes.

newton-cg, lbfgs and sag only handle the L2 penalty, whereas liblinear and saga also handle the L1 penalty. The elastic-net penalty is supported only by saga.

liblinear might be slower in LogisticRegressionCV because it does not handle warm-starting.
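
The solver and penalty pairings above can be sketched as follows. The data is synthetic (an assumption), the grid sizes are illustrative, and the penalty argument is used as in the signature shown earlier; newer scikit-learn releases may emit deprecation warnings for it.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler

# Synthetic binary data standing in for the churn dataset (assumption).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X = StandardScaler().fit_transform(X)  # saga converges faster on scaled features

# L1 penalty: requires liblinear or saga.
l1_model = LogisticRegressionCV(
    Cs=5, cv=3, penalty="l1", solver="saga", max_iter=5000, random_state=0
).fit(X, y)

# Elastic net: requires saga, and l1_ratios must be given.
en_model = LogisticRegressionCV(
    Cs=5, cv=3, penalty="elasticnet", solver="saga",
    l1_ratios=[0.2, 0.8], max_iter=5000, random_state=0,
).fit(X, y)

l1_accuracy = l1_model.score(X, y)
en_accuracy = en_model.score(X, y)
print("L1 training accuracy:", l1_accuracy)
print("elastic-net training accuracy:", en_accuracy)
```

These are training-set accuracies, shown only to confirm that both configurations fit. They are not a substitute for evaluation on held-out data.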

Model Evaluation

Time Taken
