Description

This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion gifts. Many of its customers are wholesalers.

Feature - Description
InvoiceNo - Invoice number. Nominal, a 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
StockCode - Product (item) code. Nominal, a 5-digit integral number uniquely assigned to each distinct product.
Description - Product (item) name. Nominal.
Quantity - The quantity of each product (item) per transaction. Numeric.
InvoiceDate - Invoice date and time. Numeric, the day and time when each transaction was generated.
UnitPrice - Unit price. Numeric, product price per unit in sterling.
CustomerID - Customer number. Nominal, a 5-digit integral number uniquely assigned to each customer.
Country - Country name. Nominal, the name of the country where each customer resides.

Load the libraries

Read the clean data

Method 01: Aggregate Model

The General Formula for calculating CLV is:


CLV = ((Average Sales X Purchase Frequency) / Churn) X Profit Margin

Where,
Average Sales = Total sales / Total no. of orders
Purchase Frequency = Total no. of orders / Total unique customers
Retention rate = No. of customers with more than 1 order / Total unique customers
Churn = 1 - Retention rate
Profit Margin = Based on business context (take 5% if not given)
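
As a worked illustration of the formula above, here is a minimal sketch on a made-up order table. The data, the variable names, and the reading of the retention rate as "share of customers with more than one order" are assumptions for illustration, not values from the retail data set.

```python
# Toy orders: (customer_id, order_id, order_value). All values are invented.
orders = [
    ("A", 1, 100.0),
    ("A", 2, 50.0),
    ("B", 3, 200.0),
    ("C", 4, 30.0),
    ("C", 5, 20.0),
    ("D", 6, 100.0),
]

PROFIT_MARGIN = 0.05  # the 5% default mentioned above

# Count orders per customer.
orders_per_customer = {}
for customer_id, _, _ in orders:
    orders_per_customer[customer_id] = orders_per_customer.get(customer_id, 0) + 1

total_sales = sum(value for _, _, value in orders)
n_orders = len(orders)
n_customers = len(orders_per_customer)
n_repeat_customers = sum(1 for n in orders_per_customer.values() if n > 1)

average_sales = total_sales / n_orders            # sales per order
purchase_frequency = n_orders / n_customers       # orders per customer
retention_rate = n_repeat_customers / n_customers
churn = 1 - retention_rate

clv = (average_sales * purchase_frequency / churn) * PROFIT_MARGIN
print(f"CLV per customer: {clv:.2f}")
```

Note that the formula is undefined when churn is 0 (every customer has more than one order), so a real pipeline would need to guard against that case.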

Aggregate the data per customer

Calculate CLV variables

Method 02: Cohort Model

Method 03: BG/NBD Model + Gamma-Gamma Model

BG/NBD is one of the most widely used models for this task, alongside the Pareto/NBD model.

Both models try to predict the future transactions of each customer.

To estimate the monetary value of those transactions, we use the Gamma-Gamma model.

Assumptions of the BG/NBD model:

  1. While a user is active, the number of transactions in a time period of length t follows a Poisson distribution with rate lambda.
  2. Heterogeneity in transaction rate across users (differences in purchasing behavior across users) follows a Gamma distribution with shape parameter r and scale parameter alpha.
  3. A user may become inactive after any transaction with probability p, so the dropout point is distributed across purchases according to a Geometric distribution.
  4. Heterogeneity in dropout probability follows a Beta distribution with the two shape parameters a and b.
  5. Transaction rate and dropout probability vary independently across users.
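
The assumptions above describe a generative process, which can be sketched as a small simulation. This is only an illustration using Python's standard library: the parameter names r, alpha (Gamma) and a, b (Beta) follow the usual BG/NBD notation, the numeric values are hypothetical, and alpha is treated here as a rate (so the Gamma scale is 1/alpha), which is an assumption of this sketch.

```python
import random


def simulate_customer(r, alpha, a, b, T, rng):
    """Simulate one customer's repeat-purchase times in (0, T].

    The customer is assumed active at time 0 (their first purchase).
    """
    # Assumption 2: transaction rate lambda ~ Gamma(shape=r, rate=alpha).
    lam = rng.gammavariate(r, 1.0 / alpha)
    # Assumption 4: dropout probability p ~ Beta(a, b), drawn independently
    # of lambda (assumption 5).
    p = rng.betavariate(a, b)

    t = 0.0
    purchases = []
    while True:
        # Assumption 1: Poisson purchasing means exponential waiting times.
        t += rng.expovariate(lam)
        if t > T:
            break
        purchases.append(t)
        # Assumption 3: after each transaction, drop out with probability p.
        if rng.random() < p:
            break
    return purchases


rng = random.Random(42)
# Hypothetical parameter values, chosen only for illustration.
histories = [
    simulate_customer(r=0.25, alpha=4.0, a=0.8, b=2.5, T=39.0, rng=rng)
    for _ in range(1000)
]
frequencies = [len(h) for h in histories]
print("share with no repeat purchase:", frequencies.count(0) / len(frequencies))
```

Fitting the model runs this logic in reverse: given each customer's observed frequency, recency and T, it estimates the four population parameters.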

RFM Analysis:

frequency - the number of repeat purchases, i.e. the number of purchases made after the first one
recency - the time between the first and the last transaction
T - the time between the first purchase and the end of the transaction period
monetary_value - the mean of a given customer's sales values
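
The four quantities above can be computed from a raw transaction log roughly as follows. This is a minimal sketch with invented data: the `rfm_summary` helper is hypothetical, time is measured in days, and purchases on the same day are counted as one purchase occasion, all of which are assumptions. Some implementations exclude the first purchase from the monetary mean; this sketch follows the definition as written above.

```python
from datetime import date

# Toy transaction log: (customer_id, purchase_date, sales_value). Invented data.
transactions = [
    ("A", date(2011, 1, 1), 100.0),
    ("A", date(2011, 1, 11), 50.0),
    ("A", date(2011, 1, 31), 150.0),
    ("B", date(2011, 1, 21), 80.0),
]
observation_end = date(2011, 2, 10)


def rfm_summary(transactions, observation_end):
    """Return {customer_id: {frequency, recency, T, monetary_value}}."""
    by_customer = {}
    for customer_id, day, value in transactions:
        by_customer.setdefault(customer_id, []).append((day, value))

    summary = {}
    for customer_id, rows in by_customer.items():
        days = sorted({day for day, _ in rows})  # distinct purchase days
        first, last = days[0], days[-1]
        summary[customer_id] = {
            "frequency": len(days) - 1,                  # repeat purchases
            "recency": (last - first).days,              # first to last purchase
            "T": (observation_end - first).days,         # first purchase to period end
            "monetary_value": sum(v for _, v in rows) / len(rows),
        }
    return summary


summary = rfm_summary(transactions, observation_end)
for customer_id, row in summary.items():
    print(customer_id, row)
```

Customer B has a single purchase, so both frequency and recency are 0 while T is still positive.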

Get RFM summary data

BG/NBD Fitting to get expected number of purchases

Customer alive probability

Calculate the expected number of purchases up to a given time

Gamma-Gamma Modelling for Monetary Value

Assumptions of the Gamma-Gamma model are:

  1. The monetary value of a customer's given transaction varies randomly around their average transaction value.
  2. Average transaction value varies across customers but does not vary over time for any given customer.
  3. The distribution of average transaction values across customers is independent of the transaction process.

NOTE: We consider only customers who made repeat purchases with the business, i.e., frequency > 0. A frequency of 0 means the customer bought only once, and such one-time customers are treated as already dead.
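
To make the Gamma-Gamma idea concrete, the sketch below evaluates the model's conditional expected average transaction value for one customer. It assumes the standard Gamma-Gamma parameterisation with parameters p, q and v (with q > 1). The function name and the parameter values are hypothetical; in practice the parameters come from fitting the model to the repeat customers.

```python
def expected_average_value(frequency, monetary_value, p, q, v):
    """Conditional expected average transaction value under Gamma-Gamma.

    The result is a weighted average of the population mean and the
    customer's own observed mean; the weight on the customer's own mean
    grows with the number of repeat purchases.
    """
    if q <= 1:
        raise ValueError("q must be greater than 1 for the mean to exist")
    population_mean = p * v / (q - 1)
    individual_weight = p * frequency / (p * frequency + q - 1)
    return (1 - individual_weight) * population_mean + individual_weight * monetary_value


# Hypothetical fitted parameters, for illustration only.
P, Q, V = 6.0, 4.0, 15.0   # population mean = 6 * 15 / (4 - 1) = 30

few = expected_average_value(frequency=2, monetary_value=100.0, p=P, q=Q, v=V)
many = expected_average_value(frequency=20, monetary_value=100.0, p=P, q=Q, v=V)
print(f"2 repeat purchases:  {few:.2f}")
print(f"20 repeat purchases: {many:.2f}")
```

With only a few repeat purchases the estimate is pulled toward the population mean; with many it stays close to the customer's own observed average.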

Fitting GammaGamma Model

Predict customer lifetime value using ggf

Time Taken