Table of Contents

Data Description

Reference: https://www.kaggle.com/c/web-traffic-time-series-forecasting/data

I cleaned the Kaggle Wikipedia web traffic data and kept only the data for 2016, sampled at a fraction of 0.1.

The data was melted and additional columns were created.
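The preparation described above can be sketched roughly as follows. This is a minimal illustration on a made-up frame, not the actual cleaning code: the column names `date`, `visits`, `weekday` and `month`, the toy page names, and the choice to sample pages (rather than rows) with `frac=0.1` are all assumptions.

```python
import pandas as pd

# Hypothetical wide-format frame: one row per page, one column per date.
dates = ["2015-12-31", "2016-01-01", "2016-01-02"]
wide = pd.DataFrame({"Page": [f"page_{i}" for i in range(20)]})
for offset, day in enumerate(dates):
    wide[day] = [i + offset for i in range(20)]

# Keep a 10% random sample of the pages (assumed meaning of "fraction of 0.1").
sampled = wide.sample(frac=0.1, random_state=0)

# Melt: every (page, date) pair becomes one row.
long = sampled.melt(id_vars="Page", var_name="date", value_name="visits")
long["date"] = pd.to_datetime(long["date"])

# Keep only 2016 and add a few example derived columns.
long_2016 = long[long["date"].dt.year == 2016].copy()
long_2016["weekday"] = long_2016["date"].dt.day_name()
long_2016["month"] = long_2016["date"].dt.month
```

The long format makes later group-bys (per language, per weekday, per day of month) straightforward, since each observation sits on its own row.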

Load the data

Data Processing

Category distribution

Exploratory Data Analysis (EDA)

Most visited page

Top 5 pages per language

Data Visualizations

Single Timeseries Visualization

Language monthly mean

Timeseries per language

Page Visits per Week Day

Page Visits per Month Day

Fast Fourier Transform (FFT)

Resources:

Here we can see that the plots appear periodic in the time domain. We can therefore work in the frequency domain by applying the FFT to the time series. Peaks in the FFT show us the strongest frequencies in the periodic signal.
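As a rough illustration of reading a period off an FFT peak, the sketch below uses a synthetic daily series with a weekly cycle. The series, its length, and the noise level are assumptions made for the example; it does not use the Kaggle data.

```python
import numpy as np

# Synthetic daily series (an assumption, not the Kaggle data):
# a weekly cycle plus noise over 52 weeks.
rng = np.random.default_rng(0)
n_days = 364
t = np.arange(n_days)
visits = 100 + 20 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 5, n_days)

# Remove the mean so the zero-frequency term does not dominate.
centered = visits - visits.mean()

# Real-input FFT magnitudes and the matching frequencies in cycles per day.
spectrum = np.abs(np.fft.rfft(centered))
freqs = np.fft.rfftfreq(n_days, d=1.0)

# The largest peak marks the dominant frequency; its inverse is the period.
peak = np.argmax(spectrum)
dominant_period = 1.0 / freqs[peak]
print(f"dominant period: {dominant_period:.2f} days")
```

With this synthetic input the peak falls at a period of 7 days. On real traffic data the peaks are usually less clean, so it can help to inspect the few largest peaks rather than only the maximum.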

The Fourier transform is an alternative representation of a signal as a superposition of periodic components. It is an important mathematical result that any well-behaved function can be represented in this form. Whereas a time-varying signal is most naturally considered as a function of time, the Fourier transform represents it as a function of frequency. A magnitude and a phase, both encoded in a single complex number, are associated with each frequency.
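To make the magnitude and phase idea concrete, here is a small sketch that recovers both from one complex FFT coefficient. The test signal (a cosine with amplitude 2 and phase $\pi/4$ that completes exactly 4 cycles in the window) is an assumption chosen so the result is exact.

```python
import numpy as np

N = 64                 # number of samples
n = np.arange(N)
k0 = 4                 # the cosine completes exactly 4 cycles over the window
amplitude = 2.0
phase = np.pi / 4

x = amplitude * np.cos(2 * np.pi * k0 * n / N + phase)
X = np.fft.fft(x)

# For a real sinusoid sitting exactly on bin k0, |X[k0]| = amplitude * N / 2,
# and the angle of X[k0] is the phase of the cosine.
recovered_amplitude = 2 * np.abs(X[k0]) / N
recovered_phase = np.angle(X[k0])
```

When the frequency does not fall exactly on a bin, the energy spreads over neighbouring bins and these simple formulas only give approximations.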

The Discrete Fourier Transform
Let's consider a digital signal $x$ represented by a vector $(x_0, \ldots, x_{N-1})$. We assume that this signal is regularly sampled. The Discrete Fourier Transform (DFT) of $x$ is $X = (X_0, \ldots, X_{N-1})$, defined as:

$$ \forall k \in\{0, \ldots, N-1\}, \quad X_{k}=\sum_{n=0}^{N-1} x_{n} e^{-2 i \pi k n / N} $$

The DFT can be computed efficiently with the Fast Fourier Transform (FFT), an algorithm that exploits symmetries and redundancies in this definition to considerably speed up the computation. The complexity of the FFT is $O(N \log N)$ instead of $O(N^2)$ for the naive DFT. The FFT is one of the most important algorithms of the digital universe.
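The definition above can be checked directly against NumPy's FFT. The sketch below writes the DFT as a matrix-vector product, which costs $O(N^2)$; the helper name `naive_dft` and the random test signal are illustrative assumptions.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) evaluation of X_k = sum_n x_n * exp(-2j*pi*k*n/N)."""
    x = np.asarray(x, dtype=complex)
    N = x.shape[0]
    n = np.arange(N)
    k = n.reshape(-1, 1)                      # column vector of output indices
    W = np.exp(-2j * np.pi * k * n / N)       # N x N matrix of complex exponentials
    return W @ x

rng = np.random.default_rng(0)
signal = rng.normal(size=128)

# Both routes compute the same transform; the FFT just gets there faster.
matches = np.allclose(naive_dft(signal), np.fft.fft(signal))
print(matches)
```

For short signals the difference in speed is negligible, but the gap grows quickly with $N$, which is why `np.fft.fft` is the practical choice for full-length time series.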