Our brains are very fast at recognizing patterns and shapes. A few milliseconds are often enough to spot the seasonality of a signal. The same can be done with mathematics, using the Fourier transform.
First, I will explain in simple terms what a Fourier transform is. Next, we will find the seasonality of a website from its Google Analytics pageview report using the R language.
Fourier Transform explained
The Fourier transform decomposes a signal into the frequencies that make it up. Fine, but how does it really work? (Please keep it simple.)
1 – Pick a frequency
First, the Fourier transform starts with the smallest possible frequency. For a signal made of 100 points, the smallest possible frequency is 1/100 = 0.01 cycles per point, which is 0.01 Hz if the points are recorded every second. To picture it, think of a circle that turns at 0.01 Hz, that is, one full turn every 100 seconds. Just like a clock.
1 circle turn = 100 s
2 – “Draw” the signal value on the circle
Using the previously selected frequency, we “draw” (decompose) the entire signal on the circle.
1 circle turn = 100 s
Note: the higher the signal measurement, the longer the clock arm of the circle.
The signal above draws something like this:
Signal decomposed at 0.01 Hz
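To make the “drawing” step concrete, here is a minimal sketch in R. The test signal and the variable names (`signal`, `points_on_circle`) are assumptions made up for illustration; the idea is only that time sets the angle of the clock arm and the signal value sets its length.

```r
# Hypothetical example signal: 100 points sampled once per second,
# containing exactly one slow cycle over the whole record.
n <- 100
t <- 0:(n - 1)                     # sample times in seconds
signal <- 1 + cos(2 * pi * t / n)  # always >= 0, so it can act as an arm length

f <- 1 / n                         # lowest frequency: 0.01 Hz, one turn per 100 s

# Each sample becomes a point in the complex plane: the angle is set by
# the time (turning clockwise), the distance from the centre by the signal.
points_on_circle <- signal * exp(-2i * pi * f * t)

# x and y coordinates of the first few points
head(data.frame(t = t, x = Re(points_on_circle), y = Im(points_on_circle)))
```

Calling `plot(points_on_circle, type = "l")` should trace the resulting shape.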
3 – Compute the Periodogram
All the measurements are now placed on the circle, in two dimensions. Each one is a “vector”. Summing the vectors together gives the final “power” of the frequency (more precisely, the power grows with the length of the summed vector). The “power” gets high only when the vectors line up and point in the same direction; otherwise they cancel each other out.
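The summing step can be sketched in a few lines of R. The helper `power_at` is a hypothetical name, and dividing the squared length by the number of points is just one common scaling convention, not necessarily the one a given package uses.

```r
n <- 100
t <- 0:(n - 1)
signal <- cos(2 * pi * t / n)   # assumed test signal: one cycle per 100 s

# "Power" at frequency f: wrap the signal around the circle, sum the
# vectors, and take the squared length of the sum (scaled by n).
power_at <- function(f, x, t) {
  total <- sum(x * exp(-2i * pi * f * t))
  Mod(total)^2 / length(x)
}

power_at(0.01, signal, t)  # vectors line up: large value
power_at(0.03, signal, t)  # vectors cancel out: close to zero
```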
4 – Repeat with different Frequencies
To complete the Fourier transform, we repeat the process with different frequencies: 0.02 Hz, 0.03 Hz, and so on, until the “power” of every possible frequency has been computed. The higher the power, the stronger the periodicity detected at that frequency!
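As a rough sketch of the full loop, the R code below repeats the sum for every frequency from 0.01 Hz to 0.50 Hz and checks the result against base R's `fft()`, which computes the same sums much faster. The test signal and its two components are assumptions chosen for illustration.

```r
n <- 100
t <- 0:(n - 1)
# Assumed test signal: a strong 0.05 Hz component plus a weaker 0.20 Hz one.
signal <- 3 * sin(2 * pi * 0.05 * t) + sin(2 * pi * 0.20 * t)

# Step through every frequency: 0.01, 0.02, ..., 0.50 Hz
freqs <- (1:(n / 2)) / n
power <- sapply(freqs, function(f) {
  Mod(sum(signal * exp(-2i * pi * f * t)))^2 / n
})

# fft() returns the same sums; element j + 1 corresponds to frequency j / n.
power_fft <- Mod(fft(signal))[2:(n / 2 + 1)]^2 / n

all.equal(power, power_fft)   # the two computations should agree
freqs[which.max(power)]       # frequency with the highest power
```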
Detect Seasonality (example)
As we saw, the frequency with the highest “power” is the primary seasonality of the underlying metric. In this example, William Cox draws the signal values on the circle at high speed for many different frequencies.
Here is what the original signal looks like:
To go further, I recommend William Cox's talk about the Fourier transform:
In this example, the frequency 2.1 Hz gets a very high “power” value. This happens because all the data points line up and create a big vector.
From the frequency formula:
T = 1 / f
T = 1 / 2.1 Hz ≈ 0.476 s = 476 ms
The signal roughly repeats every 476 ms. This is its seasonality!
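The same conversion, written as a small R snippet (the variable names are only illustrative):

```r
f <- 2.1                       # frequency in Hz, from the example above
period_s <- 1 / f              # period in seconds
period_ms <- period_s * 1000   # period in milliseconds
round(period_ms)               # about 476 ms
```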
Detect seasonality using R
My personal tech blog clearly shows a weekly pattern: it receives much less traffic during the weekend. As a result, the Google Analytics report shows some sort of weekly trend. Let’s try to find its seasonality using the R language.
# Install (once) and import the TSA package
install.packages("TSA")
library(TSA)
# read the Google Analytics pageview report
raw = read.csv("20131120-20151110-google-analytics.csv")
# compute the periodogram (based on the Fourier transform)
p = periodogram(raw$Visite)
The periodogram shows the “power” of each possible frequency. There is a clear peak around a frequency of 0.15 (in cycles per day rather than Hz, since the report has one point per day).
# sort the frequencies by decreasing "power"
dd = data.frame(freq=p$freq, spec=p$spec)
sorted = dd[order(-dd$spec),]
top2 = head(sorted, 2)
# display the 2 highest "power" frequencies
top2
The frequencies 0.142661180 and 0.002743484 (in cycles per day) show seasonality!
# convert frequency to time periods (in days)
time = 1/top2$freq
time
# [1]   7.009615 364.500000
The main seasonality detected is 7 days. A second seasonality of 364.5 days is also found. Exactly what we expected! My blog follows a weekly seasonality as well as an annual one.
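To experiment without a Google Analytics export, the same idea can be tried on made-up data. The sketch below is an assumption-laden stand-in: the `visits` series is synthetic, and the periodogram is computed with base R's `fft()` instead of the TSA package, so the scaling of the “power” values may differ from what `periodogram()` reports.

```r
# Hypothetical daily pageview series: 728 days (104 full weeks) with
# lower traffic on two days out of every seven, plus some random noise.
set.seed(42)
n_days <- 728
day_of_week <- (0:(n_days - 1)) %% 7
visits <- ifelse(day_of_week >= 5, 40, 100) + rnorm(n_days, sd = 5)

# Raw periodogram from base R's fft(); frequencies are in cycles per day.
centered <- visits - mean(visits)
half <- 1:(n_days / 2)
spec <- Mod(fft(centered))[half + 1]^2 / n_days
freq <- half / n_days

# Keep the frequency with the highest "power" and convert it to a period.
dd <- data.frame(freq = freq, spec = spec)
top <- dd[which.max(dd$spec), ]
1 / top$freq   # dominant period in days, about 7 for this series
```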