Detecting Seasonality Using Fourier Transforms in R

August 6, 2015 | detection, math


Our brains are really fast at recognizing patterns and forms: we can often find the seasonality of a signal in under a second. It is also possible to do this with mathematics using the Fourier transform.

First, we will explain what a Fourier transform is. Next, we will find the seasonality of a website from its Google Analytics pageview report using the R language.

Fourier Transforms Explained

The Fourier transform decomposes a signal into all the possible frequencies that comprise it. Fine, but how does it really work? (And keep it simple, please?)

 

1 – Pick a Frequency

First, the Fourier transform starts with the smallest possible frequency. For a signal made of 100 points recorded once per second, the smallest possible frequency is 1/100 = 0.01 Hz. Think of a circle whose arm rotates at 0.01 Hz, that is, one full turn every 100 seconds. Just like a clock.
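To make this step concrete, here is a minimal R sketch. It assumes one point per second; the variable names (`n`, `dt`, `f_min`, `angle`) are made up for this illustration.

```r
n  <- 100                 # number of points in the signal
dt <- 1                   # assumed sampling interval: one point per second

f_min  <- 1 / (n * dt)    # smallest non-zero frequency, in Hz
period <- 1 / f_min       # seconds needed for one full turn of the circle

# Angle of the "clock arm" (in radians) at each sample time
t     <- (0:(n - 1)) * dt
angle <- 2 * pi * f_min * t

f_min    # 0.01
period   # 100
```

With these assumptions the arm sweeps the circle exactly once over the whole signal.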

[Figure: a circle turning like a clock. 1 circle turn = 100 s]

 

2 – “Draw” the Signal Value on the Circle

Using the previously selected frequency we “draw” (decompose) the entire signal on the circle.

[Figure: the signal drawn around the circle. 1 circle turn = 100 s]

Notice: when the signal measurement is high, the clock arm is long, so the point is drawn far from the center of the circle.

The above signal will draw something like this:

[Figure: signal decomposition at 0.01 Hz]
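The "drawing" can be sketched in a few lines of R using complex numbers. This is only an illustration: the signal below is invented, and the names `signal` and `arm` are hypothetical. Each measurement becomes a point whose distance from the center is the signal value and whose direction is the clock angle at that moment.

```r
n <- 100
t <- 0:(n - 1)

# Hypothetical signal: positive values that repeat every 20 samples
signal <- 1 + 0.5 * cos(2 * pi * t / 20)

f <- 1 / n   # frequency picked in step 1, in cycles per sample

# One complex number per measurement:
#   length (modulus) = signal value, direction = clock angle at time t
arm <- signal * exp(-2i * pi * f * t)

# Plot only when run interactively; real part on x, imaginary part on y
if (interactive()) {
  plot(Re(arm), Im(arm), type = "l", asp = 1,
       xlab = "real part", ylab = "imaginary part")
}
```

The minus sign in the exponent follows the usual Fourier transform convention; it only changes the direction of rotation.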

 

3 – Compute the Periodogram

All the measurements are now on the circle in two dimensions. We call these “vectors”. Summing the vectors gives one resultant vector, and its squared length is the “power” of the frequency. The power is high when the vectors line up and point in the same direction. When they point in all directions, they cancel each other out and the power stays close to zero.
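Here is a small R sketch of that sum. The signal is invented, `power_at` is a hypothetical helper, and dividing by the number of points is just one common normalization (others exist).

```r
n <- 100
t <- 0:(n - 1)

# Hypothetical signal that repeats every 20 samples (frequency 0.05)
signal <- cos(2 * pi * t / 20)

# "Power" of one frequency: draw the signal on the circle,
# sum the vectors, and take the squared length of the result
power_at <- function(x, f) {
  k <- seq_along(x) - 1
  resultant <- sum(x * exp(-2i * pi * f * k))
  Mod(resultant)^2 / length(x)
}

p_match <- power_at(signal, 0.05)  # vectors line up: high power
p_other <- power_at(signal, 0.01)  # vectors cancel out: power near zero
```

Here `p_match` comes out at 25 while `p_other` is zero up to rounding error.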

4 – Repeat with Different Frequencies

To complete the Fourier transform, we repeat the process with different frequencies: 0.02 Hz, 0.03 Hz, and so on, up to 0.5 Hz (half the sampling rate). We continue until the “power” of each possible frequency has been computed. The frequencies with the highest power correspond to the strongest periodicities!
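The four steps can be put together in a short R sketch. The signal is made up for the example, and the loop is written for readability, not speed. R's built-in `fft()` computes the same sums much faster, so the sketch checks the hand-made version against it.

```r
n <- 100
t <- 0:(n - 1)

# Hypothetical signal: a strong 20-sample cycle plus a weaker 5-sample cycle
signal <- cos(2 * pi * t / 20) + 0.3 * cos(2 * pi * t / 5)

# Every frequency from 1/n up to 0.5 cycles per sample
freqs <- (1:(n / 2)) / n

# Steps 1 to 3, repeated for each frequency
power <- sapply(freqs, function(f) {
  Mod(sum(signal * exp(-2i * pi * f * t)))^2 / n
})

# Same result with the built-in FFT: element k + 1 holds frequency k / n
power_fft <- Mod(fft(signal)[2:(n / 2 + 1)])^2 / n

best_freq   <- freqs[which.max(power)]
best_period <- 1 / best_freq   # in samples
```

The strongest frequency found is 0.05 cycles per sample, which is the 20-sample cycle built into the signal.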

Detecting Seasonality (Example)

We just saw that the frequency with the highest “power” represents the primary seasonality of our underlying metric. In this example, William Cox draws the signal value on the circle at high speed for many different frequencies.

Here is how the original signal value appears:

[Figure: the original signal]

To learn more, I recommend listening to William Cox's talk about Fourier transforms.

In this example, a frequency of 2.1 Hz gets a very high “power” value. This is where all the data points line up and add up to one big vector.

From the frequency formula:
T = 1 / f
T = 1 / 2.1 Hz ≈ 0.476 s = 476 ms

The signal repeats every 476 ms. This is its seasonality!

Detecting Seasonality using R

My personal tech blog clearly shows some weekly trends: It receives much less traffic during the weekend. As a result, my Google Analytics report shows some sort of weekly periodicity. Let’s try to find the seasonality using the R language.

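[Figure: periodogram]

The periodogram shows the “power” of each possible frequency, and we can clearly see a spike at a frequency of around 0.14. The report has one data point per day, so frequencies here are in cycles per day rather than Hz.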

       freq     spec
0.142661180 167143.1
0.002743484 109146.8

Frequencies of 0.142661180 and 0.002743484 cycles per day show seasonality! Converting them to periods with T = 1 / f gives the result in days:

> time
[1] 7.009615 364.500000

The main seasonality detected is about 7 days. A secondary seasonality of 364.5 days was also found. Exactly what we expected! My blog has weekly seasonality as well as annual seasonality.
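If you want to try this without a Google Analytics export, here is a minimal, self-contained sketch in base R. Everything in it is an assumption made for illustration: the pageviews are synthetic (a weekly cycle, a yearly cycle and some noise), and it uses `fft()` directly instead of a periodogram function from a package, so the numbers will not match my report.

```r
set.seed(42)

# Hypothetical daily pageviews over two years (728 days)
n_days    <- 728
day       <- 0:(n_days - 1)
pageviews <- 1000 +
  200 * cos(2 * pi * day / 7) +     # weekly cycle
  300 * cos(2 * pi * day / 364) +   # yearly cycle
  rnorm(n_days, sd = 50)            # random noise

# Remove the mean so the constant level does not dominate
x <- pageviews - mean(pageviews)
n <- length(x)

# Raw periodogram: power of each frequency from 1/n to 0.5 cycles per day
dd <- data.frame(
  freq = (1:(n / 2)) / n,
  spec = Mod(fft(x))[2:(n / 2 + 1)]^2 / n
)

# The two frequencies with the highest "power"
top2 <- head(dd[order(-dd$spec), ], 2)
top2

# Convert frequencies (cycles per day) to periods (days)
period_days <- 1 / top2$freq
period_days
```

On this synthetic data the two strongest periods are 364 days and 7 days, the two cycles that were put in.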


  • Naman Doshi

    I tried the TSA package, and according to the frequency my supposed seasonal period exceeds the size of my dataset provided, what does this imply?

    • naman bhalla

      Coincidence. I am Naman, and I had the same problem.

      • Abhishek joshi

        Sir,
        I was trying to study accelerograph data. I used the periodogram function, but I have a doubt: does the periodogram function represent all the frequencies associated with the signal? My plot shows a frequency range of 0 to 0.5 Hz, but when I plot the evolutivefft, it shows that frequencies are present in the range of 0 to 50 Hz.

  • Banti Laure Mathilde Yameogo

    Hi everyone,

    My data have a 24-hour periodicity, but my periodogram gives me the result below. How should I interpret it? My data are sampled every 15 min. Sample size: 36411

    The result:

              freq      spec
    2 5.486968e-05 134.42945
    1 2.743484e-05  69.86439

    time = 1/top2$f
    time
    [1] 18225 36450