Anomaly Detection with Twitter in R

April 21, 2015

anomaly detection at Twitter

Twitter open-sourced their anomaly detection package for R.
It aims to detect anomalies in seasonal time series, even when the data has an underlying trend.
Find the anomaly detection source code on GitHub.

Does it really detect anomalies?

YES! It actually works very well. At least when you use it for what it was created for…
It was designed to detect global and local anomalies.

  • Global anomaly:
    This is the kind of anomaly we are most familiar with: a value that falls outside the usual range of the whole series. It isn’t always the best approach, but a 95th percentile threshold can detect this kind of anomaly.

local anomaly detected

  • Local anomaly:
    Very often we can see an underlying pattern in our data. It usually looks like a “wave”: low activity in the morning, high during the day, low at night. Local anomalies occur within this context. For example, high activity at night is an anomaly, even if the same value would be normal during the day.

global anomaly detected
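
To make the 95th percentile technique concrete, here is a minimal sketch. It is written in Python with only the standard library, and it is not part of Twitter's package. The series, the injected anomalies and the function names (`make_series`, `percentile`, `global_anomalies`) are all made up for illustration.

```python
import math


def make_series(days=7, period=24):
    """Hypothetical hourly metric: a daily wave plus small repeatable noise."""
    series = []
    for i in range(days * period):
        hour = i % period
        wave = 50 + 40 * math.sin(2 * math.pi * (hour - 6) / period)
        noise = (i * 7) % 5 - 2  # deterministic pseudo-noise in [-2, 2]
        series.append(wave + noise)
    series[84] = 200.0   # injected global anomaly: spike at noon
    series[120] = 60.0   # injected local anomaly: daytime-level value at midnight
    return series


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def global_anomalies(values, q=95):
    """Return the indices of values strictly above the q-th percentile."""
    threshold = percentile(values, q)
    return [i for i, v in enumerate(values) if v > threshold]


series = make_series()
flagged = global_anomalies(series)
print(flagged)
```

On this made-up series the spike at index 84 is flagged, but the night-time value at index 120 is not, because 60 is an ordinary value during the day. A percentile threshold also always flags roughly the top 5% of points, so some normal midday peaks end up in the result. That is why it isn’t always the best approach.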

What anomalies can be detected?

First, it aims to detect global and local anomalies (see above).
It is also supposed to account for “underlying trends” such as organic growth in the metrics.
Twitter calls this algorithm Seasonal Hybrid ESD (S-H-ESD).
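
The sketch below is not Twitter's implementation of S-H-ESD. It only illustrates the general idea of removing a seasonal pattern first and then scoring what is left with robust statistics (median and MAD). As a simplification of mine, it uses a per-phase median as the seasonal component and a fixed cutoff instead of the ESD significance test. It is written in Python with only the standard library, and the series and function names are hypothetical.

```python
import math
import statistics


def make_series(days=7, period=24):
    """Hypothetical hourly metric: a daily wave plus small repeatable noise."""
    series = []
    for i in range(days * period):
        hour = i % period
        wave = 50 + 40 * math.sin(2 * math.pi * (hour - 6) / period)
        noise = (i * 7) % 5 - 2  # deterministic pseudo-noise in [-2, 2]
        series.append(wave + noise)
    series[84] = 200.0   # injected global anomaly: spike at noon
    series[120] = 60.0   # injected local anomaly: daytime-level value at midnight
    return series


def seasonal_residuals(values, period):
    """Subtract the median of each seasonal phase (here: each hour of the day)."""
    phase_medians = [statistics.median(values[p::period]) for p in range(period)]
    return [v - phase_medians[i % period] for i, v in enumerate(values)]


def robust_anomalies(values, period, cutoff=3.5):
    """Flag points whose seasonal residual is far from the median, in MAD units."""
    residuals = seasonal_residuals(values, period)
    center = statistics.median(residuals)
    mad = statistics.median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        # Degenerate case: no spread at all, so any deviation stands out.
        return [i for i, r in enumerate(residuals) if r != center]
    return [i for i, r in enumerate(residuals) if abs(r - center) / scale > cutoff]


series = make_series()
flagged = robust_anomalies(series, period=24)
print(flagged)
```

On this made-up series both injected points are flagged: the spike at index 84 and the night-time value at index 120. The second one is the local anomaly that a plain threshold on the raw values would miss.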

I was very impressed by Twitter’s anomaly detection. It spots many different kinds of anomalies.
Of course it doesn’t detect everything, only what it was built for.

[Anomaly detected] Growth too early in a seasonal metric

1-bumpToEarly-anomaly

[Anomaly detected] Some unusual noise

2-moreNoise-anomaly

[Anomaly detected] More noise than usual

3-moreNoise-anomaly

[Anomaly detected] Breakdown

4-plateau-anomaly

[Anomaly detected] Sudden growth

5-growSuddenly-anomaly

[Anomaly detected] Sudden growth

6-floor-anomaly

[Anomaly detected] Peak

7-speark-anomaly

[Anomaly detected] Activity when usually none

8-bumpInDoublePick-anomaly

[Anomaly not detected] Linear growth

9-justGrow-no-anomaly

[Anomaly not detected] Linear seasonal growth

10-linearGrow-no-anomaly

What can’t be detected?

Twitter’s anomaly detection is impressive, but it isn’t the only way to detect anomalies.
It is built to detect certain kinds of anomalies, not all of them!

[Anomaly not detected] Flat signal

2-flat-not-detected

[Anomaly not detected] No noise

1-removeNoise-not-detected

[Anomaly not detected] Exponential growth

3-exponentialGrow-not-detected

[Anomaly not detected] Negative seasonal anomaly

4-linearGrowWithError-not-detected

[Anomaly not detected] Negative seasonal anomaly

5-justGrowWithError-not-detected

Conclusion

Twitter made a big breakthrough in anomaly detection.
It detects a wide variety of anomalies.

Only two negative points:
  • To my eyes, it only failed to detect one kind of anomaly: the “Negative seasonal anomaly” (last graph above)
  • R is awesome, but it is not suitable for real-time anomaly detection

Overall it is an incredible piece of software… Congrats Twitter, outstanding job!
