Anomaly Detection with Twitter in R

April 21, 2015

anomaly detection at Twitter

Twitter has released an open-source anomaly detection package for R. Its goal is to detect anomalies in seasonal time series, even when the data also follows an underlying trend.
Find the Anomaly Source Code on GitHub

Does it Really Detect Anomalies?

Yes! It actually works very well, as long as you use it for what it was created for. It was designed to detect global and local anomalies.

  • Global anomalies: These are the kind we are most familiar with: values that fall outside the usual range. While not always the best approach, a 95th-percentile threshold can detect this kind of anomaly.
  • Local anomalies: Very often we can see a recurring pattern in our data. It usually looks like a “wave”: low activity in the morning, high during the day, low again at night. Local anomalies occur within this context. For example, high activity at night indicates an anomaly, even if the same level would be normal during the day.
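
To make the difference concrete, here is a minimal Python sketch. It is not the Twitter package (which is written in R); the synthetic hourly series, the function names, and the thresholds are all assumptions chosen for illustration. A single 95th-percentile cutoff catches the midday spike but misses the night-time burst, which only stands out once each point is compared with other values from the same hour of the day.

```python
import math
import statistics

HOURS_PER_DAY = 24
DAYS = 14
SPIKE_IDX = 5 * HOURS_PER_DAY + 12  # midday on day 5
NIGHT_IDX = 9 * HOURS_PER_DAY + 2   # 2 a.m. on day 9


def make_series():
    """Synthetic hourly 'wave' with two injected anomalies (illustrative data)."""
    series = []
    for day in range(DAYS):
        for hour in range(HOURS_PER_DAY):
            base = 100 + 80 * math.sin(math.pi * hour / HOURS_PER_DAY)
            jitter = (7 * day + 3 * hour) % 5 - 2  # small deterministic noise
            series.append(base + jitter)
    series[SPIKE_IDX] = 400.0  # global anomaly: far outside the usual range
    series[NIGHT_IDX] = 170.0  # local anomaly: fine at midday, odd at 2 a.m.
    return series


def global_anomalies(series, percentile=95):
    """Flag every point above one percentile cutoff for the whole series."""
    cutoff = statistics.quantiles(series, n=100)[percentile - 1]
    return [i for i, v in enumerate(series) if v > cutoff]


def local_anomalies(series, period=HOURS_PER_DAY, k=5.0):
    """Flag points that are far from the median of their own hour of day."""
    flagged = []
    for phase in range(period):
        values = series[phase::period]
        center = statistics.median(values)
        mad = statistics.median([abs(v - center) for v in values])
        if mad == 0:
            continue  # no spread to compare against for this hour
        limit = k * 1.4826 * mad  # 1.4826 scales the MAD to a std. deviation
        for day, v in enumerate(values):
            if abs(v - center) > limit:
                flagged.append(day * period + phase)
    return sorted(flagged)


if __name__ == "__main__":
    data = make_series()
    print("global:", global_anomalies(data))
    print("local: ", local_anomalies(data))
```

Note that a percentile cutoff always flags roughly the top 5% of points, so ordinary midday peaks are reported along with the real spike. That is one reason it is not always the best approach.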

global anomaly detected

What anomalies can be detected?

First, the software aims to detect global and local anomalies (see above). It is also designed to account for “underlying trends” such as organic growth in the metrics, so that this growth is not reported as an anomaly. Twitter calls this algorithm Seasonal Hybrid ESD (S-H-ESD).
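
To give a feel for the idea behind S-H-ESD, here is a rough Python sketch. It is not Twitter's R code. As I understand it, the algorithm removes the seasonal component and the median from the series, then runs a generalized ESD test on what is left, using the median and MAD instead of the mean and standard deviation. The sketch takes two shortcuts that are my own assumptions: the seasonal component is estimated with a per-phase median rather than an STL decomposition, and the Student t quantile in the ESD critical value is approximated with a normal quantile (reasonable only for series with a few hundred points or more). The function and parameter names are hypothetical.

```python
import math
import statistics


def s_h_esd_sketch(series, period, max_anoms=0.02, alpha=0.05):
    """Simplified seasonal ESD: returns the indices flagged as anomalies."""
    n = len(series)
    overall = statistics.median(series)
    # Stand-in for STL: per-phase median, expressed relative to the overall median.
    seasonal = [statistics.median(series[p::period]) - overall
                for p in range(period)]
    # Residual = value - seasonal component - median of the series.
    resid = {i: v - seasonal[i % period] - overall
             for i, v in enumerate(series)}

    max_outliers = max(1, int(max_anoms * n))
    normal = statistics.NormalDist()
    candidates = []
    n_anoms = 0
    for i in range(1, max_outliers + 1):
        values = list(resid.values())
        center = statistics.median(values)
        mad = 1.4826 * statistics.median([abs(v - center) for v in values])
        if mad == 0:
            break  # residuals have no spread; nothing more to test
        # Most extreme remaining point and its robust test statistic.
        idx = max(resid, key=lambda j: abs(resid[j] - center))
        stat = abs(resid[idx] - center) / mad
        del resid[idx]
        candidates.append(idx)
        # Generalized ESD critical value (normal approximation to the t quantile).
        p = 1 - alpha / (2 * (n - i + 1))
        t = normal.inv_cdf(p)
        crit = (n - i) * t / math.sqrt((n - i - 1 + t * t) * (n - i + 1))
        if stat > crit:
            n_anoms = i  # keep the largest i whose statistic exceeds its limit
    return sorted(candidates[:n_anoms])


def demo_series(days=14, period=24):
    """Illustrative hourly wave with a midday spike and a night-time burst."""
    series = []
    for day in range(days):
        for hour in range(period):
            base = 100 + 80 * math.sin(math.pi * hour / period)
            series.append(base + (7 * day + 3 * hour) % 5 - 2)
    series[5 * period + 12] = 400.0  # global anomaly
    series[9 * period + 2] = 170.0   # local anomaly
    return series


if __name__ == "__main__":
    print(s_h_esd_sketch(demo_series(), period=24))
```

Because the seasonal pattern is subtracted first, the night-time burst becomes a large residual and is tested on the same footing as the obvious spike.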

I was very impressed by the Twitter anomaly detection system. It handled many different anomaly cases. Of course it didn’t detect everything: only what it was built for.

[Anomaly detected] Growth too early in seasonal metrics

1-bumpToEarly-anomaly

[Anomaly detected] Some unusual noise

[Anomaly detected] More noise than usual

3-moreNoise-anomaly

[Anomaly detected] Breakdown

4-plateau-anomaly

[Anomaly detected] Sudden growth

5-growSuddenly-anomaly

[Anomaly detected] Sudden growth

6-floor-anomaly

[Anomaly detected] Peak

7-speark-anomaly

[Anomaly detected] Unusually high activity

8-bumpInDoublePick-anomaly

[Anomaly not detected] Linear growth

9-justGrow-no-anomaly

[Anomaly not detected] Linear seasonal growth

10-linearGrow-no-anomaly

What Can’t be Detected?

Twitter’s anomaly detection is impressive, but it is built to detect certain kinds of anomalies, not all of them!

[Anomaly not detected] Flat signal

2-flat-not-detected

[Anomaly not detected] No noise

1-removeNoise-not-detected

[Anomaly not detected] Exponential growth

3-exponentialGrow-not-detected

[Anomaly not detected] Negative seasonal anomaly

4-linearGrowWithError-not-detected

[Anomaly not detected] Negative seasonal anomaly

5-justGrowWithError-not-detected

Conclusion

Twitter made a big breakthrough in anomaly detection. Its model can detect a wide variety of anomalies.

There are only two drawbacks:
  • To my eyes, it only failed to detect one kind of anomaly: “negative seasonal anomalies” (last graph above)
  • R is awesome, but not suitable for anomaly detection in real time

Overall, however, it is incredible software. Congratulations Twitter, outstanding job!
