Salesforce Anomaly Detection Using Anomaly.io
https://anomaly.io/salesforce-anomaly-detection/ (Mon, 30 Jan 2017)


detect anomalies in salesforce

Monitoring Key Performance Indicators (KPIs) is essential to running a successful business. As one example, you should examine your lead generation KPI several times a day, to allow you to detect and correct problems as quickly as possible.

But you’re busy; you don’t have time to watch KPI indicators all day long. That’s where Anomaly.io comes in. By combining our detection algorithms with your Salesforce data, you can automatically detect problems and notify the appropriate personnel to ensure that speedy corrective action is taken.

From Salesforce to InfluxDB (Anomaly.io Protocol)

Let’s see how we can forward Salesforce data to Anomaly.io for automatic anomaly detection.

First, install the StreamSets Data Collector, which can easily subscribe to and stream Salesforce events. Next, log into your fresh StreamSets install with the default login/password combination “admin/admin”.
Now install the packages “Salesforce Library” and “InfluxDB 0.9+” from the Package Manager:

StreamSets package manager

Create a new Pipeline. Click the “Stage Library” (top right icon), select the origin “Salesforce”, then the processor “Groovy Evaluator”, and finally, the destination “InfluxDB”.

salesforce origin

Now link the nodes together:

Anomaly Flow Salesforce InfluxDB

Salesforce Node

This node collects historical “Opportunities” and subscribes to upcoming ones through the Salesforce Streaming API.

Enter this data on the Salesforce tab:

  • Username: salesforceUserName
  • Password: MyPassword+securityToken
    e.g: MyPassword1FPeZD9z[…]
    Reset your salesforce security token
  • API version: 36.0
  • Query Existing Data: Yes
  • Subscribe for Notifications: Yes

IMPORTANT: Be sure to enter “36.0” for the API version, as this is currently the only compatible version.

On the Query tab:

  • Use Bulk API: Yes
  • SOQL Query:
    SELECT Id, AccountId, CreatedDate, CloseDate, Amount
    FROM Opportunity WHERE Id > '${OFFSET}' ORDER BY Id
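The ${OFFSET} placeholder is how StreamSets pages through existing records: each batch is ordered by Id, and the last Id read becomes the next query's offset. A sketch of the idea in Python over an in-memory list (function and sample values are ours, for illustration only):

```python
def fetch_batch(records, offset, limit=2):
    """Return up to `limit` records with Id greater than `offset`, ordered by Id."""
    ordered = sorted(records, key=lambda r: r["Id"])
    return [r for r in ordered if r["Id"] > offset][:limit]

records = [{"Id": f"00{i}"} for i in range(1, 6)]
offset, seen = "", []
while True:
    batch = fetch_batch(records, offset)
    if not batch:
        break
    seen.extend(r["Id"] for r in batch)
    offset = batch[-1]["Id"]  # the last Id read becomes the next ${OFFSET}

print(seen)  # all five Ids, in order, with no duplicates
```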

And finally, on the Subscribe tab:

  • Push Topic: OpportunityCreate

Groovy Node

This node transforms Salesforce output into InfluxDB input. You need to do the following:

  • Set the required AccountId, if it is missing.
  • Translate Salesforce time into InfluxDB time.
  • Set a “measurement” to save the events into.

Example code for the Groovy tab:

for (record in records) {
  try {
    // Route all opportunity events into the "opportunity" measurement
    record.value['measurement'] = 'opportunity'
    // Convert the Salesforce ISO-8601 timestamp to epoch milliseconds for InfluxDB
    Date date = Date.parse("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", record.value['CreatedDate'])
    record.value['CreatedDate'] = date.getTime()
    // Tag values cannot be null; fall back to a placeholder account
    if (record.value['AccountId'] == null) {
      record.value['AccountId'] = 'unknown'
    }
    output.write(record)
  } catch (e) {
    log.error(e.toString(), e)
    error.write(record, e.toString())
  }
}
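For reference, the timestamp conversion the Groovy script performs can be sketched in Python as well (the function name is ours):

```python
from datetime import datetime, timezone

def salesforce_to_epoch_millis(ts: str) -> int:
    """Convert a Salesforce UTC timestamp (e.g. "2017-01-30T14:42:52.000Z")
    to epoch milliseconds, as stored in the CreatedDate field above."""
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)
```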

InfluxDB Node

This node sends the data to Anomaly.io, which uses the InfluxDB protocol, for automatic anomaly detection.

Enter the following on the InfluxDB tab:

  • URL: http://events.anomaly.io:8086
  • User: yourAnomalyUser
  • Password: yourAnomalyPasswd
  • Database Name: salesforce
  • Auto-create Database: Yes
  • Record Mapping: Custom Mappings
  • Measurement Field: /measurement
  • Tag Fields: /AccountId
  • Value Fields: /Amount /CreatedDate /CloseDate
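With these mappings, each record becomes one InfluxDB line-protocol point: the measurement name, the AccountId tag, then the field values. A minimal sketch of the encoding (helper name and sample values are ours; real line protocol also requires escaping and type suffixes):

```python
def to_line_protocol(measurement: str, tags: dict, fields: dict) -> str:
    """Encode one point in the InfluxDB line protocol:
    measurement,tag=val field=val,field=val"""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part}"

point = to_line_protocol(
    "opportunity",
    {"AccountId": "0010Y00000DX4EgQAL"},
    {"Amount": 100, "CloseDate": 1558953815000, "CreatedDate": 1485787372000},
)
print(point)
```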

Register Salesforce PushTopic

In our StreamSets configuration, the Salesforce origin will request all historical data, then subscribe to a channel we called “OpportunityCreate”. To set this up, we need to create this PushTopic in Salesforce:

  1. Log in to Salesforce.
  2. Open the Developer Console.
  3. Select Debug | Open Execute Anonymous Window.
  4. In the Enter Apex Code window, paste in the following Apex code, then click Execute.

PushTopic pushTopic = new PushTopic();
pushTopic.Name = 'OpportunityCreate';
pushTopic.Query = 'SELECT Id, AccountId, CreatedDate, CloseDate, Amount FROM Opportunity';
pushTopic.ApiVersion = 36.0;
pushTopic.NotifyForOperationCreate = true;
pushTopic.NotifyForOperationUpdate = false;
pushTopic.NotifyForOperationUndelete = false;
pushTopic.NotifyForOperationDelete = false;
pushTopic.NotifyForFields = 'Referenced';
insert pushTopic;

Run Streaming from Salesforce to InfluxDB (or Anomaly.io)

Click outside any node, then go to Configuration > Error Records and select “(Library: Basic)”. Now we can finally run the pipeline:

Datasets run Anomaly Detection

You should see something like the following. Note that here a total of 32 “Opportunities” were processed, and StreamSets is waiting for new events to come in:

anomaly pipe

Salesforce historical data has now been imported into InfluxDB. What about future “Opportunities”? Let’s check to make sure the Salesforce Streaming API works:

  1. Log in to workbench.developerforce.com
  2. Go to Data > Insert | Object Type “Opportunity” | Next
  3. Enter:
    AccountId: 0010Y00000DX4EgQAL
    Amount: 100
    CloseDate: 2019-05-27T10:43:35.000Z
    Name: My Opportunity
    StageName: Open

Data should be processed within one second and then sent to InfluxDB or Anomaly.io.

Detect Anomalies in Salesforce

Using Anomaly.io

You can spend a lot of money on R&D, or you can save that money and use anomaly.io to detect exceptions and issues in your data. After a week of intense machine learning, our algorithm picks the best anomaly detection method for your data, automatically applying techniques such as k-means clustering, anomaly correlation, breakout detection, and trend decomposition in real time.

Simply forward your data to anomaly.io; we’ll keep an eye on it.

Using Custom Anomaly Detection

Anomaly detection is complex, and especially difficult when you need to catch problems you haven’t seen before. Your users and business will evolve, so you can’t rely on simple static thresholds.

Before you start, always check to make sure that your data is compatible with the algorithms you are using. For example, a common error is to use an algorithm intended for normally distributed time series when your data isn’t normally distributed.
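As a rough illustration, one quick sanity check (our own heuristic, not a formal normality test such as Shapiro–Wilk) is to compare the share of points within one standard deviation of the mean against the ~68.3% expected for a Gaussian:

```python
import random
import statistics

def looks_gaussian(xs, tol=0.08):
    """Crude check: in a normal sample, about 68.3% of points fall
    within one standard deviation of the mean."""
    mu = statistics.mean(xs)
    sigma = statistics.stdev(xs)
    within = sum(1 for x in xs if abs(x - mu) <= sigma) / len(xs)
    return abs(within - 0.6827) <= tol

random.seed(42)
normal_data = [random.gauss(0, 1) for _ in range(10_000)]
ramp_data = list(range(100))  # uniform-like, not Gaussian

print(looks_gaussian(normal_data))  # True
print(looks_gaussian(ramp_data))    # False
```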

You can also forward your Salesforce data into an open source solution such as Twitter Anomaly Detection, Twitter Breakout, or Skyline, or develop your own.

InfluxDB Configuration Example with CollectD
https://anomaly.io/influxdb-configuration-collectd/ (Tue, 14 Apr 2015)


Note that this is just an InfluxDB configuration example. Change it to suit your own needs.

Run InfluxDB with its configuration file:

influxd -config="config.toml"

With this configuration, InfluxDB will:

  • Not require any login/password to read from or write to the database
  • Listen for commands on port 8086
  • Listen for CollectD metrics on port 25826
  • Save CollectD metrics in “collectd_db” (the InfluxDB database must already exist)
  • Run the admin web interface on port 8083
  • Record data on disk in the /var/opt/influxdb/ directory

# Welcome to the InfluxDB configuration file.

# If hostname (on the OS) doesn't return a name that can be resolved by the other
# systems in the cluster, you'll have to set the hostname to an IP or something
# that can be resolved here.
# hostname = ""
bind-address = "0.0.0.0"

# The default cluster and API port
port = 8086

# Once every 24 hours InfluxDB will report anonymous data to m.influxdb.com
# The data includes raft id (random 8 bytes), os, arch and version
# We don't track ip addresses of servers reporting. This is only used
# to track the number of instances running and the versions, which
# is very helpful for us.
# Change this option to true to disable reporting.
reporting-disabled = false

# Controls settings for initial start-up. Once a node is successfully started,
# these settings are ignored.  If a node is started with the -join flag,
# these settings are ignored.
[initialization]
join-urls = "" # Comma-delimited URLs, in the form http://host:port, for joining another cluster.

# Control authentication
# If not set, authentication is DISABLED. Be sure to explicitly set this flag to
# true if you want authentication.
[authentication]
enabled = false

# Configure the admin server
[admin]
enabled = true
port = 8083

# Configure the HTTP API endpoint. All time-series data and queries use this endpoint.
[api]
# ssl-port = 8087    # SSL support is enabled if you set a port and cert
# ssl-cert = "/path/to/cert.pem"

# Configure the Graphite plugins.
[[graphite]] # 1 or more of these sections may be present.
enabled = false
# protocol = "" # Set to "tcp" or "udp"
# address = "0.0.0.0" # If not set, is actually set to bind-address.
# port = 2003
# name-position = "last"
# name-separator = "-"
# database = ""  # store graphite data in this database

# Configure the collectd input.
[collectd]
enabled = true
#address = "0.0.0.0" # If not set, is actually set to bind-address.
port = 25826
database = "collectd_db"
typesdb = "/opt/collectd/share/collectd/types.db"

# Configure the OpenTSDB input.
[opentsdb]
enabled = false
#address = "0.0.0.0" # If not set, is actually set to bind-address.
#port = 4242
#database = "opentsdb_database"

# Configure UDP listener for series data.
[udp]
enabled = false
#bind-address = "0.0.0.0"
#port = 4444

# Broker configuration. Brokers are nodes which participate in distributed
# consensus.
[broker]
enabled = true
# Where the Raft logs are stored. The user running InfluxDB will need read/write access.
dir  = "/var/opt/influxdb/raft"

# Data node configuration. Data nodes are where the time-series data, in the form of
# shards, is stored.
[data]
enabled = true
dir = "/var/opt/influxdb/db"

# Auto-create a retention policy when a database is created. Defaults to true.
retention-auto-create = true

# Control whether retention policies are enforced and how long the system waits between
# enforcing those policies.
retention-check-enabled = true
retention-check-period = "10m"

# Configuration for snapshot endpoint.
[snapshot]
enabled = true # Enabled by default if not set.
bind-address = "127.0.0.1"
port = 8087

[logging]
write-tracing = false # If true, enables detailed logging of the write system.
raft-tracing = false # If true, enables detailed logging of Raft consensus.

# InfluxDB can store statistical and diagnostic information about itself. This is useful for
# monitoring purposes. This feature is disabled by default, but if enabled, these data can be
# queried like any other data.
[monitoring]
enabled = false
write-interval = "1m"          # Period between writing the data.
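On the collectd side, metrics are shipped to this listener through collectd’s network plugin. A minimal client snippet (the hostname is a placeholder for your InfluxDB server):

```
LoadPlugin network
<Plugin network>
  # Send metrics to the InfluxDB collectd listener configured above
  Server "influxdb.example.com" "25826"
</Plugin>
```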

Using Grafana v2 with InfluxDB V0.9
https://anomaly.io/grafana-with-influxdb/ (Tue, 14 Apr 2015)


influxdb with grafana monitoring

InfluxDB is often used with Grafana for dashboard monitoring. Grafana is a powerful metrics dashboard and graph editor. When Grafana is used with InfluxDB, you can watch your metrics in real time.

Requirements

Install and Start Grafana

On Debian / Ubuntu:

cd /tmp/
wget https://grafanarel.s3.amazonaws.com/builds/grafana_2.0.2_amd64.deb
apt-get install -y adduser libfontconfig
dpkg -i grafana_2.0.2_amd64.deb

systemctl start grafana-server #start grafana

On CentOS / Fedora (RedHat):

yum install https://grafanarel.s3.amazonaws.com/builds/grafana-2.0.0_beta3-1.x86_64.rpm

systemctl start grafana-server #start grafana

Grafana will start automatically at boot time:

  • Environment variables are located in /etc/default/grafana-server
  • Configuration file can be found in /etc/grafana/grafana.ini

By default, the Grafana UI listens on port 3000. Open a web browser to http://myserver.com:3000 (default login/password: admin/admin).

login-grafana

Setting InfluxDB as Grafana’s Data Source

Let’s set up Grafana and add InfluxDB as a data source for live monitoring. Since a picture is worth a thousand words, follow these graphical steps:

1- Click on the Menu

grafana menu

2- Select a New Data Source

open grafana menu

3- Add InfluxDB as Grafana Data Source

add influxdb to grafana

4- Create a New Dashboard

create grafana dashboard

5- Add a New Graph

add graph to grafana

6- Edit the Default Graph

edit-grafana

7- Configure the Graph

real-time metrics with influxdb

How to Compile InfluxDB from Source
https://anomaly.io/compile-influxdb/ (Thu, 09 Apr 2015)


compile influxdb with golang
InfluxDB has been written 100% in Go since version 0.9, which makes it quick and easy to compile. With a few commands, you can build your own InfluxDB binaries.

Install Golang

To install the Go compiler, you will first need a C compiler and git:

#debian / ubuntu
apt-get install build-essential git

#centos / fedora / redhat
yum install -y make automake gcc gcc-c++ kernel-devel git

#mac os x
xcode-select --install  # installs git and a C compiler (Xcode command-line tools)

Edit your ~/.bashrc (on Linux) or ~/.bash_profile (on Mac OS X) and add:

export GOPATH=$HOME/go
export GOBIN=$GOPATH/bin
export PATH=$PATH:$GOBIN

With those settings, Go packages will be downloaded to $HOME/go/src, and compiled binaries will be installed to $HOME/go/bin.

Download, compile and install Go (note: building Go 1.5 or later from source requires an existing Go toolchain as bootstrap; if you don’t have one, install a prebuilt binary release instead):

cd /tmp/
wget https://storage.googleapis.com/golang/go1.7.src.tar.gz
tar -xvf go1.7.src.tar.gz
cd go/src
./all.bash #install Golang
hash -r    #refresh PATH


Compile the Latest InfluxDB

go get github.com/influxdata/influxdb
cd $GOPATH/src/github.com/influxdata/influxdb
go get ./...
go install ./...


Start InfluxDB

The InfluxDB binaries will be located at $HOME/go/bin/influxd and $HOME/go/bin/influx.
As this directory is already in your PATH, you can start InfluxDB with the influxd command:

influxd

 8888888           .d888 888                   8888888b.  888888b.
   888            d88P"  888                   888  "Y88b 888  "88b
   888            888    888                   888    888 888  .88P
   888   88888b.  888888 888 888  888 888  888 888    888 8888888K.
   888   888 "88b 888    888 888  888  Y8bd8P' 888    888 888  "Y88b
   888   888  888 888    888 888  888   X88K   888    888 888    888
   888   888  888 888    888 Y88b 888 .d8""8b. 888  .d88P 888   d88P
 8888888 888  888 888    888  "Y88888 888  888 8888888P"  8888888P"

[run] 2016/08/25 18:25:09 InfluxDB starting, version unknown, branch unknown, commit unknown
[run] 2016/08/25 18:25:09 Go version go1.7, GOMAXPROCS set to 2
[run] 2016/08/25 18:25:09 no configuration provided, using default settings
[...]

You can now open the web admin at http://localhost:8083

Go further

You should run InfluxDB with a configuration file.
InfluxDB is often used with Grafana for dashboard monitoring.
