Saturday 17 September 2016

Open source monitoring with Prometheus & Grafana

Prometheus is an open source monitoring & system statistics gathering tool written in Go. It can gather statistics over a wide range of metrics & allows users to plot graphs based on selected metrics. Prometheus can be downloaded from the packages available on GitHub. Once downloaded, extract the file prometheus-1.1.2.linux-amd64.tar.gz.

[root@centops ~]# cd prometheus-1.1.2.linux-amd64
[root@centops prometheus-1.1.2.linux-amd64]# ls
console_libraries  consoles  data  LICENSE   NOTICE  prometheus  prometheus.yml  prometheus.yml.bkp  promtool
[root@centops prometheus-1.1.2.linux-amd64]#

We need to install the node exporter, which is the Prometheus exporter for machine metrics & has collectors for gathering data over a wide range of machine metrics. The package node_exporter-0.12.0rc3.linux-amd64.tar.gz can be downloaded from GitHub. Once downloaded, start the node exporter by typing:

[root@centops prometheus-1.1.2.linux-amd64]# ./node_exporter
INFO[0000] No directory specified, see --collector.textfile.directory  source=textfile.go:57
INFO[0000] Enabled collectors:                           source=node_exporter.go:146
INFO[0000]  - stat                                       source=node_exporter.go:148
INFO[0000]  - textfile                                   source=node_exporter.go:148
INFO[0000]  - uname                                      source=node_exporter.go:148
INFO[0000]  - conntrack                                  source=node_exporter.go:148
INFO[0000]  - entropy                                    source=node_exporter.go:148
INFO[0000]  - filefd                                     source=node_exporter.go:148
INFO[0000]  - mdadm                                      source=node_exporter.go:148
INFO[0000]  - netstat                                    source=node_exporter.go:148
INFO[0000]  - vmstat                                     source=node_exporter.go:148
INFO[0000]  - diskstats                                  source=node_exporter.go:148
INFO[0000]  - time                                       source=node_exporter.go:148
INFO[0000]  - version                                    source=node_exporter.go:148
INFO[0000]  - filesystem                                 source=node_exporter.go:148
INFO[0000]  - meminfo                                    source=node_exporter.go:148
INFO[0000]  - netdev                                     source=node_exporter.go:148
INFO[0000]  - loadavg                                    source=node_exporter.go:148
INFO[0000]  - sockstat                                   source=node_exporter.go:148
INFO[0000] Starting node_exporter v0.12.0rc3 at :9100    source=node_exporter.go:167
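The node exporter now serves its metrics at http://&lt;IP&gt;:9100/metrics in Prometheus's plain-text exposition format. As a rough sketch of what that format looks like, here is a minimal Python parser over a sample scrape (the sample values below are made up for illustration; a real scrape comes from the :9100/metrics endpoint):

```python
# Minimal parser for Prometheus's text exposition format (illustrative only;
# the sample payload is invented, real data comes from http://<IP>:9100/metrics).
sample = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
# HELP node_memory_MemFree Memory information field MemFree.
# TYPE node_memory_MemFree gauge
node_memory_MemFree 1.6422912e+09
"""

def parse_metrics(text):
    """Return a dict mapping metric name to float value, skipping comments."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE lines and blanks carry no samples
        name, value = line.rsplit(None, 1)  # value is always the last token
        metrics[name] = float(value)
    return metrics

print(parse_metrics(sample))
```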


Before we start Prometheus server we need to update its configuration file prometheus.yml as follows:

# my global config
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "node"

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['192.168.44.138:9100']
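The static_configs list can carry more than one target, and further jobs can be added alongside 'node'. For example (the second host below is hypothetical, shown only to illustrate the syntax):

```yaml
scrape_configs:
  - job_name: "node"
    scrape_interval: 5s
    static_configs:
      # Each target is scraped at <scheme>://<target><metrics_path>,
      # i.e. http://<host>:9100/metrics with the defaults above.
      - targets: ['192.168.44.138:9100', '192.168.44.139:9100']
```

After editing, the bundled promtool utility can be used to validate the file (in the 1.x releases the invocation is ./promtool check-config prometheus.yml).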

To start Prometheus type:

[root@centops prometheus-1.1.2.linux-amd64]# ./prometheus
INFO[0000] Starting prometheus (version=1.1.2, branch=master, revision=36fbdcc30fd13ad796381dc934742c559feeb1b5)  source=main.go:73
INFO[0000] Build context (go=go1.6.3, user=root@a74d279a0d22, date=20160908-13:12:43)  source=main.go:74
INFO[0000] Loading configuration file prometheus.yml     source=main.go:221
INFO[0000] Loading series map and head chunks...         source=storage.go:358
INFO[0000] 1445 series loaded.                           source=storage.go:363
WARN[0000] No AlertManagers configured, not dispatching any alerts  source=notifier.go:176
INFO[0000] Starting target manager...                    source=targetmanager.go:76
INFO[0000] Listening on :9090                            source=web.go:233

The Prometheus web interface will now be accessible via the URL http://<IP>:9090

To make sure that the Prometheus server is fetching data from the node exporter, click on the ‘Graph’ tab, choose any metric from the drop-down, then click the ‘Execute’ button to see the graph.
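Besides the web UI, Prometheus also serves query results over its HTTP API (the /api/v1/query endpoint). A small sketch of building such a request URL in Python, assuming the server address 192.168.44.138 used above (the expression must be URL-encoded):

```python
# Build a Prometheus HTTP API instant-query URL; the PromQL expression
# goes in the "query" parameter and must be URL-encoded.
from urllib.parse import urlencode

base = "http://192.168.44.138:9090/api/v1/query"
params = urlencode({"query": "node_load1{job='node'}"})
url = base + "?" + params
print(url)
# Fetching this URL (e.g. with urllib.request) returns JSON of the form:
# {"status":"success","data":{"resultType":"vector","result":[...]}}
```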




Integrating Prometheus with Grafana:

Grafana is a graphing & dashboard tool that can collect data from various sources & act as a front end, displaying the collected data in beautiful graphs & dashboards.

To add Prometheus as a data source for Grafana, do the following:
  1. Go to http://<IP>:3000
  2. Enter the username admin and password admin, and then click “Log In”.
  3. Click “Data Sources” on the left menu
  4. Click “Add new” on the top menu
  5. Add a default data source of type Prometheus with http://<IP>:9090 as the URL
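The same data source can also be created through Grafana's HTTP API by POSTing JSON to /api/datasources (authenticated as the admin user). A sketch of the payload, assuming Prometheus is reachable at 192.168.44.138:9090 as above:

```json
{
  "name": "Prometheus",
  "type": "prometheus",
  "url": "http://192.168.44.138:9090",
  "access": "proxy",
  "isDefault": true
}
```

With "access" set to "proxy", Grafana's backend forwards the queries to Prometheus, so the browser does not need direct network access to port 9090.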


Adding a Dashboard:

  1. Click “Dashboards” on the left menu
  2. Click “Home” on the top menu, and then “+ New” at the bottom of the panel that appears
  3. Click the green row tab, and then “Add Panel” and “Graph”
  4. Click on “no title (click here)” and then “Edit”
  5. Enter node_load1{job='node'} in the “Query” field. Here node_load1 is one of the metrics exposed by the node exporter, and node is the job name you specified in the prometheus.yml file.
  6. Click the floppy disk icon on the top menu to save your dashboard
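Any metric the node exporter exposes can be graphed the same way. A few example expressions to try in the “Query” field, using metric names as exposed by node_exporter 0.12 (later releases renamed several of these):

```
node_load1{job='node'}                   # 1-minute load average
node_memory_MemFree                      # free memory in bytes
rate(node_network_receive_bytes[5m])     # network receive throughput per second
rate(node_cpu{mode!="idle"}[5m])         # non-idle CPU time per second, per core
```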


