Recently, in pursuit of an open source solution to monitor a stack of Jenkins and TeamCity nodes, I got the chance to explore Prometheus and Grafana, which brought back all the fun I used to have with CentOS and open source. That prompted me to write a step-by-step guide to setting one up.
Roughly 12 years back, I wrote a blog post about Zabbix monitoring while I was managing a couple of data centres for a BPO in Delhi. In those days I worked on a number of solutions on CentOS, but later in my career I got more engaged with the Windows side of the world (Microsoft Azure, O365 and, even more so, Windows PowerShell) and lost touch with such open source monitoring solutions. What better way to get back in touch than Prometheus and Grafana?
Let’s go through the introductions first. What is Prometheus? Some sci-fi movie title?
Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud to meet requirements like a multi-dimensional data model, operational simplicity, scalable data collection, and a powerful query language, all in a single tool. The project was open-source from the beginning and began to be used by Boxever and Docker users as well, despite not being explicitly announced. Prometheus was inspired by the monitoring tool Borgmon used at Google. In May 2016, the Cloud Native Computing Foundation accepted Prometheus as its second incubated project, after Kubernetes. The blog post announcing this stated that the tool was in use at many companies including DigitalOcean, Ericsson, CoreOS, Weaveworks, Red Hat, and Google.
Second incubated project after Kubernetes? Impressive, no?
What about Grafana?
Grafana is a multi-platform open source analytics and interactive visualization web application. It provides charts, graphs, and alerts for the web when connected to supported data sources. A licensed Grafana Enterprise version with additional capabilities is also available as a self-hosted installation or an account on the Grafana Labs cloud service. It is expandable through a plug-in system. End users can create complex monitoring dashboards using interactive query builders. Grafana is divided into a front end and back end, written in TypeScript and Go, respectively.
As a visualization tool, Grafana is a popular component in monitoring stacks, often used in combination with time series databases such as InfluxDB, Prometheus and Graphite; monitoring platforms such as Sensu, Icinga, Checkmk, Zabbix, Netdata, and PRTG; SIEMs such as Elasticsearch and Splunk; and other data sources.
OK, enough introduction; let’s begin with the steps to set one up.
To keep things simple, we will disable SELinux to avoid any trouble. In production you might have different priorities, but for now we are focusing on functionality only.
sudo vi /etc/sysconfig/selinux
and change SELINUX=enforcing to SELINUX=disabled
Reboot the machine (init 6).
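If you would rather not edit the file by hand, the same change can be sketched with sed. This is only an illustration run against a scratch file; the variable name cfg and the sample contents are assumptions, not something from the original setup.

```shell
set -eu

# Scratch file standing in for the SELinux config (assumption: demo only).
# On a real CentOS host the file to edit would be /etc/selinux/config, which
# /etc/sysconfig/selinux links to, and the commands would need sudo.
cfg="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"

# Rewrite only the SELINUX= line; SELINUXTYPE= is left untouched.
sed 's/^SELINUX=enforcing$/SELINUX=disabled/' "$cfg" > "$cfg.new"
mv "$cfg.new" "$cfg"

cat "$cfg"
```

The change still only takes effect after the reboot.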
Let’s prepare the directories by creating the service account and required folders
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
Once that is out of the way, let’s download the Prometheus package (you might need to change the link to match the version you want)
wget https://github.com/prometheus/prometheus/releases/download/v2.39.1/prometheus-2.39.1.linux-amd64.tar.gz
Once the download finishes, let’s extract it to a directory, copy the relevant files and set up permissions
tar -xvzf prometheus-2.39.1.linux-amd64.tar.gz
sudo cp prometheus-2.39.1.linux-amd64/prometheus /usr/local/bin
sudo cp prometheus-2.39.1.linux-amd64/promtool /usr/local/bin
sudo cp -r prometheus-2.39.1.linux-amd64/consoles /etc/prometheus
sudo cp -r prometheus-2.39.1.linux-amd64/console_libraries /etc/prometheus
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo chown -R prometheus:prometheus /etc/prometheus/consoles
sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries
Next step: create the configuration
sudo vi /etc/prometheus/prometheus.yml
Type the below in the file. Note that it is YAML, which means no tabs, only spaces, with indentation in steps of two (2, 4, 6, 8…).
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: 'prometheus_master'
    scrape_interval: 5s
    static_configs:
      - targets: ['Prometheus-server-ip:9090']
Though I have used localhost and IP addresses here, in your environment you should preferably use DNS names instead. We will add more scrape_configs in the same format later on. Next, set permissions on the file so that the service can read it.
sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml
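Since a stray tab or a misaligned line is the most common way to break this file, a quick check before starting the service can save some head-scratching. The sketch below works on a scratch copy; the target 'localhost:9090' and the variable names are assumptions for illustration. promtool is the checker copied to /usr/local/bin earlier, and the sketch only calls it if it is on the PATH.

```shell
set -eu

# Scratch copy of the config (assumption: 'localhost:9090' stands in for
# your Prometheus server address).
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: 'prometheus_master'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
EOF

# YAML does not allow tabs for indentation, so flag any tab character.
tab="$(printf '\t')"
if grep -q "$tab" "$cfg"; then
  echo "tabs found in $cfg"
else
  echo "no tabs in $cfg"
fi

# Run the Prometheus config checker only when promtool is installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check config "$cfg"
fi
```

On the real server you would point the same two checks at /etc/prometheus/prometheus.yml instead of the scratch file.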
Oh wait! We need to create the service, let’s do the same
sudo vi /etc/systemd/system/prometheus.service
Type the below in the same
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
You can add other optional flags (like --web.enable-admin-api) to the ExecStart line in the service config.
The next part is to start the service. You need to run daemon-reload first, as is required after every change to a service config.
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
sudo systemctl status prometheus -l
Wait! We are missing something. You might have a firewall enabled by default, which would block the ports Prometheus requires. For now, let’s add one rule and reload the firewall; we will add more later on. In your case, the zone might be something other than public.
sudo firewall-cmd --zone=public --add-port=9090/tcp --permanent
sudo systemctl reload firewalld
If all is well, you should be able to access http://127.0.0.1:9090 locally and http://[your_server_ip]:9090 from outside. There are also options for adding a certificate to serve it over https and encrypt the communication between nodes and servers, but for now let’s keep it simple.
We are done with the Prometheus server now, so let’s start sending some data to it. That needs an exporter installed on each node: node_exporter on Linux, windows_exporter on Windows. Let’s install node_exporter on the Prometheus node itself (you can do this on any Linux node).
sudo useradd -rs /bin/false nodeusr
wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz
tar -xvzf node_exporter-1.4.0.linux-amd64.tar.gz
sudo mv node_exporter-1.4.0.linux-amd64/node_exporter /usr/local/bin
OK let’s setup the service now
sudo vi /etc/systemd/system/node_exporter.service
Type the below in the same
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=nodeusr
Group=nodeusr
Type=simple
ExecStart=/usr/local/bin/node_exporter --collector.systemd
[Install]
WantedBy=multi-user.target
The systemd collector is an additional, non-default collector, enabled here to collect service state. You can add other non-default collectors the same way. Let’s start the service now
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
sudo systemctl status node_exporter -l
If all is well and the service is running, add the port to the firewall config
sudo firewall-cmd --zone=public --add-port=9100/tcp --permanent
sudo systemctl reload firewalld
Now check if it worked by going to http://[your-server-ip]:9100 . If it worked then http://[your-server-ip]:9100/metrics should show all kind of data coming from the server.
Now it is time to add this data to Prometheus.
sudo vi /etc/prometheus/prometheus.yml
Add the below to the config (don’t miss the two spaces before "- job_name", in line with the earlier section)
  - job_name: 'node_exporter_centos'
    scrape_interval: 5s
    static_configs:
      - targets: ['node-exporter-ip:9100']
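Because every new job has to land under scrape_configs with exactly the same indentation, appending it with a small helper avoids retyping the spaces each time. The function add_scrape_job below is a hypothetical helper written for this sketch, not part of Prometheus, and it runs against a scratch file with assumed localhost targets.

```shell
set -eu

# add_scrape_job FILE JOB TARGET: append one scrape job to FILE.
# Hypothetical helper; it assumes scrape_configs is the last section of FILE.
add_scrape_job() {
  cat >> "$1" <<EOF
  - job_name: '$2'
    scrape_interval: 5s
    static_configs:
      - targets: ['$3']
EOF
}

# Scratch file standing in for /etc/prometheus/prometheus.yml.
cfg="$(mktemp)"
printf 'global:\n  scrape_interval: 10s\nscrape_configs:\n' > "$cfg"

add_scrape_job "$cfg" prometheus_master localhost:9090
add_scrape_job "$cfg" node_exporter_centos localhost:9100

cat "$cfg"
```

Each job ends up two spaces in, with its settings at four and its targets at six, which is the layout Prometheus expects.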
Now restart the service to load the config
sudo systemctl daemon-reload
sudo systemctl restart prometheus
sudo systemctl status prometheus -l
Did it work? Let’s check over http://[Prometheus-Server-IP]:9090/targets
If the server is listed there, check whether the data is reaching Prometheus by going to http://[Prometheus-Server-IP]:9090/graph and typing in a few queries.
Similarly, to install on a Windows node, head to the prometheus-community/windows_exporter repository on GitHub (Prometheus exporter for Windows machines) and download the latest release MSI.
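As a starting point, here are a few queries worth trying, sent through the Prometheus HTTP API with curl so they can be run from a terminal as well as typed into the graph page. The helper prom_query and the PROM_URL default are assumptions for this sketch; the metric names are the standard ones node_exporter exposes.

```shell
set -eu

# Assumption: Prometheus listens on localhost:9090; override PROM_URL otherwise.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query EXPR: run an instant query via the HTTP API (hypothetical helper).
# The response is raw JSON.
prom_query() {
  curl -fsS --max-time 5 -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
  echo
}

# Only send queries when the server answers on its health endpoint.
if curl -fsS --max-time 2 "$PROM_URL/-/healthy" >/dev/null 2>&1; then
  prom_query 'up'                               # 1 per target that is being scraped
  prom_query 'node_load1'                       # 1-minute load average
  prom_query 'node_memory_MemAvailable_bytes'   # available memory in bytes
  # CPU busy percentage per instance over the last 5 minutes
  prom_query '100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'
else
  echo "Prometheus not reachable at $PROM_URL, skipping queries"
fi
```

The same expressions (the part inside the single quotes) can be pasted directly into the query box on the graph page.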
msiexec /i C:\Users\Administrator\Downloads\windows_exporter.msi ENABLED_COLLECTORS="ad,iis,logon,memory,process,tcp,thermalzone" TEXTFILE_DIR="C:\custom_metrics\"
or similarly via the exe file
.\windows_exporter.exe --collectors.enabled "[defaults],process,container,thermalzone"
Once done, check http://[windows_server_ip]:9182/metrics to see whether the data is available there. Once all is OK, proceed to set up the firewall rule on the Prometheus server and then update prometheus.yml
sudo firewall-cmd --zone=public --add-port=9182/tcp --permanent
sudo systemctl reload firewalld
sudo vi /etc/prometheus/prometheus.yml
Type the below in the file (maintaining the YAML format: no tabs, spaces in pairs, remember)
  - job_name: 'windows_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['windows_node_ip:9182']
Now time to reload Prometheus
sudo systemctl daemon-reload
sudo systemctl restart prometheus
sudo systemctl status prometheus -l
Up to this point, we have set up the Prometheus server, one Linux server (which can be the Prometheus server itself or any other Linux server) and one Windows server. Depending on your environment, you might need to open different ports in the local firewall of the source OS or elsewhere. Ports 9090, 9100 and 9182 are the ones required so far; port 3000 will come in for the Grafana server, and port 9091 if you go for Pushgateway.
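For a single-machine test setup like mine, the firewall rules can be applied in one loop. This is a sketch: the functions open_monitoring_ports and run_cmd are hypothetical helpers, and the dry-run default (which only prints the commands) is an assumption made so the snippet is safe to paste. On real hosts, open only the ports that each host actually serves.

```shell
set -eu

# run_cmd: print the command when DRY_RUN is 1 (the default), else execute it.
run_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# open_monitoring_ports: add the ports used in this guide, then reload once.
# 9090 Prometheus, 9100 node_exporter, 9182 windows_exporter, 3000 Grafana.
open_monitoring_ports() {
  for port in 9090 9100 9182 3000; do
    run_cmd sudo firewall-cmd --zone=public --add-port="${port}/tcp" --permanent
  done
  run_cmd sudo firewall-cmd --reload
}

open_monitoring_ports
```

Set DRY_RUN=0 in the environment to actually run the commands instead of printing them.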
Let’s move ahead for the real beauty which is Grafana.
You can install the Grafana server on the same server as Prometheus or choose a different server. In my test setup, the Prometheus server, the Linux node and Grafana were all on one machine.
There are a number of ways to install Grafana.
You can add the Grafana repo and install via yum, like below:
sudo vi /etc/yum.repos.d/grafana.repo
and type the below if you are looking for the general release, or choose https://packages.grafana.com/oss/rpm-beta as the baseurl for the beta
[grafana]
name=Grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
Once done, a simple yum install follows
sudo yum install grafana -y
Or you can install Grafana via rpm like below:
wget https://dl.grafana.com/oss/release/grafana-9.2.0-1.x86_64.rpm
sudo yum install grafana-9.2.0-1.x86_64.rpm
Once installed then service setup
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
sudo systemctl status grafana-server -l
sudo firewall-cmd --zone=public --add-port=3000/tcp --permanent
sudo systemctl reload firewalld
Once done, Grafana should be available at http://[Grafana-Server-IP]:3000. The default username/password is admin/admin; obviously, you should change it immediately.
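Before any dashboard can show data, Grafana needs Prometheus registered as a data source. You can do that in the Grafana UI; another way, not part of the steps above, is a provisioning file. The sketch below writes one to a scratch directory; the file name prometheus.yaml and the URL http://localhost:9090 are assumptions matching a setup where Grafana and Prometheus share a machine.

```shell
set -eu

# Scratch directory standing in for /etc/grafana/provisioning/datasources
# (assumption: demo only; on a real host, write there with sudo instead).
prov_dir="$(mktemp -d)"

cat > "$prov_dir/prometheus.yaml" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF

cat "$prov_dir/prometheus.yaml"
```

With the file in the real provisioning directory, restarting grafana-server should make the data source appear without any clicking.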
So we have the systems ready now; the next step is to create awesome sci-fi-style dashboards. You can go to https://grafana.com/grafana/dashboards/ to find many community dashboards for different usages and sources, but I will try to give you a simple one each for Windows and Linux. You may improvise as per your needs.
Here is a simple one which I created for a Linux node.
I will describe in the next post how to create these and what the sample queries might be. Trust me, like the process so far, creating these will not be much of a hassle either.
Would love to hear your queries.