Proxmox Hypervisor Monitoring with Telegraf and InfluxDB

Published: 2021-05-05, Revised: 2022-02-21


hifi


TL;DR This post describes how to install Telegraf on Proxmox to collect sensor readings, SMART data, and metrics in InfluxDB 2.0.


Motivation Critical infrastructure needs monitoring. For the Proxmox hypervisor, I wanted to monitor:

  • the standard dashboard metrics (CPU, memory, disk, network)
  • SMART data from the attached drives
  • sensor readings (temperatures, fan speeds)
  • UPS status

InfluxDB is well suited for this purpose and can be connected directly to Grafana. The Proxmox interface already offers the option to connect to a metric server such as InfluxDB. However, it will only send the standard metrics that are available in the dashboard.

To include SMART monitoring and sensor readings, Telegraf must be installed on the Proxmox host.

There are some instructions available on how to do this, but I found no single source that covers all required steps.

This post covers:

  • redirecting Proxmox metric collection to a local socket
  • installing Telegraf on the Proxmox host
  • configuring the Telegraf plugins (socket_listener, smart, sensors, apcupsd, zfs)
  • testing the setup
  • building a dashboard in InfluxDB 2.0

Not included here is the setup of InfluxDB 2.0 itself. I have installed it in a separate LXC container running Debian, based on the default instructions from the docs.

Also, I run InfluxDB 2.0 behind a Nginx reverse proxy, which makes the interface available through HTTPS with Let's Encrypt SSL certs in a local subdomain. The instructions below are the same, regardless of whether InfluxDB is available through an IP or a domain name.

For the sake of completeness, see my nginx config for InfluxDB 2.0 below:
server {
        listen 443 ssl http2;
        listen [::]:443 ssl http2;
        server_name influx.local.mytld.com;

        ssl_certificate     /etc/nginx/ssl/wildcard.local.mytld.com.fullchain;
        ssl_certificate_key /etc/nginx/ssl/wildcard.local.mytld.com.key;

        ssl_protocols           TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!MEDIUM:!LOW:!aNULL:!NULL:!SHA;
        ssl_prefer_server_ciphers on;
        ssl_session_cache shared:SSL:10m;

        location / {
                proxy_pass http://localhost:8086;
                proxy_redirect off;
                proxy_http_version 1.1;
                proxy_max_temp_file_size 10m;
                proxy_connect_timeout 20;
                proxy_send_timeout 20;
                proxy_read_timeout 20;
                proxy_set_header Host $host;
                proxy_set_header Upgrade $http_upgrade;
                proxy_set_header Connection keep-alive;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Proto https;
                proxy_set_header X-Original-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Original-Proto https;
                proxy_cache_bypass $http_upgrade;
        }

}

When I go to https://influx.local.mytld.com, the InfluxDB 2.0 frontend opens.

Note that all metric collectors must be configured to use port 443 instead of 8086, and they must also have the current SSL certs available.


Redirect metric collection

The first step is to redirect the Proxmox metric collection to a local socket that can be consumed by Telegraf.

The setting file can be found at:

/etc/pve/status.cfg

Anything configured in the Proxmox web interface under Datacenter > Metric Server is stored in this file.

Edit the file (e.g. nano /etc/pve/status.cfg) and replace its contents with the following lines:

influxdb: InfluxDB
   server 127.0.0.1
   port 8089

You can select any name for the metric server here; I used InfluxDB.

Using these settings, Proxmox will send metrics internally to port 8089 on localhost, which we will connect to from Telegraf in the next step.
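What flows through this socket is InfluxDB line protocol in UDP datagrams. As a rough illustration of the shape of such a datagram — the tag and field names below are made up for the example, not the exact schema Proxmox emits (the `system` measurement does appear later in the Data Explorer):

```shell
# Illustrative line-protocol datagram, as consumed by the socket listener.
# Tag/field names here are examples only, not the exact Proxmox schema.
point="system,host=pve cpu=0.03,memused=8589934592i $(date +%s%N)"
printf '%s\n' "$point"
```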

Install Telegraf

I do not like making modifications to the Proxmox host for several reasons, but this is unavoidable* if you want to directly collect SMART data and sensor readings. The commands below are from the Telegraf docs (check them for changes first).

wget -qO- https://repos.influxdata.com/influxdb.key | sudo tee /etc/apt/trusted.gpg.d/influxdb.asc >/dev/null
source /etc/os-release
echo "deb https://repos.influxdata.com/${ID} ${VERSION_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt-get update && sudo apt-get install telegraf

* Not exactly. You could use PCI passthrough to forward all sensors to a VM. This would be the cleanest approach, but also the most laborious.

Configure Telegraf plugins

A sample telegraf.conf is available that contains all plugins.

Make a backup and create a new, empty telegraf.conf.

cp /etc/telegraf/telegraf.conf /etc/telegraf/telegraf.conf.bak
rm /etc/telegraf/telegraf.conf
nano /etc/telegraf/telegraf.conf

Use the following configuration settings as a template.

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

# Configuration for sending metrics to InfluxDB
[[outputs.influxdb_v2]]
  urls = ["https://influx.local.tld.com"]
  token = "your influxdb-2.0-token"
  organization = "your business name"
  bucket = "your_bucket"

# Gather metrics from Proxmox based on what is in /etc/pve/status.cfg
[[inputs.socket_listener]]
  service_address = "udp://:8089"

[[inputs.smart]]
    ## Optionally specify the path to the smartctl executable
    path_smartctl = "/usr/sbin/smartctl"
    path_nvme = "/usr/sbin/nvme"
    use_sudo = true
    devices = [ 
        "/dev/bus/0 -d megaraid,8",
        "/dev/bus/0 -d megaraid,9",
        "/dev/bus/0 -d megaraid,10",
        "/dev/bus/0 -d megaraid,11"]

[[inputs.sensors]]
    ## Remove numbers from field names.
    ## If true, a field name like 'temp1_input' will be changed to 'temp_input'.
    # remove_numbers = true

    ## Timeout is the maximum amount of time that the sensors command can run.
    # timeout = "5s"    

[[inputs.apcupsd]]
  # A list of running apcupsd server to connect to.
  # If not provided will default to tcp://127.0.0.1:3551
  servers = ["tcp://127.0.0.1:3551"]

  ## Timeout for dialing server.
  timeout = "5s"

[[outputs.influxdb_v2]]

In InfluxDB 2.0, add a bucket and an organization, then create a token. Telegraf will use this token to authenticate and write metrics.

Replace ["https://influx.local.mytld.com"] with your own InfluxDB 2.0 domain or IP and port.

[[inputs.socket_listener]]

This is the metrics socket that Telegraf listens on to collect the Proxmox dashboard metrics (resources etc.).

[[inputs.smart]]

This is the SMART input plugin of Telegraf.

If you have NVMe devices, install nvme-cli:

apt install nvme-cli
nvme list

Otherwise, remove the path_nvme entry.

To allow the Telegraf user to access smartctl, we need to install sudo and add an entry via visudo.

apt-get install sudo
sudo visudo

Add:

# Cmnd alias specification
Cmnd_Alias SMARTCTL = /usr/sbin/smartctl
telegraf  ALL=(ALL) NOPASSWD: SMARTCTL
Defaults!SMARTCTL !logfile, !syslog, !pam_session

These instructions come from Telegraf Issue 8690.2

You will also need to update the list of devices to collect SMART data from. I have an LSI MegaRAID MR9260-4i with two RAID 1 arrays, 2x Samsung SSD and 2x WD HDD, that are directly attached to the host.

This information can be shown with (e.g.):

cat /proc/scsi/scsi

> Host: scsi0 Channel: 02 Id: 00 Lun: 00
>   Vendor: LSI      Model: MR9260-4i        Rev: 2.13
>   Type:   Direct-Access                    ANSI  SCSI revision: 05
> Host: scsi0 Channel: 02 Id: 01 Lun: 00
>   Vendor: LSI      Model: MR9260-4i        Rev: 2.13
>   Type:   Direct-Access                    ANSI  SCSI revision: 05

Use smartctl to test which settings work for you:

smartctl --scan

> /dev/sda -d scsi # /dev/sda, SCSI device
> /dev/sdb -d scsi # /dev/sdb, SCSI device
> /dev/bus/0 -d megaraid,8 # /dev/bus/0 [megaraid_disk_08], SCSI device
> /dev/bus/0 -d megaraid,9 # /dev/bus/0 [megaraid_disk_09], SCSI device
> /dev/bus/0 -d megaraid,10 # /dev/bus/0 [megaraid_disk_10], SCSI device
> /dev/bus/0 -d megaraid,11 # /dev/bus/0 [megaraid_disk_11], SCSI device

I ignored /dev/sda and /dev/sdb and only selected megaraid devices 8 to 11.
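Turning the scan output into the telegraf.conf devices entries is just a matter of stripping the trailing comment and quoting each selected line. A small sketch, using the scan output from above as embedded sample input:

```shell
# Sample `smartctl --scan` output (copied from above); replace this with
# the real command's output on your host.
scan='/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/bus/0 -d megaraid,8 # /dev/bus/0 [megaraid_disk_08], SCSI device
/dev/bus/0 -d megaraid,9 # /dev/bus/0 [megaraid_disk_09], SCSI device'

# Keep only the megaraid entries, drop the comment, quote for TOML.
devices=$(printf '%s\n' "$scan" | grep megaraid | sed 's/ #.*//; s/^/    "/; s/$/",/')
printf '%s\n' "$devices"
# ->     "/dev/bus/0 -d megaraid,8",
# ->     "/dev/bus/0 -d megaraid,9",
```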

Test sample output:

smartctl -H /dev/bus/0 -d sat+megaraid,8

These commands vary, depending on the hardware config.

The final commands are then entered into the list of the telegraf.conf [[inputs.smart]] section.

devices = [ 
    "/dev/bus/0 -d megaraid,8",
    "/dev/bus/0 -d megaraid,9",
    "/dev/bus/0 -d megaraid,10",
    "/dev/bus/0 -d megaraid,11"]

If you are using an HBA (e.g. for ZFS), you can enter the drive paths directly.

devices = [ 
    "/dev/sdc --all",
    "/dev/sdd --all",
    "/dev/sde --all",
    "/dev/sdf --all",
    "/dev/sdg --all",
    "/dev/sdh --all"]

In this case, I prefer to use symlinks from /dev/disk/by-id/, to avoid switching drive letters.

Example
ls /dev/disk/by-id
devices = [ 
    "/dev/disk/by-id/ata-WDC_WD80EFAX-68KNBN0_VAG9DU9L --all",
    "/dev/disk/by-id/ata-WDC_WD80EFBX-68AZZN0_VRHZHPAK --all",
    "/dev/disk/by-id/ata-WDC_WD80EFBX-68AZZN0_VRJ4GAHK --all",
    "/dev/disk/by-id/ata-WDC_WD80EFZX-68UW8N0_R6GRD6YY --all",
    "/dev/disk/by-id/ata-WDC_WD80EFZX-68UW8N0_R6GRS29Y --all",
    "/dev/disk/by-id/ata-WDC_WD80EFZX-68UW8N0_R6GX82ZY --all",
    "/dev/disk/by-id/ata-WDC_WDS500G1R0A-68A4W0_21270C441210 --all",
    "/dev/disk/by-id/ata-WDC_WDS500G1R0A-68A4W0_21270C441916 --all",
    "/dev/bus/0 -d sat+megaraid,9 --all",
    "/dev/bus/0 -d sat+megaraid,11 --all"
    ]
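The by-id entries are plain symlinks to the current device nodes, which is why they survive drive-letter changes. A self-contained sketch of how to check where they point — demonstrated on a throwaway directory with a fake link; on a real host, loop over /dev/disk/by-id/* instead:

```shell
# Fake by-id directory for demonstration only.
dir=$(mktemp -d)
touch "$dir/sdc"
ln -s "$dir/sdc" "$dir/ata-WDC_EXAMPLE_SERIAL"

# Print which device node each stable name currently resolves to.
mapping=$(for link in "$dir"/ata-*; do
  printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
done)
printf '%s\n' "$mapping"
rm -r "$dir"
```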

Finally, if you want to collect all SMART attributes (e.g. Total_LBAs_Written):

[[inputs.smart]]
  attributes = true

[[inputs.sensors]]

To monitor sensors, you need lm-sensors.3

This may already be installed on proxmox.

apt-get install lm-sensors watch

You may need to run sensors-detect first to detect available sensors:

sudo sensors-detect

Check sensors with:

watch -n 1 sensors

> nct6776-isa-0a30
> Adapter: ISA adapter
> Vcore:          +1.46 V  (min =  +1.02 V, max =  +1.69 V)
> in1:            +1.87 V  (min =  +1.55 V, max =  +2.02 V)
> AVCC:           +3.39 V  (min =  +2.98 V, max =  +3.63 V)
> +3.3V:          +3.38 V  (min =  +2.98 V, max =  +3.63 V)
> in4:            +1.50 V  (min =  +0.97 V, max =  +1.65 V)
> in5:            +1.28 V  (min =  +1.07 V, max =  +1.39 V)
> in6:            +1.46 V  (min =  +0.89 V, max =  +1.23 V)  ALARM
> 3VSB:           +3.36 V  (min =  +2.98 V, max =  +3.63 V)
> Vbat:           +3.15 V  (min =  +2.70 V, max =  +3.63 V)
> fan1:             0 RPM  (min =  712 RPM)  ALARM
> fan2:          3006 RPM  (min =  712 RPM)
> fan3:           898 RPM  (min =  712 RPM)
> fan4:          5152 RPM  (min =  712 RPM)
> fan5:          5232 RPM  (min =  712 RPM)
> SYSTIN:         +44.0°C  (high = +85.0°C, hyst = +80.0°C)  sensor = thermistor
> CPUTIN:         +30.0°C  (high = +85.0°C, hyst = +80.0°C)  sensor = thermistor
> AUXTIN:          +2.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
> PECI Agent 0:    +0.0°C  (high = +80.0°C, hyst = +75.0°C)
>                          (crit = +100.0°C)
> PCH_CHIP_TEMP:   +0.0°C
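The sensors plugin collects these readings as numeric fields (e.g. temp_input). As a rough illustration of what it picks up — the plugin reads the values via lm-sensors itself rather than by parsing text, and the sample lines here are trimmed from the output above:

```shell
# Trimmed sample of `sensors` output (not the full output from my board).
sample='SYSTIN:         +44.0°C  (high = +85.0°C)
CPUTIN:         +30.0°C  (high = +85.0°C)'

# Extract the chip label and the numeric reading, roughly the values
# the sensors plugin stores as temperature fields.
temps=$(printf '%s\n' "$sample" | sed -E 's/^([A-Z]+): *[+]([0-9.]+).*/\1 \2/')
printf '%s\n' "$temps"
# -> SYSTIN 44.0
# -> CPUTIN 30.0
```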

[[inputs.apcupsd]]

If you have a UPS, such as one from APC, you need to set up the apcupsd server before Telegraf can receive metrics.

apt-get update
apt-get install apcupsd
# verify the USB connection
lsusb

Output:
> Bus 003 Device 002: ID 8087:8000 Intel Corp.
> Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> Bus 001 Device 002: ID 8087:8008 Intel Corp.
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
> Bus 002 Device 004: ID 0557:2419 ATEN International Co., Ltd
> Bus 002 Device 003: ID 0557:7000 ATEN International Co., Ltd Hub
> Bus 002 Device 002: ID 051d:0003 American Power Conversion UPS <-- this
> Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
nano /etc/apcupsd/apcupsd.conf
Example config:
UPSNAME SRT1000XLI
UPSCABLE usb
UPSTYPE usb
DEVICE
POLLTIME 60

For USB UPSes, DEVICE is left empty.

Afterwards, restart apcupsd and verify output:

systemctl restart apcupsd
systemctl status apcupsd.service
/sbin/apcaccess
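`apcaccess` prints `KEY : VALUE` pairs, which the apcupsd plugin later reports as fields. A sketch that pulls out the two most interesting ones, parsing a trimmed sample here rather than real output from my UPS:

```shell
# Trimmed sample of `apcaccess status` output, for illustration only.
sample='STATUS   : ONLINE
LINEV    : 230.0 Volts
BCHARGE  : 100.0 Percent
TIMELEFT : 42.0 Minutes'

# Pull out the UPS status and the battery charge.
fields=$(printf '%s\n' "$sample" | awk -F' *: *' '/^(STATUS|BCHARGE)/ {print $1, "=", $2}')
printf '%s\n' "$fields"
# -> STATUS = ONLINE
# -> BCHARGE = 100.0 Percent
```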

Then add the corresponding Telegraf plugin for local polling.

[[inputs.apcupsd]]
  # A list of running apcupsd server to connect to.
  # If not provided will default to tcp://127.0.0.1:3551
  servers = ["tcp://127.0.0.1:3551"]

  ## Timeout for dialing server.
  timeout = "5s"

[[inputs.zfs]]

There is a specific Telegraf plugin available for collecting ZFS stats.

ZFS telegraf.conf section
[[inputs.zfs]]
## ZFS kstat path. Ignored on FreeBSD
## If not specified, then default is:
# kstatPath = "/proc/spl/kstat/zfs"

## By default, telegraf gather all zfs stats
## Override the stats list using the kstatMetrics array:
## For FreeBSD, the default is:
# kstatMetrics = ["arcstats", "zfetchstats", "vdev_cache_stats"]
## For Linux, the default is:
# kstatMetrics = ["abdstats", "arcstats", "dnodestats", "dbufcachestats",
#     "dmu_tx", "fm", "vdev_mirror_stats", "zfetchstats", "zil"]

## By default, don't gather zpool stats
poolMetrics = true

## By default, don't gather dataset stats
datasetMetrics = true
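Before enabling the plugin, it is worth checking that the kstat path it reads actually exists; it is only present when the ZFS kernel module is loaded:

```shell
# Check for the Linux kstat path used by inputs.zfs.
if [ -d /proc/spl/kstat/zfs ]; then
  zfs_note=$(head -n 3 /proc/spl/kstat/zfs/arcstats)
else
  zfs_note="ZFS kstat path not found - is the zfs module loaded?"
fi
printf '%s\n' "$zfs_note"
```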

Test

Test the Telegraf configuration with these commands:

telegraf --debug
sudo -u telegraf telegraf --config /etc/telegraf/telegraf.conf  --test | grep smart

At this stage, I saw socket connection errors.

You can test if restarting the pvestatd service fixes these.

systemctl restart pvestatd

I still saw socket connection errors in tail --follow /var/log/syslog, but they were gone after a complete reboot of Proxmox.

If you later change telegraf.conf, reload Telegraf to apply changes.

systemctl reload telegraf

Configure Dashboard

Now it is time to head over to your InfluxDB 2.0 instance.

If you visualize data with Grafana, there is not much to do here. I found the new 2.0 interface sufficient for my needs, without requiring Grafana.

Create a Dashboard and then add Proxmox metrics through the Data Explorer.


It requires a bit of time to get used to the syntax, but I did not find this terribly complicated. The metrics from Proxmox are largely cryptic, but make sense after careful investigation.

For example, to show the disk read/write performance for each LXC container, use system > diskread/diskwrite > Select LXCs to monitor and then select derivative as the aggregate function, to render the increase of disk r/w in separate time buckets.


Here is an example (of type "Graph") that monitors HDD temperatures; the corresponding InfluxDB 2.0 query is below.


InfluxDB 2.0 Query
from(bucket: "your_bucket")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "smart_device")
|> filter(fn: (r) => r["_field"] == "temp_c")
|> filter(fn: (r) => r["serial_no"] == "R6GRD6YY" or 
                    r["serial_no"] == "R6GX82ZY" or 
                    r["serial_no"] == "VRHZHPAK" or 
                    r["serial_no"] == "VRJ4GAHK" or 
                    r["serial_no"] == "VAG9DU9L" or 
                    r["serial_no"] == "R6GRS29Y" or 
                    r["serial_no"] == "S3YJNF0JC37927V" or 
                    r["serial_no"] == "S3YJNF0JC31937V" or 
                    r["serial_no"] == "21270C441210" or 
                    r["serial_no"] == "21270C441916")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
|> yield(name: "mean")

I used the HDD serial IDs, so I can directly identify the physical drive.
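The serial_no tag used above comes from the drive itself; `smartctl -i <device>` shows it. A sketch of extracting it, parsing a trimmed sample here instead of real hardware output:

```shell
# Trimmed sample of `smartctl -i` output, for illustration only.
sample='Device Model:     WDC WD80EFZX-68UW8N0
Serial Number:    R6GRD6YY'

# The serial number becomes the serial_no tag on the smart metrics.
serial=$(printf '%s\n' "$sample" | awk -F': *' '/^Serial Number/ {print $2}')
printf '%s\n' "$serial"
# -> R6GRD6YY
```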

More Flux query examples:

Disk Wear (SSD) - extracted from extended Smart Attributes (Single Stat)

Note:

  • attributes = true must be set in telegraf.conf
  • most of these extended attribute names are vendor-specific
  • for instance, Samsung Evo SSDs report Total_LBAs_Written, Western Digital SSDs show Host_Writes_GiB
  • it makes sense to convert the values further to TBW
  • this requires defining a function and providing additional information such as Sector Size 4
  • replace example serial_no below with your disk serial ids
  • Note that these queries can be used directly in Grafana, too
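The conversion the queries below implement is plain arithmetic: multiply the raw Total_LBAs_Written value by the sector size and divide by bytes per TB (2^40). A standalone check with a made-up raw value (400000000000 is not a real reading):

```shell
# TBW = Total_LBAs_Written * sector size / bytes per TB.
tbw=$(awk 'BEGIN {
  lba_size = 512.0
  bytes_per_tb = 1099511627776.0   # 2^40
  raw = 400000000000               # made-up example value
  printf "%.2f TBW", raw * lba_size / bytes_per_tb
}')
printf '%s\n' "$tbw"
# -> 186.26 TBW
```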

Samsung SSDs:

BYTES_PER_GB=1073741824.0
BYTES_PER_TB=1099511627776.0
LBA_SIZE=512.0

total_lba_written = from(bucket: "your_bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "smart_attribute")
  |> filter(fn: (r) => r["_field"] == "raw_value")
  |> filter(fn: (r) => r["serial_no"] == "S3YJNF1JC37347V")
  |> filter(fn: (r) => r["name"] == "Total_LBAs_Written")
  |> keep(columns:["_time", "_value", "serial_no", "model"])
  |> last()
  |> toFloat()
  |> map(fn: (r) => ({
      r with
      _value: r._value * LBA_SIZE
      })
  )
  |> map(fn: (r) => ({
      r with
      _value: r._value / BYTES_PER_TB
      })
  )
  |> yield(name: "TBW (from LBAs written)")

Western Digital SSDs:

host_writes_gib = from(bucket: "your_bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "smart_attribute")
  |> filter(fn: (r) => r["_field"] == "raw_value")
  |> filter(fn: (r) => r["serial_no"] == "21270C411910")
  |> filter(fn: (r) => r["name"] == "Host_Writes_GiB")
  |> keep(columns:["_time", "_value", "serial_no", "model"])
  |> last()
  |> toFloat()
  |> map(fn: (r) => ({
      r with
      _value: r._value * BYTES_PER_GB
      })
  )
  |> map(fn: (r) => ({
      r with
      _value: r._value / BYTES_PER_TB
      })
  )
  |> yield(name: "TBW (from Host Writes)")


Disk Error Rate (Graph)

read_error_rate = from(bucket: "monkey")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "smart_device")
  |> filter(fn: (r) => r["_field"] == "read_error_rate")
  |> filter(fn: (r) => r["serial_no"] == "R6GX82ZY" or 
                       r["serial_no"] == "VRHZHPAK" or 
                       r["serial_no"] == "VRJ4GAHK" or 
                       r["serial_no"] == "VAG9DU9L" or 
                       r["serial_no"] == "R6GRS29Y" or 
                       r["serial_no"] == "R6GRS79Y" or 
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> keep(columns:["_time", "_value", "serial_no"])
  |> yield(name: "Read Error Rate (HDD)")

seek_error_rate = from(bucket: "monkey")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "smart_device")
  |> filter(fn: (r) => r["_field"] == "seek_error_rate")
  |> filter(fn: (r) => r["serial_no"] == "R6GX82ZY" or 
                       r["serial_no"] == "VRHZHPAK" or 
                       r["serial_no"] == "VRJ4GAHK" or 
                       r["serial_no"] == "VAG9DU9L" or 
                       r["serial_no"] == "R6GRS29Y" or 
                       r["serial_no"] == "R6GRS79Y" or 
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> keep(columns:["_time", "_value", "serial_no"])
  |> yield(name: "Seek Error Rate (HDD)")

udma_crc_errors = from(bucket: "monkey")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "smart_device")
  |> filter(fn: (r) => r["_field"] == "udma_crc_errors")
  |> filter(fn: (r) => r["serial_no"] == "R6GRL6YY" or 
                       r["serial_no"] == "R6GX84ZY" or 
                       r["serial_no"] == "VRHZHPXK" or 
                       r["serial_no"] == "VRJ0GAHK" or 
                       r["serial_no"] == "VAG9DU9L" or 
                       r["serial_no"] == "R6GRS79Y" or 
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> keep(columns:["_time", "_value", "serial_no"])
  |> yield(name: "UDMA CRC Errors")

This is really only basic visualization; anything fancier should be done in Grafana.

A final step would be to configure alerts in InfluxDB 2.0, to get notified when (e.g.) temperatures exceed a certain threshold, disks fill up, or the RAID health suddenly changes.

Changelog

2022-01-14

  • Add additional Flux examples (Disk Error, Disk Wear)
  • Add ZFS plugin example
  • Add smart extended attribute collection

2022-01-03 Minor Update:

  • Updated Telegraf install instructions
  • Added example to monitor HDD Temperatures
  • Added Telegraf Smart Config for HBA attached SCSI
  • Add APC plugin instructions

  1. Main source of steps: a blog post from Shift systems

  2. Instructions for updating sudoers: Telegraf Issue #8690

  3. Instructions to install lm-sensors: a Reddit post

  4. Converting Total_LBAs_Written to TBW: StackExchange