Reduce downtime and move from reactive to proactive monitoring.
April 19, 2021
Telegraf is a server-based agent for collecting all kinds of metrics for further processing. You can install it anywhere in your infrastructure, and it will read metrics from the sources you specify, typically application logs, events, or data outputs.
It consists of a main process and a convenient plugin ecosystem of input and output plugins. For example, you can use input plugins to collect metrics and output plugins to send them to other datastores or services. These plugins exchange data in the InfluxDB line protocol, a simple yet functional text format for metric points. The Telegraf agent acts as an adapter that streams metrics from the configured sources to the registered outputs.
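To make the line protocol format concrete, here is a minimal Ruby sketch that builds one line from a measurement name, tags, fields, and a timestamp. The helper names (`escape_key`, `escape_measurement`, `format_field`, `to_line`) and the sample values are hypothetical, and the escaping covers only the common cases, so treat it as an illustration rather than a complete serializer.

```ruby
# Sketch of the line protocol layout:
#   measurement,tag=value field=value timestamp
# Helper names are illustrative; escaping covers only commas, spaces, and equals signs.

def escape_key(value)
  value.to_s.gsub(/[,= ]/) { |char| "\\#{char}" }
end

def escape_measurement(value)
  value.to_s.gsub(/[, ]/) { |char| "\\#{char}" }
end

def format_field(value)
  case value
  when Integer then "#{value}i" # integers carry an "i" suffix
  when Float, true, false then value.to_s
  else
    escaped = value.to_s.gsub(/["\\]/) { |char| "\\#{char}" }
    "\"#{escaped}\"" # strings are double-quoted
  end
end

def to_line(measurement, tags, fields, timestamp_ns)
  tag_part = tags.sort_by { |key, _| key.to_s }
                 .map { |key, value| ",#{escape_key(key)}=#{escape_key(value)}" }
                 .join
  field_part = fields.map { |key, value| "#{escape_key(key)}=#{format_field(value)}" }.join(",")
  "#{escape_measurement(measurement)}#{tag_part} #{field_part} #{timestamp_ns}"
end

puts to_line("mem", { host: "web-1" }, { used_percent: 55.5, total: 1024 }, 1611590358000000000)
# prints: mem,host=web-1 used_percent=55.5,total=1024i 1611590358000000000
```

The tags are sorted by key only to keep the output deterministic in this sketch.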
Read along to see how you can use Telegraf to collect and push application performance metrics locally or in the cloud.
Before you can use Telegraf, you need to install it. Because the tool is written in Go, there are plenty of deployment options. At the time of writing, the latest Telegraf release is v1.17.0, which you can download from the official downloads page. Here’s how you can install it locally and with Docker:
You can install Telegraf using a .deb file on Linux or an .exe on Windows. On a Mac, you can use Homebrew, which is as simple as:
$ brew update
$ brew install telegraf
You can review the default Telegraf config file to get an idea of its format as follows:
$ cat /usr/local/etc/telegraf.conf | less
To test the binary, you can use the following command:
$ telegraf --test
2021-01-25T15:42:02Z I! Starting Telegraf 1.17.0
2021-01-25T15:42:02Z E! [telegraf] Error running agent: No config file specified, and could not find one in $TELEGRAF_CONFIG_PATH, /Users/theo.despoudis/.telegraf/telegraf.conf, or /etc/telegraf/telegraf.conf
Telegraf immediately reports that it cannot find a config file. To fix this, copy the default config into your working directory and point Telegraf at it:
$ cp /usr/local/etc/telegraf.conf .
$ export TELEGRAF_CONFIG_PATH=$(pwd)/telegraf.conf
Add the following lines to register an input collector for a PostgreSQL database that you run locally:
[[inputs.postgresql]]
  address = "postgresql://postgres:password@localhost/my_app_development"
If you have MySQL, you can use the [[inputs.mysql]] config instead.
Testing the binary again shows that the input collectors are working:
❯ telegraf --test
2021-01-25T15:55:28Z I! Starting Telegraf 1.17.0
2021-01-25T15:55:28Z I! Using config file: /Users/theo.despoudis/Workspace/telegraf-example/telegraf.conf
> mem,host=theo-despoudis active=14726529024i,available=15191564288i,available_percent=44.213271141052246,free=1435623424i,inactive=13755940864i,total=34359738368i,used=19168174080i,used_percent=55.786728858947754,wired=4406173696i 1611590358000000000
...
> postgresql,db=postgres,host=theo-despoudis,server=dbname\=my_app_development\ host\=localhost\ user\=postgres blk_read_time=0,blk_write_time=0,blks_hit=2642i,blks_read=207i,conflicts=0i,datid=16404i,datname="postgres",deadlocks=0i,numbackends=1i,temp_bytes=0i,temp_files=0i,tup_deleted=0i,tup_fetched=1388i,tup_inserted=0i,tup_returned=7455i,tup_updated=0i,xact_commit=21i,xact_rollback=0i 1611590358000000000
…
In this output, each line starts with a measurement name (mem, cpu, disk, or postgresql) that identifies the input collector that produced it. The tags come next, followed by the field values that were collected and a nanosecond timestamp.
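To show how such a line breaks down into its parts, here is a rough Ruby sketch of a parser. The function names are hypothetical, the sample line is a shortened version of the postgresql line above, and the sketch assumes that no quoted string field contains a space or comma, which a full parser would have to handle.

```ruby
# Rough parser for one line of `telegraf --test` output.
# Assumption: quoted string fields contain no spaces or commas.

# Splits on a delimiter unless it is preceded by a backslash.
def split_unescaped(text, delimiter)
  text.split(/(?<!\\)#{Regexp.escape(delimiter)}/)
end

# Removes the backslash in front of escaped commas, equals signs, and spaces.
def unescape(text)
  text.gsub(/\\([,= ])/, '\1')
end

def parse_line(line)
  body = line.strip.sub(/\A> /, "") # drop the "> " prefix that --test prints
  head, field_part, timestamp = split_unescaped(body, " ")
  measurement, *tag_pairs = split_unescaped(head, ",")
  tags = tag_pairs.to_h { |pair| split_unescaped(pair, "=").map { |part| unescape(part) } }
  fields = split_unescaped(field_part, ",").to_h { |pair| pair.split("=", 2) }
  { measurement: unescape(measurement), tags: tags, fields: fields, timestamp: Integer(timestamp) }
end

sample = '> postgresql,db=postgres,server=dbname\=my_app_development\ host\=localhost ' \
         'blks_hit=2642i,deadlocks=0i 1611590358000000000'
parsed = parse_line(sample)
puts parsed[:measurement]
puts parsed[:tags].inspect
puts parsed[:fields].inspect
```

Field values are kept as raw strings (for example `2642i`) so that the type suffix stays visible.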
You can also install Telegraf on Docker using the official image:
$ docker run -v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro telegraf
If you run into issues where Telegraf complains that it cannot connect to InfluxDB, you may need to comment out the empty configuration for [[outputs.influxdb]] and add a file output instead:
[[outputs.file]]
  files = ["stdout"]
The Telegraf config has many options and configuration parameters. For example, you can configure the frequency with which Telegraf will transmit the data, use different protocols (UDP), or prepend extra tags.
If you are interested in the complete list of input collector plugins and their configurations, you can visit this page.
In this part of the tutorial, we are going to collect performance metrics from a Rails application and send them to Telegraf. Telegraf is part of the TICK stack (Telegraf, InfluxDB, Chronograf, and Kapacitor), an open-source and developer-friendly way to run a complete monitoring stack with no upfront costs. You aren’t required to use InfluxDB to store your metrics, though.
First, we add this gem dependency in our Gemfile:
gem 'telegraf'
Then we install the bundle:
$ bundle install
Telegraf needs to listen on a port for incoming UDP packets. To set that up, add this configuration to the telegraf.conf file before you restart Telegraf:
[[inputs.socket_listener]]
  service_address = "udp://:8094"
Once you restart the agent, you can verify that the port is open and accepting connections:
2021-01-25T19:27:58Z I! [inputs.socket_listener] Listening on udp://[::]:8094
❯ lsof -i udp:8094
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
telegraf 50307 theo.despoudis 10u IPv6 0x94fb417809f5003f 0t0 UDP *:8094
Now we need to configure the Rails application to report request metrics to the Telegraf agent. First, add the configuration values in application.rb:
require 'telegraf/railtie'

module MyApp
  class Application < Rails::Application
    # Address of the Telegraf socket listener configured above
    config.telegraf.connect = 'udp://localhost:8094'

    # Report one point per Rack request in the "requests" series
    config.telegraf.rack.enabled = true
    config.telegraf.rack.series = 'requests'
    config.telegraf.rack.tags = {}

    # Report Active Job executions in the "active_job" series
    config.telegraf.active_job.enabled = true
    config.telegraf.active_job.series = "active_job"
    config.telegraf.active_job.tags = {}

    # Other config values
  end
end
This Railtie provides all the necessary hooks and initializers to connect and log requests in Rails and send them to the Telegraf agent.
Now you can start the server and see the logs in the Telegraf console:
requests,host=theo-despoudis,status=200 app_ms=1096.0680000134744,send_ms=0.7249999907799065,request_ms=1096.7960000270978 1611604693530085000
requests,host=theo-despoudis,status=200 app_ms=1386.7479999898933,send_ms=1.8629999831318855,request_ms=1388.6119999806397 1611604693826514000
requests,host=theo-despoudis,status=200 app_ms=1375.6130000110716,send_ms=14.771999965887517,request_ms=1390.3859999845736 1611604693826616000
requests,host=theo-despoudis,status=200 app_ms=1526.415000029374,send_ms=4.373999952804297,request_ms=1530.7920000050217 1611604693968698000
requests,host=theo-despoudis,status=200 app_ms=1516.6709999903105,send_ms=181.5570000326261,request_ms=1698.2309999875724 1611604694133314000
requests,host=theo-despoudis,status=200 app_ms=448.731999960728,send_ms=0.47400000039488077,request_ms=449.20899998396635 1611604694444767000
Each line shows the series name that we defined in the config (requests), the host name and status code as tags, and the request timings in milliseconds as fields.
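As a quick illustration of what you can do with these lines, the following Ruby sketch averages one timing field across a batch of them. The `mean_field` helper and the two sample lines (with rounded values) are hypothetical, and the pattern match is deliberately simple rather than a full line protocol parser.

```ruby
# Averages one numeric field across a list of line-protocol lines.
# Hypothetical helper; matches "<field>=<number>" after a space or comma.
def mean_field(lines, field)
  values = lines.filter_map do |line|
    match = line.match(/[ ,]#{Regexp.escape(field)}=([0-9.]+)/)
    match && Float(match[1])
  end
  values.empty? ? nil : values.sum / values.size
end

lines = [
  "requests,host=web-1,status=200 app_ms=10.0,send_ms=0.5,request_ms=10.5 1611604693530085000",
  "requests,host=web-1,status=200 app_ms=20.0,send_ms=0.5,request_ms=20.5 1611604693826514000"
]
puts mean_field(lines, "request_ms")
# prints: 15.5
```

In practice you would let the output destination do this kind of aggregation; the sketch only shows that the fields are plain key-value pairs that are easy to work with.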
Sending metrics to Telegraf is simple. You just need the connection details of a listener, and you can send metrics using the InfluxDB line protocol. There are also community libraries for Python and Rust, and there is the Jolokia2 Agent for Java.
To view your Telegraf metrics in Sumo Logic, you need to use the Sumo Logic Output Plugin.
To use this plugin, you need to have a Hosted Collector available and configured with an HTTP Source for metrics.
Then you need to connect Telegraf to the Sumo Logic source by adding the following configuration:
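If you are not using Rails, a minimal sketch of sending a metric yourself could look like the following. It uses only Ruby's standard `socket` library and assumes the UDP socket listener from earlier is running on port 8094. The `send_metric` helper and the `deploys` measurement are made up for illustration.

```ruby
require "socket"

# Sends one line-protocol record as a single UDP datagram.
# Assumption: a Telegraf socket_listener input is bound to the given host and port.
def send_metric(line, host: "127.0.0.1", port: 8094)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
ensure
  socket&.close
end

# Current time in nanoseconds, matching the timestamps in the output above.
timestamp_ns = Process.clock_gettime(Process::CLOCK_REALTIME, :nanosecond)
send_metric("deploys,app=my_app count=1i #{timestamp_ns}")
```

UDP is fire-and-forget, so this call does not tell you whether Telegraf received the point. Check the agent's output to confirm.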
[[outputs.sumologic]]
  url = "https://events.sumologic.net/receiver/v1/http/<HTTPSourceCode>"
  data_format = "carbon2"
Once you’ve configured Telegraf to collect and transmit your metrics to Sumo Logic, you can log into your account to view your data. Just navigate to the App Catalog and select your application. You can use the Sumo Logic Query language to perform familiar queries, performance monitoring, and custom visualizations.
The diagram below illustrates where Telegraf fits into a Kubernetes environment monitored by Sumo Logic. In this example, we’re monitoring an NGINX deployment in a Kubernetes cluster, with Prometheus and FluentD making up the rest of the metrics collection pipeline. The cluster contains two nodes, each with NGINX containers.
The first service in the pipeline is Telegraf, which collects metrics from NGINX. In this case, we’re running Telegraf in each pod we want to collect metrics from. Telegraf uses an input plugin to obtain metrics, in this case, the NGINX input plugin.
The Sumo Logic Helm chart for Kubernetes collection packages all of these components up as part of the collection process for the Sumo Logic Kubernetes Solution.
We’ve only shown you a simplified example with one Telegraf agent. In a real production environment, you may have to configure Telegraf agents on each node. Each agent would then collect and stream metrics into centralized collectors that can ingest large volumes of data. The Sumo Logic Output plugin is an excellent resource for this.
Sumo Logic is a trusted cloud monitoring and observability platform that can meet the needs of all kinds of enterprises, from SMBs to large conglomerates. If you use Telegraf as a metrics collector but need a better and more seamless experience when analyzing metrics, you can take advantage of Sumo Logic’s free trial. Sign up here and see what it has to offer.