Document updated on Jan 29, 2024
Telemetry and Monitoring through OpenTelemetry
OpenTelemetry (OTEL for short) offers a comprehensive, unified, and vendor-neutral approach to collecting and managing telemetry data, providing enhanced observability and deeper insights into application performance and behavior. It’s particularly beneficial in complex, distributed, and cloud-native environments.
OpenTelemetry captures detailed, contextual information about the operation of your applications. This includes not only metrics but also tracing data that shows the full lifecycle of requests as they flow through your systems, providing insights into performance bottlenecks, latency issues, and error diagnostics.
It supports auto-instrumentation and can be integrated seamlessly into cloud-native deployments, making it easier to monitor these dynamic environments.
KrakenD has traditionally offered part of its telemetry through the OpenCensus integration, which has provided a reliable service for over six years. We are transitioning to the more modern and robust OpenTelemetry framework, and the OpenCensus integration no longer receives updates.
Although the underlying protocol specification of OpenTelemetry is stable, its components are at mixed stability stages in their lifecycle. We cannot predict what changes will come as the technology evolves, but KrakenD will always do its best to maintain compatibility between versions. More information about the underlying exporter can be found here.
Collecting metrics and traces
The telemetry/opentelemetry component in KrakenD collects the activity generated for the enabled layers and pushes or exposes the data for pulling. There are two ways of publishing metrics:
- OpenTelemetry protocol (OTLP) - push
- Prometheus - pull
You can use both simultaneously if needed, and even multiple instances of each.
When you add OpenTelemetry in the configuration, you will have different metrics available.
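As a minimal sketch of using both exporter families at once, the following configuration declares two OTLP exporters and one Prometheus exporter. The exporter names and the hosts localhost and collector.example.com are hypothetical placeholders, not values from this document. Only fields described in the reference below are used.

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "exporters": {
        "otlp": [
          { "name": "local_collector", "host": "localhost", "port": 4317 },
          { "name": "remote_collector", "host": "collector.example.com", "port": 4317 }
        ],
        "prometheus": [
          { "name": "local_prometheus", "port": 9090 }
        ]
      }
    }
  }
}
```

With a layout like this, the same activity would be pushed to both collectors and also exposed for pulling on port 9090.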
Prometheus exporter (pull)
Choose the prometheus exporter when you want KrakenD to expose a new port offering a /metrics endpoint, so that an external Prometheus job can connect to a URL like http://krakend:9090/metrics and retrieve all the data.
See how to configure Prometheus
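A minimal sketch of a pull-only setup might look as follows. The exporter name is an illustrative assumption, and the port is the documented default of 9090, which matches the example URL http://krakend:9090/metrics above.

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "exporters": {
        "prometheus": [
          { "name": "local_prometheus", "port": 9090 }
        ]
      }
    }
  }
}
```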
OTLP exporter (push)
Choose the otlp exporter when you want to push the metrics to a local or remote collector or directly to a SaaS or storage system that supports native OTLP (there is a large number of supported providers).
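A minimal sketch of a push-only setup, assuming a collector reachable at the hypothetical host otel-collector and listening on the default gRPC port 4317:

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "exporters": {
        "otlp": [
          {
            "name": "my_collector",
            "host": "otel-collector",
            "port": 4317,
            "use_http": false
          }
        ]
      }
    }
  }
}
```

The exporter name my_collector is also an assumption; any unique name should work.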
The host where your collector lives can also point to an external load balancer between KrakenD and multiple collectors if needed.
Enterprise users can push directly to external storage, passing auth credentials through the telemetry/opentelemetry-security component, so the collector is not needed anymore.
This strategy saves a lot of time during the setup of KrakenD.
OpenTelemetry Configuration
To enable OpenTelemetry, you will need a Prometheus or an OTEL Collector (or both) and add the telemetry/opentelemetry namespace at the top level of your configuration.
The configuration of the telemetry/opentelemetry namespace is very extensive, but the two key entries are:
- exporters, defining the different technologies you will use
- layers, the amount of data you want to report
The entire configuration is as follows:
Fields of OpenTelemetry
- exporters (object, required): The places where you will send telemetry data. You can declare multiple exporters, even of the same type. This is useful, for instance, when you have a self-hosted Grafana, would like to migrate to its cloud version, and want to check the double reporting during the transition. There are two families of exporters: otlp or prometheus.
  - otlp (array): The list of OTLP exporters you want to use. Set at least one object to push metrics and traces to an external collector using OTLP. Each item is an object with the following properties:
    - disable_metrics (boolean): Disable metrics in this exporter (leaving only traces, if any). It won’t report any metrics when the flag is true. Defaults to false.
    - disable_traces (boolean): Disable traces in this exporter (leaving only metrics, if any). It won’t report any traces when the flag is true. Defaults to false.
    - host (string): The host where you want to push the data. It can be a sidecar or a remote collector.
    - name (string): A unique name to identify this exporter. Examples: "local_prometheus", "remote_grafana"
    - port (integer): A custom port to send the data. The port defaults to 4317 for gRPC, or to 4318 when you enable use_http. Defaults to 4317.
    - use_http (boolean): Whether this exporter uses HTTP instead of gRPC.
  - prometheus (array): Set here the settings for at least one Prometheus exporter. Each exporter will start a local port that offers metrics to be pulled from KrakenD. Each item is an object with the following properties:
    - disable_metrics (boolean): Leave this exporter declared but disabled (useful in development). It won’t report any metrics when the flag is true. Defaults to false.
    - go_metrics (boolean): Whether you want fine-grained details of Go language metrics or not.
    - listen_ip (string): The IP address that KrakenD listens to in IPv4 or IPv6. You can, for instance, expose the Prometheus metrics only in a private IP address. An empty string or no declaration means listening on all interfaces. The inclusion of :: is intended for IPv6 format only (this is not the port). Examples of valid addresses are 192.0.2.1 (IPv4) and 2001:db8::68 (IPv6). The values :: and 0.0.0.0 listen to all addresses, which are valid for IPv4 and IPv6 simultaneously. Examples: "172.12.1.1", "::1". Defaults to "0.0.0.0".
    - name (string): A unique name to identify this exporter. Examples: "local_prometheus", "remote_grafana"
    - port (integer): The port in KrakenD where Prometheus will connect to. Defaults to 9090.
    - process_metrics (boolean): Whether this exporter shows detailed metrics about the running process, like CPU or memory usage, or not.
- layers (object): A request and response flow passes through three different layers. This attribute lets you specify what data you want to export in each layer. All layers are enabled by default unless you declare this section.
  - backend (object): Reports the activity between KrakenD and each of your backend services. This is the most granular layer.
    - metrics (object)
      - detailed_connection (boolean): Whether you want to enable detailed metrics for the HTTP connection phase or not. Includes times to connect, DNS querying, and the TLS handshake. Defaults to false.
      - disable_stage (boolean): Whether to turn off the metrics or not. Setting this to true means stop reporting any data. Defaults to false.
      - read_payload (boolean): Whether you want to enable metrics for the response reading payload or not (HTTP connection not taken into account). Defaults to false.
      - round_trip (boolean): Whether you want to enable metrics for the actual HTTP request for the backend or not (manipulation not taken into account). This is the time your backend needs to produce a result. Defaults to false.
      - static_attributes (array): A list of tags or labels you want to associate with these metrics. Example: [{"key":"my_metric_attr","value":"my_metric_val"}]
    - traces (object)
      - detailed_connection (boolean): Whether you want to add detailed trace attributes for the HTTP connection phase or not. Includes times to connect, DNS querying, and the TLS handshake. Defaults to false.
      - disable_stage (boolean): Whether to turn off the traces or not. Setting this to true means stop reporting any data. Defaults to false.
      - read_payload (boolean): Whether you want to add trace attributes for the response reading payload or not (HTTP connection not taken into account). Defaults to false.
      - report_headers (boolean): Whether you want to report the final headers that reached the backend. Defaults to false.
      - round_trip (boolean): Whether you want to add trace attributes for the actual HTTP request for the backend or not (manipulation not taken into account). This is the time your backend needs to produce a result. Defaults to false.
      - static_attributes (array): A list of tags or labels you want to associate with these traces. Example: [{"key":"my_trace_attr","value":"my_trace_val"}]
  - global (object): Reports the activity between end-users and KrakenD.
    - disable_metrics (boolean): Whether you want to disable all metrics happening in the global layer or not. Defaults to false.
    - disable_propagation (boolean): Whether you want to ignore previous propagation headers to KrakenD. When the flag is set to true, spans from a previous layer will never be linked to the KrakenD trace. Defaults to false.
    - disable_traces (boolean): Whether you want to disable all traces happening in the global layer or not. Defaults to false.
    - metrics_static_attributes (array): Static attributes you want to pass for metrics.
    - report_headers (boolean): Whether you want to send all headers that the consumer passed in the request or not. Defaults to false.
    - traces_static_attributes (array): Static attributes you want to pass for traces.
  - proxy (object): Reports the activity at the beginning of the proxy layer, including spawning the required requests to multiple backends, merging, endpoint transformation, and any other internals of the proxy between the request processing and the backend communication.
    - disable_metrics (boolean): Whether you want to disable all metrics happening in the proxy layer or not. Defaults to false.
    - disable_traces (boolean): Whether you want to disable all traces happening in the proxy layer or not. Defaults to false.
    - metrics_static_attributes (array): Static attributes you want to pass for metrics.
    - report_headers (boolean): Whether you want to report all headers that passed from the request to the proxy layer (input_headers policy in the endpoint plus KrakenD’s headers). Defaults to false.
    - traces_static_attributes (array): Static attributes you want to pass for traces.
- metric_reporting_period (integer): How often you want to report and flush the metrics, in seconds. This setting is only used by otlp exporters. Defaults to 30.
- service_name (string): A friendly name identifying metrics reported by this installation. When unset, it uses the name attribute in the root level of the configuration.
- service_version (string): The version you are deploying. This can be useful for deployment tracking.
- skip_paths (array): The paths you don’t want to report. Use the literal value used in the endpoint definition, including any {placeholders}. In the global layer, this attribute works only on metrics, because traces are initiated before there is an endpoint to match against. If you do not want any path skipped, just add an array with an empty string [""]. Example: "/foo/{bar}". Defaults to ["/__health","/__debug/","/__echo/","/__stats/"].
- trace_sample_rate (number): The sample rate for traces defines the percentage of reported traces. This option is key to reducing the amount of data generated (and resource usage) while still letting you debug and troubleshoot issues. For instance, a value of 0.25 will report 25% of the traces seen in the system. Example: 0.25. Defaults to 1.
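The top-level tuning fields can be combined as in the sketch below, which samples 25% of the traces, flushes OTLP metrics every 30 seconds, and skips two paths. The service name, version, exporter name, and host are hypothetical. The sketch also assumes that declaring skip_paths replaces the default list rather than extending it, which is why /__health is listed again.

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "service_name": "my_gateway",
      "service_version": "1.2.3",
      "metric_reporting_period": 30,
      "trace_sample_rate": 0.25,
      "skip_paths": ["/__health", "/foo/{bar}"],
      "exporters": {
        "otlp": [
          { "name": "local_collector", "host": "localhost", "port": 4317 }
        ]
      }
    }
  }
}
```

Lowering trace_sample_rate is usually the first setting to try when the volume of trace data becomes a cost or performance concern.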
Here’s an example with Grafana Tempo and Prometheus.
{
  "version": 3,
  "$schema": "https://www.krakend.io/schema/v2.8/krakend.json",
  "extra_config": {
    "telemetry/opentelemetry": {
      "service_name": "krakend_middle_service",
      "service_version": "commit-sha-ACBDE1234",
      "exporters": {
        "prometheus": [
          {
            "name": "my_prometheus",
            "port": 9092,
            "listen_ip": "::1",
            "process_metrics": false,
            "go_metrics": false
          }
        ],
        "otlp": [
          {
            "name": "local_tempo",
            "host": "localhost",
            "port": 4317,
            "use_http": false
          }
        ]
      },
      "layers": {
        "global": {
          "disable_metrics": false,
          "disable_traces": false,
          "disable_propagation": false
        },
        "proxy": {
          "disable_metrics": false,
          "disable_traces": false
        },
        "backend": {
          "metrics": {
            "disable_stage": false,
            "round_trip": true,
            "read_payload": true,
            "detailed_connection": true,
            "static_attributes": [
              {
                "key": "my_metric_attr",
                "value": "my_middle_metric"
              }
            ]
          },
          "traces": {
            "disable_stage": false,
            "round_trip": true,
            "read_payload": true,
            "detailed_connection": true,
            "static_attributes": [
              {
                "key": "my_metric_attr",
                "value": "my_middle_metric"
              }
            ]
          }
        }
      },
      "skip_paths": [
        "/foo/{bar}"
      ]
    }
  }
}
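As a variation on the example above, the following sketch sends only traces over OTLP/HTTP and exposes Prometheus metrics on a single private address with Go and process metrics enabled. The exporter names and the host collector.example.com are hypothetical, and 192.0.2.1 is the documentation address used in the listen_ip description, so replace it with an address that exists on your machine. The layers section is left out on purpose, so all layers keep their default behavior.

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "exporters": {
        "otlp": [
          {
            "name": "remote_traces",
            "host": "collector.example.com",
            "port": 4318,
            "use_http": true,
            "disable_metrics": true
          }
        ],
        "prometheus": [
          {
            "name": "private_prometheus",
            "port": 9090,
            "listen_ip": "192.0.2.1",
            "go_metrics": true,
            "process_metrics": true
          }
        ]
      }
    }
  }
}
```

Splitting responsibilities this way avoids reporting the same metrics twice: the OTLP exporter carries traces only, while Prometheus pulls the metrics.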
Examples of integrations
- Push metrics to InfluxDB
- Pull metrics from Prometheus
- Push metrics to Datadog
- Push metrics to Zipkin
- Push metrics to Jaeger
- Push metrics to Azure Monitor