Document updated on Jan 29, 2024
Since | v2.6 |
---|---|
Namespace | telemetry/opentelemetry |
Log prefix | [SERVICE: OpenTelemetry] |
Scope | service |
Source | krakend/krakend-otel |
OpenTelemetry (OTEL for short) offers a comprehensive, unified, and vendor-neutral approach to collecting and managing telemetry data, providing enhanced observability and deeper insights into application performance and behavior. It’s particularly beneficial in complex, distributed, and cloud-native environments.
OpenTelemetry captures detailed, contextual information about the operation of your applications. This includes not only metrics but also tracing data that shows the full lifecycle of requests as they flow through your systems, providing insights into performance bottlenecks, latency issues, and error diagnostics.
It supports auto-instrumentation and can be integrated seamlessly into cloud-native deployments, making it easier to monitor these dynamic environments.
KrakenD has traditionally offered part of its telemetry integration through OpenCensus, which has provided a reliable service for over six years. We are transitioning to the more modern and robust OpenTelemetry framework, and the OpenCensus integration no longer receives updates.
Although the underlying protocol specification of OpenTelemetry is stable, you’ll find mixed stability statuses in the lifecycle of its components. We cannot predict what changes there will be as the technology evolves, but KrakenD will always do its best to maintain compatibility between versions. More information about the underlying exporter can be found here.
The telemetry/opentelemetry component in KrakenD collects the activity generated for the enabled layers and pushes or exposes the data for pulling. There are two ways of publishing metrics: exposing them for pulling with the prometheus exporter, or pushing them with the otlp exporter.
You can use both simultaneously if needed, and even multiple instances of each.
When you add OpenTelemetry in the configuration, you will have different metrics available.
Choose the prometheus exporter when you want KrakenD to expose a new port offering a /metrics endpoint, so that an external Prometheus job can connect to a URL like http://krakend:9090/metrics and retrieve all the data.
See how to configure Prometheus
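As a minimal sketch of the pull model described above, the following configuration declares a single prometheus exporter listening on port 9090, matching the example URL. The exporter name my_prometheus is only an illustrative label, and the attributes are limited to ones that appear in the full example later on this page. Treat it as a starting point rather than a complete setup.

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "exporters": {
        "prometheus": [
          {
            "name": "my_prometheus",
            "port": 9090
          }
        ]
      }
    }
  }
}
```

With a configuration like this, a Prometheus scrape job pointing at the KrakenD host on port 9090 would be expected to read the data from /metrics.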
Choose the otlp exporter when you want to push the metrics to a local or remote collector, or directly to a SaaS or storage system that supports native OTLP (a large number of providers do).
The host where your collector lives can also point to an external load balancer between KrakenD and multiple collectors if needed.
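The push model can be sketched with a single otlp exporter whose host points at a collector or at a load balancer in front of several collectors. The hostname collector-lb.example.internal and the exporter name are hypothetical placeholders. Port 4317 with use_http set to false mirrors the full example later on this page, and is assumed here to mean the exporter does not use HTTP transport.

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "exporters": {
        "otlp": [
          {
            "name": "balanced_collectors",
            "host": "collector-lb.example.internal",
            "port": 4317,
            "use_http": false
          }
        ]
      }
    }
  }
}
```

Because KrakenD only sees the host value, moving from one collector to several behind a balancer should not require any other change in this block.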
Enterprise users can push directly to external storage, passing auth credentials using the telemetry/opentelemetry-security component, so the collector is no longer needed.
This strategy saves a lot of time during the setup of KrakenD.
To enable OpenTelemetry, you will need a Prometheus or an OTEL Collector (or both), and you must add the telemetry/opentelemetry namespace at the top level of your configuration.
The configuration of the telemetry/opentelemetry namespace is very extensive, but the two key entries are:
- exporters, defining the different technologies you will use
- layers, the amount of data you want to report

The entire configuration is as follows:
Field | Description |
---|---|
exporters | The places where you will send telemetry data. You can declare multiple exporters even when they are of the same type. For instance, when you have a self-hosted Grafana and would like to migrate to its cloud version and check the double reporting during the transition. There are two families of exporters: otlp or prometheus. |
layers | A request and response flow passes through three different layers. This attribute lets you specify what data you want to export in each layer. All layers are enabled by default unless you declare this section. |
metric_reporting_period | How often you want to report and flush the metrics, in seconds. This setting is only used by otlp exporters. Defaults to 30 |
service_name | A friendly name identifying metrics reported by this installation. When unset, it uses the name attribute in the root level of the configuration. |
service_version | The version you are deploying; this can be useful for deployment tracking. |
skip_paths | The paths you don’t want to report. Use the literal value used in the endpoint definition, including any {placeholders}. In the global layer, this attribute works only on metrics, because traces are initiated before there is an endpoint to match against. If you do not want any path skipped, just add an array with an empty string [""]. Example: "/foo/{bar}". Defaults to ["/__health","/__debug/","/__echo/","/__stats/"] |
trace_sample_rate | The sample rate for traces defines the percentage of reported traces. This option is key to reducing the amount of data generated (and resource usage) while you can still debug and troubleshoot issues. For instance, a value of 0.25 will report 25% of the traces seen in the system. Example: 0.25. Defaults to 1 |
Here’s an example that reports to a Grafana Tempo (through otlp) and to a Prometheus.
{
  "version": 3,
  "$schema": "https://www.krakend.io/schema/v2.7/krakend.json",
  "extra_config": {
    "telemetry/opentelemetry": {
      "service_name": "krakend_middle_service",
      "service_version": "commit-sha-ACBDE1234",
      "exporters": {
        "prometheus": [
          {
            "name": "my_prometheus",
            "port": 9092,
            "listen_ip": "::1",
            "process_metrics": false,
            "go_metrics": false
          }
        ],
        "otlp": [
          {
            "name": "local_tempo",
            "host": "localhost",
            "port": 4317,
            "use_http": false
          }
        ]
      },
      "layers": {
        "global": {
          "disable_metrics": false,
          "disable_traces": false,
          "disable_propagation": false
        },
        "proxy": {
          "disable_metrics": false,
          "disable_traces": false
        },
        "backend": {
          "metrics": {
            "disable_stage": false,
            "round_trip": true,
            "read_payload": true,
            "detailed_connection": true,
            "static_attributes": [
              {
                "key": "my_metric_attr",
                "value": "my_middle_metric"
              }
            ]
          },
          "traces": {
            "disable_stage": false,
            "round_trip": true,
            "read_payload": true,
            "detailed_connection": true,
            "static_attributes": [
              {
                "key": "my_metric_attr",
                "value": "my_middle_metric"
              }
            ]
          }
        }
      },
      "skip_paths": [
        "/foo/{bar}"
      ]
    }
  }
}
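To illustrate the volume-related settings described in the configuration reference, here is a hedged variation that reports only 25% of traces, flushes metrics every 30 seconds, and disables path skipping with an array containing an empty string. It assumes the attribute names are trace_sample_rate and metric_reporting_period, which you should verify against the schema of your KrakenD version. The service and exporter names are illustrative only.

```json
{
  "version": 3,
  "extra_config": {
    "telemetry/opentelemetry": {
      "service_name": "krakend_sampled_service",
      "trace_sample_rate": 0.25,
      "metric_reporting_period": 30,
      "skip_paths": [""],
      "exporters": {
        "otlp": [
          {
            "name": "local_collector",
            "host": "localhost",
            "port": 4317,
            "use_http": false
          }
        ]
      }
    }
  }
}
```

The layers section is omitted here, so all layers stay enabled by default, as the reference states.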