OpenTelemetry
Observability is consistently one of the top feature requests from customers. Valkey GLIDE 2.0 introduces support for OpenTelemetry (OTel), giving developers insight into client-side performance and behavior in distributed systems.
OTel is an open source, vendor-neutral framework that provides APIs, SDKs, and tools for generating, collecting, and exporting telemetry data—such as traces, metrics, and logs. It supports multiple programming languages and integrates with various observability backends like Prometheus, Jaeger, and AWS CloudWatch.
How It Works
GLIDE’s OpenTelemetry integration is designed to be both powerful and easy to adopt. Once an OTel collector endpoint is configured, GLIDE begins emitting default metrics and traces automatically; no additional code changes are required. This simplifies the path to observability best practices and minimizes disruption to existing workflows.
Metrics Overview
GLIDE emits several built-in metrics out of the box. These metrics can be used to build dashboards, configure alerts, and monitor performance trends:
- Timeouts: Number of requests that exceeded their timeout duration.
- Retries: Count of operations retried due to transient errors or topology changes.
- Moved Errors: Number of MOVED responses received, indicating that a hash slot has been reassigned to a different node in the cluster.
These metrics are emitted to your configured OpenTelemetry collector and can be viewed in any supported backend (Prometheus, CloudWatch, etc.).
Tracing Integration
GLIDE creates a trace span for each Valkey command, giving detailed visibility into client-side performance. Each trace captures:
- The entire command lifecycle: from creation to completion or failure.
- A nested `send_command` span, measuring communication time with the Valkey server.
- A status tag indicating success or error for each span, helping you identify failure patterns.
This distinction helps developers separate client-side queuing latency from server communication delays, making it easier to troubleshoot performance issues.
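To make the distinction concrete, here is a minimal sketch (not GLIDE code) of how the two span durations relate. The `Span` dataclass, its field names, the `GET` span name, and the timings are hypothetical stand-ins for whatever your tracing backend exports. The only thing taken from the description above is that a `send_command` span is nested inside the command span.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span record (not a GLIDE or OTel SDK type)."""
    name: str
    start_ms: float
    end_ms: float

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def client_side_overhead_ms(command: Span, send_command: Span) -> float:
    """Time spent inside the command span but outside its nested send_command span."""
    return command.duration_ms - send_command.duration_ms


# Illustrative timings: the command span lasts 12 ms, 9 ms of which is server communication.
command_span = Span("GET", start_ms=0.0, end_ms=12.0)
send_span = Span("send_command", start_ms=2.0, end_ms=11.0)

overhead = client_side_overhead_ms(command_span, send_span)
print(f"client-side overhead: {overhead} ms")
```

A large gap between the two durations points at client-side queuing, while a long `send_command` span points at the network or the server.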
Even with these exceptions, GLIDE 2.0 provides comprehensive insights across the vast majority of standard operations, making it easy to adopt observability best practices with minimal effort.
Getting Started
To begin collecting telemetry data with GLIDE 2.0:
- Set up an OpenTelemetry Collector to receive trace and metric data.
- Configure the GLIDE client with the endpoint to your collector.
- Alternatively, you can configure GLIDE to export telemetry data directly to a local file for development or debugging purposes, without requiring a running collector.
GLIDE does not export data directly to third-party services—instead, it sends data to your collector, which routes it to your backend (e.g., CloudWatch, Prometheus, Jaeger).
Example
```python
from glide import OpenTelemetry, OpenTelemetryConfig, OpenTelemetryTracesConfig, OpenTelemetryMetricsConfig

OpenTelemetry.init(OpenTelemetryConfig(
    traces=OpenTelemetryTracesConfig(
        endpoint="http://localhost:4318/v1/traces",
        # Optional, defaults to 1. Can also be changed at runtime via set_sample_percentage().
        sample_percentage=10,
    ),
    metrics=OpenTelemetryMetricsConfig(
        endpoint="http://localhost:4318/v1/metrics",
    ),
    flush_interval_ms=1000,  # Optional, defaults to 5000
))
```

```java
import glide.api.OpenTelemetry;

OpenTelemetry.init(
    OpenTelemetry.OpenTelemetryConfig.builder()
        .traces(
            OpenTelemetry.TracesConfig.builder()
                .endpoint("http://localhost:4318/v1/traces")
                // Optional, defaults to 1. Can also be changed at runtime via setSamplePercentage().
                .samplePercentage(10)
                .build()
        )
        .metrics(
            OpenTelemetry.MetricsConfig.builder()
                .endpoint("http://localhost:4318/v1/metrics")
                .build()
        )
        .flushIntervalMs(1000L) // Optional, defaults to 5000
        .build()
);
```

```typescript
import { OpenTelemetry, OpenTelemetryConfig, OpenTelemetryTracesConfig, OpenTelemetryMetricsConfig } from "@valkey/valkey-glide";

// Define traces configuration
const tracesConfig: OpenTelemetryTracesConfig = {
    endpoint: "http://localhost:4318/v1/traces",
    samplePercentage: 10, // Optional, defaults to 1%
};

// Define metrics configuration
const metricsConfig: OpenTelemetryMetricsConfig = {
    endpoint: "http://localhost:4318/v1/metrics",
};

// Complete OpenTelemetry configuration
const openTelemetryConfig: OpenTelemetryConfig = {
    traces: tracesConfig,   // Optional: can omit if only metrics are needed
    metrics: metricsConfig, // Optional: can omit if only traces are needed
    flushIntervalMs: 1000,  // Optional, defaults to 5000 ms
};

// Initialize OpenTelemetry (can only be called once per process)
OpenTelemetry.init(openTelemetryConfig);
```

```go
import (
    "log"

    glide "github.com/valkey-io/valkey-glide/go/v2"
)

interval := int64(1000)
config := glide.OpenTelemetryConfig{
    Traces: &glide.OpenTelemetryTracesConfig{
        Endpoint: "http://localhost:4318/v1/traces",
        // Optional, defaults to 1. Can also be changed at runtime via `SetSamplePercentage()`
        SamplePercentage: 10,
    },
    Metrics: &glide.OpenTelemetryMetricsConfig{
        Endpoint: "http://localhost:4318/v1/metrics",
    },
    FlushIntervalMs: &interval, // Optional, defaults to 5000
}
err := glide.GetOtelInstance().Init(config)
if err != nil {
    log.Fatalf("Failed to initialize OpenTelemetry: %v", err)
}
```
Configuration Options
When initializing OpenTelemetry, you can customize behavior using the configuration object.
| Configuration | Type | Required | Default | Description |
|---|---|---|---|---|
| traces.endpoint | String | Yes (if traces enabled) | - | The trace collector endpoint URL. Supports http://, https://, grpc://, or file:// protocols. |
| traces.samplePercentage | Integer | No | 1 | Percentage (0–100) of commands to sample for tracing. For production, a low sampling rate (1–5%) is recommended to balance performance and insight. Can be changed at runtime. |
| metrics.endpoint | String | Yes (if metrics enabled) | - | The metrics collector endpoint URL. Supports http://, https://, grpc://, or file:// protocols. |
| flushIntervalMs | Integer | No | 5000 | Time in milliseconds between flushes to the collector. Must be a positive integer. |
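As a mental model for `traces.samplePercentage`, the sketch below treats sampling as an independent random draw per command. This is an assumption made for illustration: this page does not describe GLIDE's actual sampling algorithm, and `should_sample` is a hypothetical helper, not part of the GLIDE API.

```python
import random


def should_sample(sample_percentage: int, rng: random.Random) -> bool:
    """Return True for roughly sample_percentage% of calls (illustrative model only)."""
    if not 0 <= sample_percentage <= 100:
        raise ValueError("sample percentage must be between 0 and 100")
    # rng.random() lies in [0.0, 1.0), so 0 never samples and 100 always samples.
    return rng.random() * 100 < sample_percentage


rng = random.Random(42)  # seeded so repeated runs give the same count
total = 10_000
sampled = sum(should_sample(1, rng) for _ in range(total))
print(f"traced {sampled} of {total} commands at 1%")
```

Under this model the default of 1 traces about one command in a hundred, which is why a low rate keeps tracing overhead small in production.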
Supported Collector Protocols
You can configure the OTel collector endpoint using one of the following protocols:
- `http://` or `https://`: Send data via HTTP(S)
- `grpc://`: Use gRPC for efficient telemetry transmission
- `file://`: Write telemetry data to a local file (ideal for local dev/debugging)
File Exporter Details
If using `file://` as the endpoint:
- The path must begin with `file://`.
- If a directory is provided (or no file extension), data is written to `signals.json` in that directory.
- If a filename is included, it will be used as-is.
- The parent directory must already exist.
- Data is appended, not overwritten.
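The naming rules above can be sketched as a small path-resolution function. This is an illustrative reading of the documented behavior, not GLIDE's implementation: `resolve_output_path` is a hypothetical helper, it assumes POSIX-style paths, and it treats "no file extension" as "no suffix on the last path component". It does not touch the filesystem, so the parent-directory check is left out.

```python
from pathlib import PurePosixPath

FILE_PREFIX = "file://"
DEFAULT_FILENAME = "signals.json"  # default file name stated in the rules above


def resolve_output_path(endpoint: str) -> PurePosixPath:
    """Map a file:// endpoint to the file that telemetry would be appended to."""
    if not endpoint.startswith(FILE_PREFIX):
        raise ValueError("file exporter path must begin with file://")
    path = PurePosixPath(endpoint[len(FILE_PREFIX):])
    # A trailing slash or a missing extension is treated as a directory.
    if endpoint.endswith("/") or path.suffix == "":
        return path / DEFAULT_FILENAME
    # Otherwise the given file name is used as-is.
    return path


for endpoint in ("file:///tmp/otel", "file:///tmp/otel/traces.json"):
    print(endpoint, "->", resolve_output_path(endpoint))
```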
Validation Rules
- Flush interval must be a positive integer.
- Sample percentage must be between 0 and 100.
- File exporter paths must start with `file://` and have an existing parent directory.
- Invalid configuration will throw an error synchronously when calling initialization.
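If you want to catch bad values before they reach the client (for example in a config loader or a unit test), the rules above can be mirrored in a few lines. The sketch below is a stand-alone approximation under stated assumptions: `validate_settings` is a hypothetical helper, the error messages are invented, and the parent-directory check simply strips the `file://` prefix and looks at the directory part of what remains. GLIDE's own validation may differ in detail.

```python
import os
import tempfile
from typing import Optional

FILE_PREFIX = "file://"


def validate_settings(
    flush_interval_ms: int = 5000,
    sample_percentage: int = 1,
    file_endpoint: Optional[str] = None,
) -> None:
    """Raise ValueError on the first rule that is violated (illustrative only)."""
    if not isinstance(flush_interval_ms, int) or flush_interval_ms <= 0:
        raise ValueError("flush interval must be a positive integer")
    if not 0 <= sample_percentage <= 100:
        raise ValueError("sample percentage must be between 0 and 100")
    if file_endpoint is not None:
        if not file_endpoint.startswith(FILE_PREFIX):
            raise ValueError("file exporter path must start with file://")
        parent = os.path.dirname(file_endpoint[len(FILE_PREFIX):])
        if not os.path.isdir(parent):
            raise ValueError(f"parent directory does not exist: {parent}")


with tempfile.TemporaryDirectory() as tmp:
    # Valid: positive interval, percentage in range, parent directory exists.
    validate_settings(1000, 10, f"file://{tmp}/signals.json")

try:
    validate_settings(flush_interval_ms=0)
except ValueError as err:
    print(f"rejected: {err}")
```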