Adobe's central observability team built a three-tier OpenTelemetry Collector pipeline that runs thousands of collectors across the company's Kubernetes infrastructure. The architecture combines a user-facing Helm chart (a locked-down sidecar collector plus a configurable deployment collector), a centralized managed namespace with separate collector deployments per signal (metrics, logs, traces) for fault isolation, and multiple observability backends. Auto-instrumentation via the OpenTelemetry Operator requires just two annotation lines. The team also built a custom circuit-breaker extension that propagates backend authentication errors upstream through chained collectors, closing a key error-visibility gap. Key lessons include treating OpenTelemetry as a platform to extend, designing for user simplicity, and planning for error visibility in chained collector setups.
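
The error-visibility gap in chained collectors can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Adobe's extension: the `Collector` and `CircuitBreaker` classes, the use of HTTP-style status codes (401/403) as the authentication signal, and the queue-then-flush model are all hypothetical. It shows why a sender normally sees success even when the backend rejects credentials, and how a breaker at each hop lets the failure travel back toward the application.

```python
from typing import Callable, List, Optional

AUTH_STATUS_CODES = {401, 403}  # assumption: auth failures arrive as these codes


class CircuitBreaker:
    """Hypothetical breaker: opens on an auth failure, closes on any 2xx."""

    def __init__(self) -> None:
        self.open = False
        self.last_status: Optional[int] = None

    def record(self, status: int) -> None:
        if status in AUTH_STATUS_CODES:
            self.open, self.last_status = True, status
        elif 200 <= status < 300:
            self.open, self.last_status = False, None


class Collector:
    """Hypothetical collector hop: accepts batches, exports them later."""

    def __init__(self, name: str, export: Callable[[str], int]) -> None:
        self.name = name
        self.export = export  # next hop's receive(), or the backend itself
        self.queue: List[str] = []
        self.breaker = CircuitBreaker()

    def receive(self, batch: str) -> int:
        # With the breaker open, surface the downstream auth error to the
        # sender instead of acknowledging data that cannot be delivered.
        if self.breaker.open:
            return self.breaker.last_status
        self.queue.append(batch)
        return 200

    def flush(self) -> None:
        # Export queued batches; each attempt doubles as a recovery probe.
        while self.queue:
            status = self.export(self.queue[0])
            self.breaker.record(status)
            if not 200 <= status < 300:
                return  # keep the batch queued and retry on the next flush
            self.queue.pop(0)


def run_demo() -> List[int]:
    backend = lambda batch: 401  # backend rejects the credentials
    managed = Collector("managed", backend)
    deployment = Collector("deployment", managed.receive)
    sidecar = Collector("sidecar", deployment.receive)
    seen = []
    for i in range(4):
        seen.append(sidecar.receive(f"batch-{i}"))
        for hop in (sidecar, deployment, managed):
            hop.flush()
    return seen


print(run_demo())  # [200, 200, 200, 401]
```

In this model the 401 needs one flush cycle per hop to reach the sidecar, so the application sees the failure on its fourth send. Without the breaker check in `receive`, every hop would keep answering 200 and the rejection would stay hidden at the last collector.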

8m read time, from opentelemetry.io
Table of contents
Organizational structure
OpenTelemetry adoption
Architecture: a three-tier collector pipeline
Auto-instrumentation: two lines and it works
Custom distribution and components
Deployment and lifecycle management
What works well
Advice for others
What’s next
