Cloud-Native Observability Stack Part 1 - OpenTelemetry Instrumentation in Spring Boot
Series Introduction
This series covers how to build the cloud-native observability stack that production environments need.
- Part 1: OpenTelemetry Instrumentation (Current)
- Part 2: Distributed Tracing Across Microservices
- Part 3: Structured Logging with Correlation IDs
- Part 4: Metrics and Alerting with Prometheus/Grafana
- Part 5: Debugging Production Issues with Observability Data
The Black Box Problem in Microservices
A user reports “Payment isn’t working.”
You check the logs. API Gateway logs are fine. Order service is fine. Payment service… there’s an error log, but you can’t tell if it’s from that user’s request.
This is a critical weakness of microservices. As requests pass through multiple services, traceability disappears.
The Three Pillars of Observability
- Traces: The journey of a request through a distributed system
- Metrics: Numerical measurements of the system
- Logs: Records of events occurring in the system
OpenTelemetry is the standard that unifies all three.
Introduction to OpenTelemetry
OpenTelemetry (OTel) is a vendor-neutral open-source project for collecting and exporting traces, metrics, and logs.
Core Concepts
- Span: A unit of work (e.g., HTTP request handling, DB query)
- Trace: A tree structure of related Spans
- Context: Metadata that maintains relationships between Spans
- Exporter: Sends collected data to backends
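To make the Span/Trace relationship concrete, here is a minimal sketch in plain Kotlin. It does not use OpenTelemetry classes: `SpanRecord`, `renderTrace`, and the sample IDs are hypothetical names introduced only for illustration. It assumes each span carries its trace ID plus the ID of its parent, which is enough to rebuild the tree.

```kotlin
// Illustrative model only; these are not OpenTelemetry classes.
data class SpanRecord(
    val traceId: String,
    val spanId: String,
    val parentSpanId: String?, // null marks the root span of the trace
    val name: String,
)

// Renders the spans of one trace as an indented tree, root first.
fun renderTrace(spans: List<SpanRecord>): String {
    // Index spans by their parent so children can be looked up directly.
    val childrenByParent = spans.groupBy { it.parentSpanId }
    val lines = mutableListOf<String>()

    fun walk(span: SpanRecord, depth: Int) {
        lines += "  ".repeat(depth) + span.name
        childrenByParent[span.spanId].orEmpty().forEach { walk(it, depth + 1) }
    }

    childrenByParent[null].orEmpty().forEach { walk(it, 0) }
    return lines.joinToString("\n")
}

// Hypothetical spans for a single order request.
val exampleTrace = listOf(
    SpanRecord("t1", "a", null, "POST /orders"),
    SpanRecord("t1", "b", "a", "createOrder"),
    SpanRecord("t1", "c", "b", "checkInventory"),
    SpanRecord("t1", "d", "b", "processPayment"),
)
```

Calling `renderTrace(exampleTrace)` yields one line per span, indented by depth. A tracing backend does essentially the same grouping when it draws a trace view.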
Setting Up OpenTelemetry in Spring Boot 3.x
Adding Dependencies
// build.gradle.kts
dependencies {
    // Spring Boot Actuator
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    // OpenTelemetry Spring Boot Starter (Spring Boot 3.x+)
    implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter:2.1.0")
    // OTLP Exporter
    implementation("io.opentelemetry:opentelemetry-exporter-otlp:1.34.0")
    // Micrometer - OpenTelemetry Bridge
    implementation("io.micrometer:micrometer-tracing-bridge-otel:1.2.2")
}
Application Configuration
# application.yml
spring:
  application:
    name: order-service

otel:
  exporter:
    otlp:
      endpoint: http://localhost:4317
      protocol: grpc
  resource:
    attributes:
      service.name: order-service
      service.version: 1.0.0
      deployment.environment: production
  instrumentation:
    spring-webmvc:
      enabled: true
    spring-webflux:
      enabled: true
    jdbc:
      enabled: true
    kafka:
      enabled: true

management:
  tracing:
    sampling:
      probability: 1.0 # Samples every request; about 0.1 (10%) is a common starting point in production
  otlp:
    tracing:
      # Spring Boot's own OTLP trace exporter sends over HTTP by default,
      # so it targets the OTLP HTTP port and the /v1/traces path
      endpoint: http://localhost:4318/v1/traces
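The sampling probability above is easier to reason about with a small model of how ratio-based sampling can work. The sketch below is a simplified illustration, not the code Spring Boot or OpenTelemetry actually runs: `shouldSample` is a hypothetical helper, and it assumes a 32-character hex trace ID whose lower 64 bits are effectively random. Deciding from the trace ID rather than a fresh random number means every service that sees the same trace reaches the same decision.

```kotlin
// Hypothetical helper for illustration; not a Spring or OpenTelemetry API.
fun shouldSample(traceId: String, probability: Double): Boolean {
    require(probability in 0.0..1.0) { "probability must be within [0, 1]" }
    require(traceId.length == 32) { "trace ID must be 32 hex characters" }
    if (probability == 1.0) return true

    // Interpret the last 16 hex characters (64 bits) as an unsigned number...
    val randomPart = traceId.substring(16).toULong(radix = 16)
    // ...and map it onto the range [0, 1] by dividing by 2^64.
    val fraction = randomPart.toDouble() / 18446744073709551616.0
    // Keep the trace when its fraction falls below the configured probability.
    return fraction < probability
}
```

With `probability = 0.1`, roughly one trace in ten is kept, and a given trace ID always produces the same answer.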
Auto-Instrumentation
The OpenTelemetry Java Agent automatically instruments your application without code changes.
Using the Java Agent
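# Download the Java agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar
# Run the application with the agent attached.
# Agent 2.x defaults to http/protobuf (port 4318), so select gRPC explicitly when targeting port 4317.
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=order-service \
  -Dotel.exporter.otlp.protocol=grpc \
  -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
  -jar order-service.jar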
Auto-Instrumentation Targets
- HTTP clients/servers (RestTemplate, WebClient, Spring MVC)
- Databases (JDBC, R2DBC, JPA)
- Messaging (Kafka, RabbitMQ)
- Cache (Redis)
- gRPC
- Many other libraries
Manual Instrumentation
Use manual instrumentation when you need detailed observation of business logic.
Tracer Configuration
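import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.api.trace.Tracer
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration

@Configuration
class TracingConfig {
    // The name and version identify the instrumentation scope that produced a span
    @Bean
    fun tracer(openTelemetry: OpenTelemetry): Tracer {
        return openTelemetry.getTracer("order-service", "1.0.0")
    }
}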
Manual Span Creation
import io.opentelemetry.api.trace.SpanKind
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer
import org.springframework.stereotype.Service

@Service
class OrderService(
    private val tracer: Tracer,
    private val orderRepository: OrderRepository,
    private val paymentClient: PaymentClient,
    private val inventoryClient: InventoryClient
) {
    fun createOrder(request: CreateOrderRequest): Order {
        // Create parent Span
        val span = tracer.spanBuilder("createOrder")
            .setSpanKind(SpanKind.INTERNAL)
            .setAttribute("order.customer_id", request.customerId)
            .setAttribute("order.item_count", request.items.size.toLong())
            .startSpan()
        return try {
            // makeCurrent() puts the span into the current Context so that spans
            // started inside this block become its children
            span.makeCurrent().use {
                // Check inventory (child Span)
                checkInventory(request.items)
                // Create order (child Span)
                val order = saveOrder(request)
                // Process payment (child Span)
                processPayment(order)
                span.setAttribute("order.id", order.id)
                span.setStatus(StatusCode.OK)
                order
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Unknown error")
            span.recordException(e)
            throw e
        } finally {
            // A span is only exported after end() is called
            span.end()
        }
    }

    private fun checkInventory(items: List<OrderItem>): InventoryResult {
        val span = tracer.spanBuilder("checkInventory")
            .setSpanKind(SpanKind.CLIENT)
            .startSpan()
        return try {
            span.makeCurrent().use {
                val result = inventoryClient.checkAvailability(items)
                span.setAttribute("inventory.available", result.isAvailable)
                result
            }
        } finally {
            span.end()
        }
    }

    private fun saveOrder(request: CreateOrderRequest): Order {
        val span = tracer.spanBuilder("saveOrder")
            .setAttribute("db.system", "postgresql")
            .setAttribute("db.operation", "INSERT")
            .startSpan()
        return try {
            span.makeCurrent().use {
                orderRepository.save(Order.create(request))
            }
        } finally {
            span.end()
        }
    }

    private fun processPayment(order: Order) {
        val span = tracer.spanBuilder("processPayment")
            .setSpanKind(SpanKind.CLIENT)
            .setAttribute("payment.amount", order.totalAmount.toDouble())
            .startSpan()
        try {
            span.makeCurrent().use {
                paymentClient.charge(order.customerId, order.totalAmount)
                span.setStatus(StatusCode.OK)
            }
        } catch (e: PaymentException) {
            span.setStatus(StatusCode.ERROR, "Payment failed")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}
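Each method above repeats the same span lifecycle: start, make current, record failures, end. One way to factor that out is a small extension function. This is a sketch, not part of the OpenTelemetry API: `withSpan` is a hypothetical helper name, and it assumes only the `opentelemetry-api` artifact is on the classpath.

```kotlin
import io.opentelemetry.api.trace.Span
import io.opentelemetry.api.trace.SpanKind
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer

// Hypothetical helper (not provided by OpenTelemetry): runs `block` inside a new span.
inline fun <T> Tracer.withSpan(
    name: String,
    kind: SpanKind = SpanKind.INTERNAL,
    block: (Span) -> T,
): T {
    val span = spanBuilder(name).setSpanKind(kind).startSpan()
    return try {
        // The scope is closed by use {} before the catch/finally blocks run
        span.makeCurrent().use { block(span) }
    } catch (e: Throwable) {
        // Mark the span as failed and attach the exception as a span event
        span.setStatus(StatusCode.ERROR, e.message ?: e.javaClass.simpleName)
        span.recordException(e)
        throw e
    } finally {
        // Always end the span, otherwise it is never exported
        span.end()
    }
}
```

With such a helper, `checkInventory` could shrink to something like `tracer.withSpan("checkInventory", SpanKind.CLIENT) { span -> inventoryClient.checkAvailability(items).also { span.setAttribute("inventory.available", it.isAvailable) } }`. The trade-off is that every failure is recorded the same way, so methods that need custom status messages (such as `processPayment`) may still want explicit handling.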
Annotation-Based Instrumentation
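import io.opentelemetry.api.trace.SpanKind
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer
import org.aspectj.lang.ProceedingJoinPoint
import org.aspectj.lang.annotation.Around
import org.aspectj.lang.annotation.Aspect
import org.springframework.stereotype.Component

@Aspect
@Component
class TracingAspect(private val tracer: Tracer) {
    @Around("@annotation(traced)")
    fun traceMethod(joinPoint: ProceedingJoinPoint, traced: Traced): Any? {
        val methodName = joinPoint.signature.name
        val className = joinPoint.target.javaClass.simpleName
        // Prefer the name given on the annotation; fall back to Class.method
        val spanName = traced.operationName.ifBlank { "$className.$methodName" }
        val span = tracer.spanBuilder(spanName)
            .setSpanKind(SpanKind.INTERNAL)
            .startSpan()
        return try {
            span.makeCurrent().use {
                joinPoint.proceed()
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}

@Target(AnnotationTarget.FUNCTION)
@Retention(AnnotationRetention.RUNTIME)
annotation class Traced(val operationName: String = "")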
Usage example:
@Service
class PaymentService {
    @Traced
    fun processRefund(orderId: String, amount: BigDecimal) {
        // TracingAspect wraps this call in a span automatically
    }
}
Context Propagation
Propagation via HTTP Headers
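import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.context.Context
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.http.client.ClientHttpRequestInterceptor
import org.springframework.web.client.RestTemplate

@Configuration
class RestTemplateConfig {
    @Bean
    fun restTemplate(openTelemetry: OpenTelemetry): RestTemplate {
        val restTemplate = RestTemplate()
        // Write the current trace context into the outgoing request headers
        restTemplate.interceptors.add(ClientHttpRequestInterceptor { request, body, execution ->
            val context = Context.current()
            openTelemetry.propagators.textMapPropagator.inject(
                context,
                request.headers
            ) { carrier, key, value ->
                carrier?.set(key, value)
            }
            execution.execute(request, body)
        })
        return restTemplate
    }
}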
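It helps to know what the interceptor actually writes. Assuming the default W3C Trace Context propagator is in use, the injected header is `traceparent`, formatted as `version-traceid-parentid-flags`. The parser below is a plain-Kotlin sketch for reading such a header while debugging. `TraceParent` and `parseTraceParent` are hypothetical names, it only accepts the four-field layout of version `00`, and in real code the propagator does this work for you.

```kotlin
// Hypothetical type for illustration; OpenTelemetry's propagators handle this internally.
data class TraceParent(
    val version: String,
    val traceId: String,      // 32 hex characters, shared by every span in the trace
    val parentSpanId: String, // 16 hex characters, the span that made the call
    val sampled: Boolean,     // lowest bit of the flags field
)

val traceParentPattern =
    Regex("^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

// Returns null when the header is malformed or carries all-zero IDs.
fun parseTraceParent(header: String): TraceParent? {
    val match = traceParentPattern.matchEntire(header.trim()) ?: return null
    val (version, traceId, parentSpanId, flags) = match.destructured
    if (version == "ff") return null // "ff" is reserved as an invalid version
    if (traceId.all { it == '0' } || parentSpanId.all { it == '0' }) return null
    val sampled = (flags.toInt(radix = 16) and 0x01) == 1
    return TraceParent(version, traceId, parentSpanId, sampled)
}
```

For example, `parseTraceParent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")` returns a value whose `traceId` is `0af7651916cd43dd8448eb211c80319c` and whose `sampled` flag is true. Searching logs or the tracing backend for that trace ID is how a single user's request can be followed across services.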
Propagation via Kafka
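import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.api.trace.SpanKind
import io.opentelemetry.api.trace.Tracer
import io.opentelemetry.context.Context
import io.opentelemetry.context.propagation.TextMapGetter
import org.apache.kafka.clients.producer.ProducerInterceptor
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.clients.producer.RecordMetadata
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.kafka.annotation.KafkaListener
import org.springframework.kafka.core.KafkaTemplate
import org.springframework.kafka.core.ProducerFactory
import org.springframework.messaging.MessageHeaders
import org.springframework.messaging.handler.annotation.Headers
import org.springframework.messaging.handler.annotation.Payload
import org.springframework.stereotype.Component

@Configuration
class KafkaTracingConfig {
    @Bean
    fun kafkaTemplate(
        producerFactory: ProducerFactory<String, String>,
        openTelemetry: OpenTelemetry
    ): KafkaTemplate<String, String> {
        val template = KafkaTemplate(producerFactory)
        // ProducerInterceptor declares several methods, so it cannot be written as a lambda
        template.setProducerInterceptor(object : ProducerInterceptor<String, String> {
            override fun onSend(record: ProducerRecord<String, String>): ProducerRecord<String, String> {
                // Write the current trace context into the record headers
                openTelemetry.propagators.textMapPropagator.inject(
                    Context.current(),
                    record.headers()
                ) { headers, key, value ->
                    headers?.add(key, value.toByteArray())
                }
                return record
            }

            override fun onAcknowledgement(metadata: RecordMetadata?, exception: Exception?) {}
            override fun close() {}
            override fun configure(configs: MutableMap<String, *>?) {}
        })
        return template
    }
}

@Component
class OrderEventConsumer(
    private val tracer: Tracer,
    private val openTelemetry: OpenTelemetry
) {
    // TextMapGetter has two methods (keys and get), so it needs an object rather than a lambda
    private val headerGetter = object : TextMapGetter<MessageHeaders> {
        override fun keys(carrier: MessageHeaders): Iterable<String> = carrier.keys

        override fun get(carrier: MessageHeaders?, key: String): String? =
            when (val value = carrier?.get(key)) {
                // Kafka header values usually arrive as raw bytes
                is ByteArray -> String(value, Charsets.UTF_8)
                else -> value?.toString()
            }
    }

    @KafkaListener(topics = ["order-events"])
    fun handleOrderEvent(
        @Payload payload: String,
        @Headers headers: MessageHeaders
    ) {
        // Extract parent context
        val parentContext = openTelemetry.propagators.textMapPropagator.extract(
            Context.current(),
            headers,
            headerGetter
        )
        val span = tracer.spanBuilder("processOrderEvent")
            .setParent(parentContext)
            .setSpanKind(SpanKind.CONSUMER)
            .startSpan()
        try {
            span.makeCurrent().use {
                // Process event
                processEvent(payload)
            }
        } finally {
            span.end()
        }
    }

    // Placeholder for the actual event handling logic
    private fun processEvent(payload: String) {
    }
}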
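The inject and extract halves are easier to verify together in isolation. The sketch below round-trips a trace context through a plain `Map`, standing in for Kafka or HTTP headers. It assumes only the `opentelemetry-api` artifact and uses the W3C propagator directly; `MapGetter` and `roundTrip` are hypothetical names, and the IDs are sample values rather than output from a real service.

```kotlin
import io.opentelemetry.api.trace.Span
import io.opentelemetry.api.trace.SpanContext
import io.opentelemetry.api.trace.TraceFlags
import io.opentelemetry.api.trace.TraceState
import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator
import io.opentelemetry.context.Context
import io.opentelemetry.context.propagation.TextMapGetter

// TextMapGetter declares two methods (keys and get), so it is implemented as an object.
object MapGetter : TextMapGetter<Map<String, String>> {
    override fun keys(carrier: Map<String, String>): Iterable<String> = carrier.keys
    override fun get(carrier: Map<String, String>?, key: String): String? = carrier?.get(key)
}

// Returns the injected traceparent header and the trace ID recovered on the other side.
fun roundTrip(): Pair<String, String> {
    val propagator = W3CTraceContextPropagator.getInstance()

    // Producer side: a context holding a (sample) sampled span context
    val parent = SpanContext.create(
        "0af7651916cd43dd8448eb211c80319c",
        "b7ad6b7169203331",
        TraceFlags.getSampled(),
        TraceState.getDefault(),
    )
    val producerContext = Context.root().with(Span.wrap(parent))
    val carrier = mutableMapOf<String, String>()
    // TextMapSetter has a single method, so a lambda is enough here
    propagator.inject(producerContext, carrier) { map, key, value -> map?.put(key, value) }

    // Consumer side: rebuild the context from the carrier
    val consumerContext = propagator.extract(Context.root(), carrier, MapGetter)
    val extracted = Span.fromContext(consumerContext).spanContext
    return carrier.getValue("traceparent") to extracted.traceId
}
```

If the extracted trace ID matches the one that was injected, a consumer span created with `setParent(consumerContext)` joins the producer's trace instead of starting a new one.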
Local Development Environment Setup
Docker Compose for Observability Stack
version: '3.8'
services:
  # Jaeger - Distributed Tracing
  jaeger:
    image: jaegertracing/all-in-one:1.53
    ports:
      - "16686:16686" # UI
      # The OTLP ports stay internal to the Compose network so they do not
      # clash with the collector below. To run Jaeger without the collector,
      # publish "4317:4317" and "4318:4318" here instead.
    environment:
      - COLLECTOR_OTLP_ENABLED=true
  # OpenTelemetry Collector (optional)
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.92.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317" # OTLP gRPC
      - "4318:4318" # OTLP HTTP
    depends_on:
      - jaeger
OTel Collector Configuration
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  # The debug exporter supersedes the deprecated logging exporter
  debug:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger, debug]
Summary
Key points of OpenTelemetry:
| Item | Description |
|---|---|
| Auto-instrumentation | Instrument without code changes using Java Agent |
| Manual instrumentation | Detailed tracking of business logic |
| Context propagation | Connect traces across services |
| Vendor neutral | Send data to various backends |
In the next post, we’ll cover distributed tracing across multiple microservices.