Series Introduction

This series shows how to build a cloud-native observability stack for production environments.

  1. Part 1: OpenTelemetry Instrumentation (this post)
  2. Part 2: Distributed Tracing Across Microservices
  3. Part 3: Structured Logging with Correlation IDs
  4. Part 4: Metrics and Alerting with Prometheus/Grafana
  5. Part 5: Debugging Production Issues with Observability Data

The Black Box Problem in Microservices

A user reports “Payment isn’t working.”

You check the logs. API Gateway logs are fine. Order service is fine. Payment service… there’s an error log, but you can’t tell if it’s from that user’s request.

This is a critical weakness of microservices: as a request passes through multiple services, nothing ties the individual log lines back to it, so traceability disappears.

The Three Pillars of Observability

  1. Traces: The journey of a request through a distributed system
  2. Metrics: Numerical measurements of the system
  3. Logs: Records of events occurring in the system

OpenTelemetry is the standard that unifies all three.

Introduction to OpenTelemetry

OpenTelemetry (OTel) is a vendor-neutral open-source project for collecting and exporting traces, metrics, and logs.

Core Concepts

  • Span: A unit of work (e.g., HTTP request handling, DB query)
  • Trace: A tree structure of related Spans
  • Context: Metadata (trace ID, span ID, trace flags) carried within and between services to link Spans together
  • Exporter: Sends collected data to backends
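
To make these terms concrete, here is a minimal sketch of how spans link into a trace. It is a plain Kotlin model, not the OpenTelemetry API: `SpanRecord`, `childrenOf`, and `depthOf` are hypothetical names used only for illustration.

```kotlin
// Illustrative model only: real spans are created by the OpenTelemetry SDK.
data class SpanRecord(
    val traceId: String,        // shared by every span in one trace
    val spanId: String,         // unique per span
    val parentSpanId: String?,  // null marks the root span
    val name: String
)

// Direct children of a span: spans in the same trace whose parentSpanId points at it.
fun childrenOf(spans: List<SpanRecord>, parent: SpanRecord): List<SpanRecord> =
    spans.filter { it.traceId == parent.traceId && it.parentSpanId == parent.spanId }

// Depth of a span in the trace tree (the root has depth 0).
// Assumes the parent links contain no cycles.
fun depthOf(spans: List<SpanRecord>, span: SpanRecord): Int {
    val byId = spans.associateBy { it.spanId }
    var depth = 0
    var parent = span.parentSpanId?.let { byId[it] }
    while (parent != null) {
        depth++
        parent = parent.parentSpanId?.let { byId[it] }
    }
    return depth
}
```

A tracing backend does essentially this grouping when it renders a trace as a waterfall: spans that share a trace ID are collected, then nested by their parent links.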

Setting Up OpenTelemetry in Spring Boot 4.x

Adding Dependencies

// build.gradle.kts
dependencies {
    // Spring Boot Actuator
    implementation("org.springframework.boot:spring-boot-starter-actuator")

    // OpenTelemetry Spring Boot Starter (Spring Boot 3.x+); the 2.1.0 release carries the -alpha suffix
    implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter:2.1.0-alpha")

    // OTLP Exporter
    implementation("io.opentelemetry:opentelemetry-exporter-otlp:1.34.0")

    // Micrometer - OpenTelemetry Bridge
    implementation("io.micrometer:micrometer-tracing-bridge-otel:1.2.2")
}

Application Configuration

# application.yml
spring:
  application:
    name: order-service

otel:
  exporter:
    otlp:
      endpoint: http://localhost:4317
      protocol: grpc
  resource:
    attributes:
      service.name: order-service
      service.version: 1.0.0
      deployment.environment: production
  instrumentation:
    spring-webmvc:
      enabled: true
    spring-webflux:
      enabled: true
    jdbc:
      enabled: true
    kafka:
      enabled: true

management:
  tracing:
    sampling:
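      probability: 1.0  # Samples every trace; consider a lower value such as 0.1 (10%) in production
  otlp:
    tracing:
      endpoint: http://localhost:4318/v1/traces  # Spring Boot's OTLP span exporter uses HTTP by default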
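
The sampling probability above decides what fraction of traces is recorded. The sketch below shows the general idea behind ratio-based sampling: the decision is derived from the trace ID, so a given trace is either kept or dropped consistently. This is a simplified illustration rather than the SDK's exact algorithm, and `shouldSample` is a hypothetical name.

```kotlin
// Simplified ratio-based sampling: the decision is a pure function of the trace ID,
// so every component that evaluates the same trace ID reaches the same verdict.
// Assumption: traceId is the 32-character hex form of a 128-bit trace ID.
fun shouldSample(traceId: String, probability: Double): Boolean {
    require(traceId.length == 32) { "trace ID must be 32 hex characters" }
    require(probability in 0.0..1.0) { "probability must be within 0.0..1.0" }
    if (probability == 0.0) return false
    if (probability == 1.0) return true
    // Take the low 64 bits and clear the sign bit to get a value in 0..Long.MAX_VALUE.
    val low = traceId.substring(16).toULong(16).toLong() and Long.MAX_VALUE
    val threshold = (probability * Long.MAX_VALUE.toDouble()).toLong()
    return low < threshold
}
```

With `probability = 0.1`, roughly one in ten randomly generated trace IDs falls below the threshold, which is why lowering the value reduces export volume proportionally.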

Auto-Instrumentation

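The OpenTelemetry Java Agent attaches at JVM startup and instruments supported libraries through bytecode manipulation, so no code changes are needed. It is an alternative to the Spring Boot starter shown above; use one or the other, not both.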

Using the Java Agent

# Download Java Agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

# Run
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=order-service \
  -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
  -jar order-service.jar

Auto-Instrumentation Targets

  • HTTP clients/servers (RestTemplate, WebClient, Spring MVC)
  • Databases (JDBC, R2DBC, JPA)
  • Messaging (Kafka, RabbitMQ)
  • Cache (Redis)
  • gRPC
  • Many other libraries

Manual Instrumentation

Use manual instrumentation when you need detailed observation of business logic.

Tracer Configuration

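import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.api.trace.Tracer
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration

@Configuration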
class TracingConfig {

    @Bean
    fun tracer(openTelemetry: OpenTelemetry): Tracer {
        return openTelemetry.getTracer("order-service", "1.0.0")
    }
}

Manual Span Creation

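import io.opentelemetry.api.trace.SpanKind
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer
import org.springframework.stereotype.Service

@Service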
class OrderService(
    private val tracer: Tracer,
    private val orderRepository: OrderRepository,
    private val paymentClient: PaymentClient,
    private val inventoryClient: InventoryClient
) {
    fun createOrder(request: CreateOrderRequest): Order {
        // Create parent Span
        val span = tracer.spanBuilder("createOrder")
            .setSpanKind(SpanKind.INTERNAL)
            .setAttribute("order.customer_id", request.customerId)
            .setAttribute("order.item_count", request.items.size.toLong())
            .startSpan()

        return try {
            span.makeCurrent().use { scope ->
                // Check inventory (child Span)
                val inventory = checkInventory(request.items)

                // Create order (child Span)
                val order = saveOrder(request)

                // Process payment (child Span)
                processPayment(order)

                span.setAttribute("order.id", order.id)
                span.setStatus(StatusCode.OK)

                order
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Unknown error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }

    private fun checkInventory(items: List<OrderItem>): InventoryResult {
        val span = tracer.spanBuilder("checkInventory")
            .setSpanKind(SpanKind.CLIENT)
            .startSpan()

        return try {
            span.makeCurrent().use {
                val result = inventoryClient.checkAvailability(items)
                span.setAttribute("inventory.available", result.isAvailable)
                result
            }
        } finally {
            span.end()
        }
    }

    private fun saveOrder(request: CreateOrderRequest): Order {
        val span = tracer.spanBuilder("saveOrder")
            .setAttribute("db.system", "postgresql")
            .setAttribute("db.operation", "INSERT")
            .startSpan()

        return try {
            span.makeCurrent().use {
                orderRepository.save(Order.create(request))
            }
        } finally {
            span.end()
        }
    }

    private fun processPayment(order: Order) {
        val span = tracer.spanBuilder("processPayment")
            .setSpanKind(SpanKind.CLIENT)
            .setAttribute("payment.amount", order.totalAmount.toDouble())
            .startSpan()

        try {
            span.makeCurrent().use {
                paymentClient.charge(order.customerId, order.totalAmount)
                span.setStatus(StatusCode.OK)
            }
        } catch (e: PaymentException) {
            span.setStatus(StatusCode.ERROR, "Payment failed")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}
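
The start/makeCurrent/record/end pattern repeats in every method above. One way to cut the repetition is a small extension function. The sketch below assumes the OpenTelemetry API is on the classpath; `withSpan` is a hypothetical helper, not part of the OpenTelemetry API.

```kotlin
import io.opentelemetry.api.trace.Span
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer

// Hypothetical helper: runs `block` inside a new span, records any exception,
// and always ends the span.
inline fun <T> Tracer.withSpan(name: String, block: (Span) -> T): T {
    val span = spanBuilder(name).startSpan()
    try {
        // makeCurrent() returns a Scope; closing it restores the previous context.
        return span.makeCurrent().use { block(span) }
    } catch (e: Throwable) {
        span.setStatus(StatusCode.ERROR, e.message ?: e.javaClass.simpleName)
        span.recordException(e)
        throw e
    } finally {
        span.end()
    }
}
```

With such a helper, a method like `saveOrder` could shrink to `tracer.withSpan("saveOrder") { orderRepository.save(Order.create(request)) }`, with attributes set through the `Span` passed to the block.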

Annotation-Based Instrumentation

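import io.opentelemetry.api.trace.SpanKind
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer
import org.aspectj.lang.ProceedingJoinPoint
import org.aspectj.lang.annotation.Around
import org.aspectj.lang.annotation.Aspect
import org.springframework.stereotype.Component

@Aspect
@Component
class TracingAspect(private val tracer: Tracer) {

    @Around("@annotation(traced)")
    fun traceMethod(joinPoint: ProceedingJoinPoint, traced: Traced): Any? {
        // Prefer the explicit operation name; fall back to Class.method
        val spanName = traced.operationName.ifBlank {
            "${joinPoint.signature.declaringType.simpleName}.${joinPoint.signature.name}"
        }

        val span = tracer.spanBuilder(spanName)
            .setSpanKind(SpanKind.INTERNAL)
            .startSpan()

        return try {
            span.makeCurrent().use {
                joinPoint.proceed()
            }
        } catch (e: Throwable) {
            // proceed() may throw any Throwable, not only Exception
            span.setStatus(StatusCode.ERROR, e.message ?: "Error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}

@Target(AnnotationTarget.FUNCTION)
@Retention(AnnotationRetention.RUNTIME)
annotation class Traced(val operationName: String = "")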

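Usage example (requires Spring AOP on the classpath; calls made from within the same class bypass the proxy and are not traced):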

@Service
class PaymentService {

    @Traced
    fun processRefund(orderId: String, amount: BigDecimal) {
        // Span is automatically created
    }
}

Context Propagation

Propagation via HTTP Headers

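import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.context.Context
import io.opentelemetry.context.propagation.TextMapSetter
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.http.HttpHeaders
import org.springframework.http.client.ClientHttpRequestInterceptor
import org.springframework.web.client.RestTemplate

@Configuration
class RestTemplateConfig {

    @Bean
    fun restTemplate(openTelemetry: OpenTelemetry): RestTemplate {
        val restTemplate = RestTemplate()

        // Writes each propagation field (e.g. traceparent) into the outgoing HTTP headers
        val setter = TextMapSetter<HttpHeaders> { carrier, key, value ->
            carrier?.set(key, value)
        }

        // Manual injection is only needed for clients that are not already
        // instrumented by the Java Agent or the Spring Boot starter
        restTemplate.interceptors.add(ClientHttpRequestInterceptor { request, body, execution ->
            openTelemetry.propagators.textMapPropagator.inject(
                Context.current(),
                request.headers,
                setter
            )
            execution.execute(request, body)
        })

        return restTemplate
    }
}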
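
What actually travels in the headers, assuming the default W3C Trace Context propagator, is a `traceparent` value of the form `version-traceId-parentSpanId-flags`. The sketch below formats and parses that value with plain Kotlin so the wire format is visible; `TraceParent`, `formatTraceparent`, and `parseTraceparent` are hypothetical names, and only version `00` is handled.

```kotlin
// W3C Trace Context `traceparent` header, for example:
// 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
data class TraceParent(val traceId: String, val spanId: String, val sampled: Boolean)

val TRACEPARENT_PATTERN = Regex("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

fun formatTraceparent(tp: TraceParent): String =
    "00-${tp.traceId}-${tp.spanId}-${if (tp.sampled) "01" else "00"}"

// Returns null for malformed headers and for all-zero IDs, which are invalid.
fun parseTraceparent(header: String): TraceParent? {
    val match = TRACEPARENT_PATTERN.matchEntire(header.trim()) ?: return null
    val (traceId, spanId, flags) = match.destructured
    if (traceId.all { it == '0' } || spanId.all { it == '0' }) return null
    // Bit 0 of the flags byte is the "sampled" flag.
    val sampled = (flags.toInt(16) and 1) == 1
    return TraceParent(traceId, spanId, sampled)
}
```

The receiving service reads the trace ID and parent span ID from this header and starts its own span as a child, which is how spans from different processes end up in the same trace.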

Propagation via Kafka

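import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.api.trace.SpanKind
import io.opentelemetry.api.trace.Tracer
import io.opentelemetry.context.Context
import io.opentelemetry.context.propagation.TextMapGetter
import org.apache.kafka.clients.producer.ProducerInterceptor
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.clients.producer.RecordMetadata
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.kafka.annotation.KafkaListener
import org.springframework.kafka.core.KafkaTemplate
import org.springframework.kafka.core.ProducerFactory
import org.springframework.messaging.MessageHeaders
import org.springframework.messaging.handler.annotation.Headers
import org.springframework.messaging.handler.annotation.Payload
import org.springframework.stereotype.Component

@Configuration
class KafkaTracingConfig {

    @Bean
    fun kafkaTemplate(
        producerFactory: ProducerFactory<String, String>,
        openTelemetry: OpenTelemetry
    ): KafkaTemplate<String, String> {
        val template = KafkaTemplate(producerFactory)

        // ProducerInterceptor has several methods, so it cannot be written as a lambda
        template.setProducerInterceptor(object : ProducerInterceptor<String, String> {
            override fun onSend(record: ProducerRecord<String, String>): ProducerRecord<String, String> {
                openTelemetry.propagators.textMapPropagator.inject(
                    Context.current(),
                    record.headers()
                ) { headers, key, value ->
                    // Kafka header values are bytes; replace any stale value for the key
                    headers?.remove(key)?.add(key, value.toByteArray(Charsets.UTF_8))
                }
                return record
            }

            override fun onAcknowledgement(metadata: RecordMetadata?, exception: Exception?) = Unit
            override fun close() = Unit
            override fun configure(configs: MutableMap<String, *>?) = Unit
        })

        return template
    }
}

@Component
class OrderEventConsumer(
    private val tracer: Tracer,
    private val openTelemetry: OpenTelemetry
) {
    // TextMapGetter declares both keys() and get(), so it needs an object expression
    private val headerGetter = object : TextMapGetter<MessageHeaders> {
        override fun keys(carrier: MessageHeaders): Iterable<String> = carrier.keys

        override fun get(carrier: MessageHeaders?, key: String): String? =
            when (val value = carrier?.get(key)) {
                is ByteArray -> value.toString(Charsets.UTF_8)  // raw Kafka header bytes
                null -> null
                else -> value.toString()
            }
    }

    @KafkaListener(topics = ["order-events"])
    fun handleOrderEvent(
        @Payload payload: String,
        @Headers headers: MessageHeaders
    ) {
        // Extract parent context
        val parentContext = openTelemetry.propagators.textMapPropagator.extract(
            Context.current(),
            headers,
            headerGetter
        )

        val span = tracer.spanBuilder("processOrderEvent")
            .setParent(parentContext)
            .setSpanKind(SpanKind.CONSUMER)
            .startSpan()

        try {
            span.makeCurrent().use {
                // Process event
                processEvent(payload)
            }
        } finally {
            span.end()
        }
    }

    private fun processEvent(payload: String) {
        // Business logic for the event goes here
    }
}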
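
Kafka header values are byte arrays, while propagation fields are text, so the value must be encoded on the producer side and decoded on the consumer side. Calling the default `toString()` on a `ByteArray` returns an object identity string rather than the text. The sketch below shows the round trip over a plain map standing in for Kafka's headers; `putTextHeader` and `getTextHeader` are hypothetical helpers.

```kotlin
// A plain map stands in for Kafka's Headers type in this illustration.
fun putTextHeader(headers: MutableMap<String, ByteArray>, key: String, value: String) {
    headers[key] = value.toByteArray(Charsets.UTF_8)
}

// Decodes the stored bytes as UTF-8 text; returns null when the header is absent.
fun getTextHeader(headers: Map<String, ByteArray>, key: String): String? =
    headers[key]?.toString(Charsets.UTF_8)
```

Decoding with an explicit charset on both sides keeps the extracted `traceparent` identical to the injected one.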

Local Development Environment Setup

Docker Compose for Observability Stack

version: '3.8'
services:
  # Jaeger - Distributed Tracing
  jaeger:
    image: jaegertracing/all-in-one:1.53
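    ports:
      - "16686:16686"  # UI
    # OTLP ports 4317/4318 are not published on the host: the Collector below binds them,
    # and it reaches Jaeger over the Compose network at jaeger:4317
    environment:
      - COLLECTOR_OTLP_ENABLED=true

  # OpenTelemetry Collector (receives OTLP from the application and forwards it to Jaeger)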
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.92.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"
      - "4318:4318"
    depends_on:
      - jaeger

OTel Collector Configuration

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  # The debug exporter replaces the deprecated logging exporter
  debug:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger, debug]
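
The `batch` processor settings above mean that a batch is sent as soon as it reaches `send_batch_size` items, or once `timeout` has elapsed since the batch started, whichever comes first. The sketch below models that rule in plain Kotlin. `Batcher` is a hypothetical class for illustration, and the caller supplies the clock so the behavior is easy to follow.

```kotlin
// Simplified size-or-timeout batching, mirroring send_batch_size and timeout.
class Batcher<T>(
    private val sendBatchSize: Int,
    private val timeoutMillis: Long,
    private val flush: (List<T>) -> Unit
) {
    private val buffer = mutableListOf<T>()
    private var firstItemAt = 0L

    // Adds an item and flushes immediately once the batch is full.
    fun add(item: T, nowMillis: Long) {
        if (buffer.isEmpty()) firstItemAt = nowMillis
        buffer += item
        if (buffer.size >= sendBatchSize) flushNow()
    }

    // Called periodically; flushes a partial batch once the timeout has elapsed.
    fun tick(nowMillis: Long) {
        if (buffer.isNotEmpty() && nowMillis - firstItemAt >= timeoutMillis) flushNow()
    }

    private fun flushNow() {
        flush(buffer.toList())
        buffer.clear()
    }
}
```

Larger batches reduce the number of export calls, while a shorter timeout reduces the delay before spans show up in Jaeger.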

Summary

Key points of OpenTelemetry:

  • Auto-instrumentation: Instrument without code changes using the Java Agent
  • Manual instrumentation: Detailed tracking of business logic
  • Context propagation: Connect traces across services
  • Vendor neutral: Send data to various backends

In the next post, we’ll cover distributed tracing across multiple microservices.
