

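Series Introduction

This series walks through building a cloud-native observability stack for production environments.

  1. Part 1: OpenTelemetry Instrumentation (this post)
  2. Part 2: Distributed Tracing Across Microservices
  3. Part 3: Structured Logging and Correlation IDs
  4. Part 4: Metrics and Alerting with Prometheus/Grafana
  5. Part 5: Debugging Production Issues with Observability Data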

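The Black Box Problem in Microservices

A user reports, “My payment won’t go through.”

You open the logs. The API Gateway logs look normal. So does the order service. The payment service… does have an error log, but there is no way to tell whether it belongs to that user’s request.

This is a critical weakness of microservices. As a request passes through multiple services, its traceability is lost.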

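The Three Pillars of Observability

  1. Traces: the journey of a request through a distributed system
  2. Metrics: numerical measurements of the system
  3. Logs: records of events that occur in the system

OpenTelemetry is a standard that unifies all three.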

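Introduction to OpenTelemetry

OpenTelemetry (OTel) is a vendor-neutral open-source project for collecting and exporting traces, metrics, and logs.

Core Concepts

  • Span: a unit of work (e.g., handling an HTTP request, running a DB query)
  • Trace: a tree of related Spans
  • Context: metadata that preserves the relationships between Spans
  • Exporter: sends collected data to a backend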
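
To make these concepts concrete, here is a minimal, library-free sketch of how the spans of one trace form a tree through parent IDs. `SpanRecord` and `renderTree` are hypothetical names invented for this illustration and are not part of the OpenTelemetry API; real spans also carry timestamps, attributes, and a status.

```kotlin
// Hypothetical, simplified model of a span: not the OpenTelemetry API.
data class SpanRecord(
    val traceId: String,        // shared by every span in the same trace
    val spanId: String,         // unique per span
    val parentSpanId: String?,  // null for the root span
    val name: String
)

// Renders the spans of one trace as an indented tree by following parentSpanId links.
fun renderTree(spans: List<SpanRecord>): List<String> {
    val childrenByParent = spans.groupBy { it.parentSpanId }
    val lines = mutableListOf<String>()

    fun visit(span: SpanRecord, depth: Int) {
        lines += "  ".repeat(depth) + span.name
        childrenByParent[span.spanId].orEmpty().forEach { visit(it, depth + 1) }
    }

    // Root spans are the ones without a parent.
    childrenByParent[null].orEmpty().forEach { visit(it, 0) }
    return lines
}

fun main() {
    val traceId = "4bf92f3577b34da6a3ce929d0e0e4736"
    val spans = listOf(
        SpanRecord(traceId, "a1", null, "POST /orders"),
        SpanRecord(traceId, "b2", "a1", "createOrder"),
        SpanRecord(traceId, "c3", "b2", "checkInventory"),
        SpanRecord(traceId, "d4", "b2", "processPayment")
    )
    renderTree(spans).forEach(::println)
}
```

The shared trace ID is what lets a backend group spans emitted by different services, and the parent ID is what lets it draw them as a tree.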

Setting Up OpenTelemetry in Spring Boot 4.x

Adding Dependencies

// build.gradle.kts
dependencies {
    // Spring Boot Actuator
    implementation("org.springframework.boot:spring-boot-starter-actuator")

    // OpenTelemetry Spring Boot Starter (Spring Boot 3.x+)
    implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter:2.1.0")

    // OTLP Exporter
    implementation("io.opentelemetry:opentelemetry-exporter-otlp:1.34.0")

    // Micrometer - OpenTelemetry Bridge
    implementation("io.micrometer:micrometer-tracing-bridge-otel:1.2.2")
}

Application Configuration

# application.yml
spring:
  application:
    name: order-service

otel:
  exporter:
    otlp:
      endpoint: http://localhost:4317
      protocol: grpc
  resource:
    attributes:
      service.name: order-service
      service.version: 1.0.0
      deployment.environment: production
  instrumentation:
    spring-webmvc:
      enabled: true
    spring-webflux:
      enabled: true
    jdbc:
      enabled: true
    kafka:
      enabled: true

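management:
  tracing:
    sampling:
      probability: 1.0  # 0.1 (10%) is recommended in production
  otlp:
    tracing:
      # Spring Boot's own OTLP trace exporter speaks HTTP by default,
      # so this points at the OTLP/HTTP port rather than the gRPC port (4317)
      endpoint: http://localhost:4318/v1/traces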
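
A sampling probability such as 0.1 is usually not applied as an independent coin toss per span. Ratio-based samplers typically derive the decision from the trace ID, so every service reaches the same verdict for a given trace. The following is a simplified sketch of that idea under stated assumptions, not the SDK's actual sampler; `shouldSample` is a hypothetical helper.

```kotlin
// Simplified illustration of trace-ID-based ratio sampling (not the SDK implementation).
// The decision depends only on the trace ID, so it is repeatable for a given trace.
fun shouldSample(traceId: String, probability: Double): Boolean {
    require(traceId.length == 32) { "trace ID must be 32 hex characters" }
    require(probability in 0.0..1.0) { "probability must be between 0.0 and 1.0" }
    if (probability >= 1.0) return true
    if (probability <= 0.0) return false

    // Interpret the low 60 bits (last 15 hex characters) as a non-negative number.
    val bucket = traceId.takeLast(15).toLong(16)
    val upperBound = (probability * (1L shl 60)).toLong()
    return bucket < upperBound
}

fun main() {
    val traceId = "4bf92f3577b34da6a3ce929d0e0e4736"
    println(shouldSample(traceId, 1.0))  // always sampled
    println(shouldSample(traceId, 0.0))  // never sampled
    // The same trace ID always yields the same decision.
    println(shouldSample(traceId, 0.1) == shouldSample(traceId, 0.1))
}
```

Because the verdict is a pure function of the trace ID, a trace is either kept whole or dropped whole instead of losing random spans in the middle.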

Auto-Instrumentation

The OpenTelemetry Java Agent instruments an application automatically, with no code changes.

Using the Java Agent

# Download the Java Agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

# Run the service with the agent attached
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=order-service \
  -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
  -jar order-service.jar

What Gets Instrumented Automatically

  • HTTP clients/servers (RestTemplate, WebClient, Spring MVC)
  • Databases (JDBC, R2DBC, JPA)
  • Messaging (Kafka, RabbitMQ)
  • Caches (Redis)
  • gRPC
  • Many other libraries

Manual Instrumentation

Use manual instrumentation when you need fine-grained visibility into business logic.

Tracer Configuration

@Configuration
class TracingConfig {

    @Bean
    fun tracer(openTelemetry: OpenTelemetry): Tracer {
        return openTelemetry.getTracer("order-service", "1.0.0")
    }
}

Creating Spans Manually

@Service
class OrderService(
    private val tracer: Tracer,
    private val orderRepository: OrderRepository,
    private val paymentClient: PaymentClient,
    private val inventoryClient: InventoryClient
) {
    fun createOrder(request: CreateOrderRequest): Order {
        // Create the parent Span
        val span = tracer.spanBuilder("createOrder")
            .setSpanKind(SpanKind.INTERNAL)
            .setAttribute("order.customer_id", request.customerId)
            .setAttribute("order.item_count", request.items.size.toLong())
            .startSpan()

        return try {
            span.makeCurrent().use { scope ->
                // Check inventory (child Span)
                val inventory = checkInventory(request.items)

                // Create the order (child Span)
                val order = saveOrder(request)

                // Process the payment (child Span)
                processPayment(order)

                span.setAttribute("order.id", order.id)
                span.setStatus(StatusCode.OK)

                order
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Unknown error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }

    private fun checkInventory(items: List<OrderItem>): InventoryResult {
        val span = tracer.spanBuilder("checkInventory")
            .setSpanKind(SpanKind.CLIENT)
            .startSpan()

        return try {
            span.makeCurrent().use {
                val result = inventoryClient.checkAvailability(items)
                span.setAttribute("inventory.available", result.isAvailable)
                result
            }
        } finally {
            span.end()
        }
    }

    private fun saveOrder(request: CreateOrderRequest): Order {
        val span = tracer.spanBuilder("saveOrder")
            .setAttribute("db.system", "postgresql")
            .setAttribute("db.operation", "INSERT")
            .startSpan()

        return try {
            span.makeCurrent().use {
                orderRepository.save(Order.create(request))
            }
        } finally {
            span.end()
        }
    }

    private fun processPayment(order: Order) {
        val span = tracer.spanBuilder("processPayment")
            .setSpanKind(SpanKind.CLIENT)
            .setAttribute("payment.amount", order.totalAmount.toDouble())
            .startSpan()

        try {
            span.makeCurrent().use {
                paymentClient.charge(order.customerId, order.totalAmount)
                span.setStatus(StatusCode.OK)
            }
        } catch (e: PaymentException) {
            span.setStatus(StatusCode.ERROR, "Payment failed")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}
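
The try/catch/finally shape repeats in every method above, so it can be factored into one higher-order function. The sketch below shows only that control flow, using a hypothetical `RecordingSpan` stand-in so it runs without any library; in real code the same pattern would wrap the OpenTelemetry `Span` calls used above.

```kotlin
// Hypothetical stand-in for a span, just enough to show the control flow.
class RecordingSpan(val name: String) {
    var status: String = "UNSET"
        private set
    var ended: Boolean = false
        private set
    val exceptions = mutableListOf<Throwable>()

    fun setOk() { status = "OK" }
    fun setError(description: String) { status = "ERROR: $description" }
    fun recordException(e: Throwable) { exceptions += e }
    fun end() { ended = true }
}

// Runs block inside a span: marks success, records failures, and always ends the span.
inline fun <T> withSpan(span: RecordingSpan, block: (RecordingSpan) -> T): T {
    try {
        val result = block(span)
        span.setOk()
        return result
    } catch (e: Exception) {
        span.setError(e.message ?: "Unknown error")
        span.recordException(e)
        throw e
    } finally {
        span.end()
    }
}

fun main() {
    val ok = RecordingSpan("saveOrder")
    println(withSpan(ok) { 42 })
    println("${ok.status}, ended=${ok.ended}")

    val failed = RecordingSpan("processPayment")
    runCatching { withSpan(failed) { error("card declined") } }
    println("${failed.status}, ended=${failed.ended}")
}
```

The point of the design is that a span is ended on every path, including the failure path, which is the easiest thing to forget when the pattern is copied by hand.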

Annotation-Based Instrumentation

@Aspect
@Component
class TracingAspect(private val tracer: Tracer) {

    @Around("@annotation(traced)")
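    fun traceMethod(joinPoint: ProceedingJoinPoint, traced: Traced): Any? {
        val methodName = joinPoint.signature.name
        val className = joinPoint.target.javaClass.simpleName

        // Prefer the explicit operationName; fall back to Class.method when it is blank
        val spanName = traced.operationName.ifBlank { "$className.$methodName" }

        val span = tracer.spanBuilder(spanName)
            .setSpanKind(SpanKind.INTERNAL)
            .startSpan()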

        return try {
            span.makeCurrent().use {
                joinPoint.proceed()
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}

@Target(AnnotationTarget.FUNCTION)
@Retention(AnnotationRetention.RUNTIME)
annotation class Traced(val operationName: String = "")

Usage example:

@Service
class PaymentService {

    @Traced
    fun processRefund(orderId: String, amount: BigDecimal) {
        // A Span is created automatically by the aspect
    }
}

Context Propagation

Propagation via HTTP Headers

@Configuration
class RestTemplateConfig {

    @Bean
    fun restTemplate(openTelemetry: OpenTelemetry): RestTemplate {
        val restTemplate = RestTemplate()

        // Inject the current trace context into the outgoing request headers
        restTemplate.interceptors.add { request, body, execution ->
            val context = Context.current()
            openTelemetry.propagators.textMapPropagator.inject(
                context,
                request.headers
            ) { carrier, key, value ->
                carrier?.set(key, value)
            }
            execution.execute(request, body)
        }

        return restTemplate
    }
}
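
Assuming the W3C Trace Context propagator is in use (a common default), the header written by `inject` is `traceparent`, formatted as `version-traceid-parentid-flags`. The sketch below is a simplified parser and formatter for illustration only; `TraceParent`, `parse`, and `toHeader` are hypothetical names, it handles only version `00`, and real code should leave this work to the propagator.

```kotlin
// Hypothetical value type for a W3C traceparent header: version-traceid-parentid-flags.
data class TraceParent(val traceId: String, val parentId: String, val sampled: Boolean) {
    fun toHeader(): String = "00-$traceId-$parentId-${if (sampled) "01" else "00"}"

    companion object {
        // Version 00 only: 32 hex chars of trace ID, 16 of parent ID, 2 of flags.
        private val PATTERN = Regex("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

        // Returns null for malformed values or all-zero IDs, which the spec treats as invalid.
        fun parse(header: String): TraceParent? {
            val (traceId, parentId, flags) =
                PATTERN.matchEntire(header.trim())?.destructured ?: return null
            if (traceId.all { it == '0' } || parentId.all { it == '0' }) return null
            val sampled = (flags.toInt(16) and 0x01) == 1
            return TraceParent(traceId, parentId, sampled)
        }
    }
}

fun main() {
    val incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    val parsed = TraceParent.parse(incoming)
    println(parsed)
    // A downstream call keeps the trace ID but sends the caller's own span ID as parent-id.
    println(parsed?.copy(parentId = "b7ad6b7169203331")?.toHeader())
}
```

The trace ID stays constant across every hop while the parent ID changes at each one, which is how the receiving service knows both which trace it belongs to and which span called it.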

Propagation via Kafka

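@Configuration
class KafkaTracingConfig {

    @Bean
    fun kafkaTemplate(
        producerFactory: ProducerFactory<String, String>,
        openTelemetry: OpenTelemetry
    ): KafkaTemplate<String, String> {
        val template = KafkaTemplate(producerFactory)

        // ProducerInterceptor declares several methods, so it needs an object rather than a lambda
        template.setProducerInterceptor(object : ProducerInterceptor<String, String> {
            override fun onSend(record: ProducerRecord<String, String>): ProducerRecord<String, String> {
                // Write the current trace context into the record headers
                openTelemetry.propagators.textMapPropagator.inject(
                    Context.current(),
                    record.headers()
                ) { headers, key, value ->
                    headers?.add(key, value.toByteArray())
                }
                return record
            }

            override fun onAcknowledgement(metadata: RecordMetadata?, exception: Exception?) {}
            override fun close() {}
            override fun configure(configs: MutableMap<String, *>?) {}
        })

        return template
    }
}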

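@Component
class OrderEventConsumer(
    private val tracer: Tracer,
    private val openTelemetry: OpenTelemetry
) {
    // TextMapGetter declares two methods (keys and get), so it cannot be written as a lambda
    private val headerGetter = object : TextMapGetter<MessageHeaders> {
        override fun keys(carrier: MessageHeaders): Iterable<String> = carrier.keys

        override fun get(carrier: MessageHeaders?, key: String): String? =
            when (val value = carrier?.get(key)) {
                is ByteArray -> String(value, Charsets.UTF_8)  // raw Kafka header values are bytes
                else -> value?.toString()
            }
    }

    @KafkaListener(topics = ["order-events"])
    fun handleOrderEvent(
        @Payload payload: String,
        @Headers headers: MessageHeaders
    ) {
        // Extract the parent context from the message headers
        val parentContext = openTelemetry.propagators.textMapPropagator.extract(
            Context.current(),
            headers,
            headerGetter
        )

        val span = tracer.spanBuilder("processOrderEvent")
            .setParent(parentContext)
            .setSpanKind(SpanKind.CONSUMER)
            .startSpan()

        try {
            span.makeCurrent().use {
                // Handle the event (processEvent is assumed to be defined elsewhere)
                processEvent(payload)
            }
        } finally {
            span.end()
        }
    }
}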

Local Development Environment

Setting Up the Observability Stack with Docker Compose

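version: '3.8'
services:
  # Jaeger - distributed tracing
  jaeger:
    image: jaegertracing/all-in-one:1.53
    ports:
      - "16686:16686"  # UI
      # The OTLP ports are not published to the host here: the Collector below owns
      # host ports 4317/4318 and reaches Jaeger over the Compose network at jaeger:4317.
      # If you run Jaeger without the Collector, publish "4317:4317" and "4318:4318" here instead.
    environment:
      - COLLECTOR_OTLP_ENABLED=true

  # OpenTelemetry Collector (optional)
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.92.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"    # OTLP gRPC
      - "4318:4318"    # OTLP HTTP
    depends_on:
      - jaeger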

OTel Collector Configuration

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  # The debug exporter supersedes the deprecated logging exporter
  debug:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger, debug]
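
The `batch` processor in this configuration sends a batch when either limit is reached: `send_batch_size` items have accumulated, or `timeout` has elapsed. The sketch below is a single-threaded model of that size-or-timeout rule for illustration, not the Collector's implementation; `Batcher`, `add`, and `tick` are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```kotlin
// Hypothetical, single-threaded model of size-or-timeout batching (not the Collector's code).
class Batcher<T>(private val maxBatchSize: Int, private val timeoutMillis: Long) {
    private val buffer = mutableListOf<T>()
    private var firstItemAt = 0L

    // Adds an item; returns a batch to send once the size limit is reached, else null.
    fun add(item: T, nowMillis: Long): List<T>? {
        if (buffer.isEmpty()) firstItemAt = nowMillis
        buffer += item
        return if (buffer.size >= maxBatchSize) drain() else null
    }

    // Called periodically; returns a partial batch once the oldest item has waited long enough.
    fun tick(nowMillis: Long): List<T>? =
        if (buffer.isNotEmpty() && nowMillis - firstItemAt >= timeoutMillis) drain() else null

    // Copies the buffered items out and resets the buffer.
    private fun drain(): List<T> = buffer.toList().also { buffer.clear() }
}

fun main() {
    val batcher = Batcher<String>(maxBatchSize = 2, timeoutMillis = 1000)
    println(batcher.add("span-1", nowMillis = 0))     // buffered, nothing to send yet
    println(batcher.add("span-2", nowMillis = 10))    // size limit reached, batch is returned
    println(batcher.add("span-3", nowMillis = 20))    // buffered again
    println(batcher.tick(nowMillis = 1020))           // timeout reached, partial batch is returned
}
```

A larger batch size reduces export overhead under load, while the timeout bounds how long a span can sit in the buffer when traffic is light.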

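Summary

Key points of OpenTelemetry:

Item                     Description
Auto-instrumentation     Instrument with the Java Agent, with no code changes
Manual instrumentation   Detailed tracing of business logic
Context propagation      Links traces across services
Vendor neutrality        Data can be sent to a variety of backends

The next post covers distributed tracing across multiple microservices.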

Series Introduction

This series covers how to build a cloud-native observability stack for production environments.

  1. Part 1: OpenTelemetry Instrumentation (Current)
  2. Part 2: Distributed Tracing Across Microservices
  3. Part 3: Structured Logging with Correlation IDs
  4. Part 4: Metrics and Alerting with Prometheus/Grafana
  5. Part 5: Debugging Production Issues with Observability Data

The Black Box Problem in Microservices

A user reports “Payment isn’t working.”

You check the logs. API Gateway logs are fine. Order service is fine. Payment service… there’s an error log, but you can’t tell if it’s from that user’s request.

This is a critical weakness of microservices. As requests pass through multiple services, traceability disappears.

The Three Pillars of Observability

  1. Traces: The journey of a request through a distributed system
  2. Metrics: Numerical measurements of the system
  3. Logs: Records of events occurring in the system

OpenTelemetry is the standard that unifies all three.

Introduction to OpenTelemetry

OpenTelemetry (OTel) is a vendor-neutral open-source project for collecting and exporting traces, metrics, and logs.

Core Concepts

  • Span: A unit of work (e.g., HTTP request handling, DB query)
  • Trace: A tree structure of related Spans
  • Context: Metadata that maintains relationships between Spans
  • Exporter: Sends collected data to backends

Setting Up OpenTelemetry in Spring Boot 4.x

Adding Dependencies

// build.gradle.kts
dependencies {
    // Spring Boot Actuator
    implementation("org.springframework.boot:spring-boot-starter-actuator")

    // OpenTelemetry Spring Boot Starter (Spring Boot 3.x+)
    implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter:2.1.0")

    // OTLP Exporter
    implementation("io.opentelemetry:opentelemetry-exporter-otlp:1.34.0")

    // Micrometer - OpenTelemetry Bridge
    implementation("io.micrometer:micrometer-tracing-bridge-otel:1.2.2")
}

Application Configuration

# application.yml
spring:
  application:
    name: order-service

otel:
  exporter:
    otlp:
      endpoint: http://localhost:4317
      protocol: grpc
  resource:
    attributes:
      service.name: order-service
      service.version: 1.0.0
      deployment.environment: production

management:
  tracing:
    sampling:
      probability: 1.0  # Recommend 0.1 (10%) for production

Auto-Instrumentation

The OpenTelemetry Java Agent instruments an application automatically, with no code changes.

Using the Java Agent

# Download Java Agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

# Run
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=order-service \
  -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
  -jar order-service.jar

Manual Instrumentation

Use manual instrumentation when you need detailed observation of business logic.

Manual Span Creation

@Service
class OrderService(
    private val tracer: Tracer,
    private val orderRepository: OrderRepository
) {
    fun createOrder(request: CreateOrderRequest): Order {
        val span = tracer.spanBuilder("createOrder")
            .setSpanKind(SpanKind.INTERNAL)
            .setAttribute("order.customer_id", request.customerId)
            .startSpan()

        return try {
            span.makeCurrent().use { scope ->
                val order = saveOrder(request)
                span.setAttribute("order.id", order.id)
                span.setStatus(StatusCode.OK)
                order
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Unknown error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}

Context Propagation

Propagation via HTTP Headers

@Configuration
class RestTemplateConfig {

    @Bean
    fun restTemplate(openTelemetry: OpenTelemetry): RestTemplate {
        val restTemplate = RestTemplate()

        restTemplate.interceptors.add { request, body, execution ->
            val context = Context.current()
            openTelemetry.propagators.textMapPropagator.inject(
                context,
                request.headers
            ) { carrier, key, value ->
                carrier?.set(key, value)
            }
            execution.execute(request, body)
        }

        return restTemplate
    }
}

Summary

Key points of OpenTelemetry:

Item                     Description
Auto-instrumentation     Instrument without code changes using Java Agent
Manual instrumentation   Detailed tracking of business logic
Context propagation      Connect traces across services
Vendor neutral           Send data to various backends

In the next post, we’ll cover distributed tracing across multiple microservices.
