Cloud-Native Observability Stack Part 1 - OpenTelemetry Instrumentation in Spring Boot
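Series Introduction
This series covers how to build the cloud-native Observability stack that production environments need.
- Part 1: OpenTelemetry Instrumentation (this post)
- Part 2: Distributed Tracing Across Microservices
- Part 3: Structured Logging with Correlation IDs
- Part 4: Metrics and Alerting with Prometheus/Grafana
- Part 5: Debugging Production Issues with Observability Data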
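The Black Box Problem in Microservices
A user reports, "Payment isn't working."
You open the logs. The API Gateway logs look fine. The order service looks fine. The payment service... has an error log, but you can't tell whether it belongs to that user's request.
This is a critical weakness of microservices: as a request passes through multiple services, traceability disappears.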
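The Three Pillars of Observability
- Traces: The journey of a request through a distributed system
- Metrics: Numerical measurements of the system
- Logs: Records of events occurring in the system
OpenTelemetry is the standard that unifies all three.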
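Introduction to OpenTelemetry
OpenTelemetry (OTel) is a vendor-neutral open-source project for collecting and exporting traces, metrics, and logs.
Core Concepts
- Span: A unit of work (e.g., handling an HTTP request, running a DB query)
- Trace: A tree of related Spans
- Context: Metadata that maintains the relationships between Spans
- Exporter: Sends collected data to a backend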
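To make the Span/Trace relationship concrete, here is a minimal sketch in plain Kotlin. `SpanRecord` and `renderTrace` are hypothetical names used only for illustration and are not OpenTelemetry SDK types. The sketch assumes each span carries its own ID plus the ID of its parent, which is enough to rebuild the tree.

```kotlin
// Simplified, hypothetical model of a span; not an OpenTelemetry SDK type.
data class SpanRecord(
    val traceId: String,
    val spanId: String,
    val parentSpanId: String?, // null marks the root span of the trace
    val name: String
)

// Groups spans by parent ID and renders the trace as an indented tree.
fun renderTrace(spans: List<SpanRecord>): List<String> {
    val childrenByParent = spans.groupBy { it.parentSpanId }
    val lines = mutableListOf<String>()

    fun visit(span: SpanRecord, depth: Int) {
        lines += "  ".repeat(depth) + span.name
        childrenByParent[span.spanId].orEmpty().forEach { visit(it, depth + 1) }
    }

    // Root spans are the ones without a parent.
    childrenByParent[null].orEmpty().forEach { visit(it, 0) }
    return lines
}
```

Calling `renderTrace` with a root span and its descendants returns one line per span, indented by depth, which is roughly the shape a tracing UI shows for a single request.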
Setting Up OpenTelemetry in Spring Boot 3.x
Adding Dependencies
// build.gradle.kts
dependencies {
    // Spring Boot Actuator
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    // OpenTelemetry Spring Boot Starter (Spring Boot 3.x+)
    implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter:2.1.0")
    // OTLP Exporter
    implementation("io.opentelemetry:opentelemetry-exporter-otlp:1.34.0")
    // Micrometer - OpenTelemetry Bridge
    implementation("io.micrometer:micrometer-tracing-bridge-otel:1.2.2")
}
Application Configuration
# application.yml
spring:
  application:
    name: order-service
otel:
  exporter:
    otlp:
      endpoint: http://localhost:4317
      protocol: grpc
  resource:
    attributes:
      service.name: order-service
      service.version: 1.0.0
      deployment.environment: production
  instrumentation:
    spring-webmvc:
      enabled: true
    spring-webflux:
      enabled: true
    jdbc:
      enabled: true
    kafka:
      enabled: true
management:
  tracing:
    sampling:
      probability: 1.0 # 0.1 (10%) recommended for production
  otlp:
    tracing:
      # Spring Boot's own OTLP exporter sends over HTTP, so this points at the OTLP/HTTP port and path
      endpoint: http://localhost:4318/v1/traces
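The `sampling.probability` setting is easier to reason about with a small model of how a ratio-based sampler can decide. The sketch below is a simplified illustration in plain Kotlin, not the SDK's actual sampler: `shouldSample` is a hypothetical function, and the exact arithmetic is an assumption. The idea it shows is that the decision is derived from the trace ID itself, so every service that sees the same trace ID reaches the same decision.

```kotlin
// Simplified sketch of ratio-based sampling; the real SDK sampler differs in detail.
fun shouldSample(traceId: String, probability: Double): Boolean {
    require(traceId.length == 32) { "trace ID must be 32 hex characters" }
    require(probability in 0.0..1.0) { "probability must be between 0.0 and 1.0" }
    if (probability == 1.0) return true

    // Treat the last 16 hex characters (8 bytes) of the trace ID as a pseudo-random
    // number and clear the sign bit so the value is never negative.
    val randomPart = traceId.substring(16).toULong(16).toLong() and Long.MAX_VALUE

    // Keep the trace if that number falls below the configured fraction of the range.
    val threshold = (probability * Long.MAX_VALUE.toDouble()).toLong()
    return randomPart < threshold
}
```

With `probability = 0.1`, roughly one trace in ten falls below the threshold. Because the input is the trace ID rather than a fresh random number, a trace is either kept everywhere or dropped everywhere, which avoids traces with missing spans.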
Auto-Instrumentation
The OpenTelemetry Java Agent instruments the application automatically, with no code changes.
Using the Java Agent
# Download the Java Agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar
# Run
# Agent 2.x defaults to http/protobuf, so gRPC is set explicitly to match port 4317
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=order-service \
  -Dotel.exporter.otlp.protocol=grpc \
  -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
  -jar order-service.jar
What Gets Instrumented Automatically
- HTTP clients/servers (RestTemplate, WebClient, Spring MVC)
- Databases (JDBC, R2DBC, JPA)
- Messaging (Kafka, RabbitMQ)
- Caches (Redis)
- gRPC
- Many other libraries
Manual Instrumentation
Use manual instrumentation when you need fine-grained visibility into business logic.
Tracer Configuration
@Configuration
class TracingConfig {
    @Bean
    fun tracer(openTelemetry: OpenTelemetry): Tracer {
        return openTelemetry.getTracer("order-service", "1.0.0")
    }
}
Manual Span Creation
@Service
class OrderService(
    private val tracer: Tracer,
    private val orderRepository: OrderRepository,
    private val paymentClient: PaymentClient,
    private val inventoryClient: InventoryClient
) {
    fun createOrder(request: CreateOrderRequest): Order {
        // Create the parent Span
        val span = tracer.spanBuilder("createOrder")
            .setSpanKind(SpanKind.INTERNAL)
            .setAttribute("order.customer_id", request.customerId)
            .setAttribute("order.item_count", request.items.size.toLong())
            .startSpan()
        return try {
            // makeCurrent() puts the span in the current Context so the spans below become its children
            span.makeCurrent().use {
                // Check inventory (child Span)
                checkInventory(request.items)
                // Save the order (child Span)
                val order = saveOrder(request)
                // Process payment (child Span)
                processPayment(order)
                span.setAttribute("order.id", order.id)
                span.setStatus(StatusCode.OK)
                order
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Unknown error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }

    private fun checkInventory(items: List<OrderItem>): InventoryResult {
        val span = tracer.spanBuilder("checkInventory")
            .setSpanKind(SpanKind.CLIENT)
            .startSpan()
        return try {
            span.makeCurrent().use {
                val result = inventoryClient.checkAvailability(items)
                span.setAttribute("inventory.available", result.isAvailable)
                result
            }
        } finally {
            span.end()
        }
    }

    private fun saveOrder(request: CreateOrderRequest): Order {
        val span = tracer.spanBuilder("saveOrder")
            .setAttribute("db.system", "postgresql")
            .setAttribute("db.operation", "INSERT")
            .startSpan()
        return try {
            span.makeCurrent().use {
                orderRepository.save(Order.create(request))
            }
        } finally {
            span.end()
        }
    }

    private fun processPayment(order: Order) {
        val span = tracer.spanBuilder("processPayment")
            .setSpanKind(SpanKind.CLIENT)
            .setAttribute("payment.amount", order.totalAmount.toDouble())
            .startSpan()
        try {
            span.makeCurrent().use {
                paymentClient.charge(order.customerId, order.totalAmount)
                span.setStatus(StatusCode.OK)
            }
        } catch (e: PaymentException) {
            span.setStatus(StatusCode.ERROR, "Payment failed")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}
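The start/try/catch/finally pattern repeats for every span above, so it can be factored out. The following is a minimal sketch of one way to do that. `withSpan` is a hypothetical extension function, not part of the OpenTelemetry API; it only uses `spanBuilder`, `startSpan`, `makeCurrent`, `setStatus`, `recordException`, and `end`, the same API calls that already appear in the service code.

```kotlin
import io.opentelemetry.api.trace.Span
import io.opentelemetry.api.trace.StatusCode
import io.opentelemetry.api.trace.Tracer

// Hypothetical helper (not an OpenTelemetry API): runs `block` inside a new span,
// records any exception on the span, and always ends it.
inline fun <T> Tracer.withSpan(name: String, block: (Span) -> T): T {
    val span = spanBuilder(name).startSpan()
    return try {
        // The scope returned by makeCurrent() is closed by use() when the block finishes.
        span.makeCurrent().use { block(span) }
    } catch (e: Exception) {
        span.setStatus(StatusCode.ERROR, e.message ?: "Unknown error")
        span.recordException(e)
        throw e
    } finally {
        span.end()
    }
}
```

With such a helper, `saveOrder` could shrink to something like `tracer.withSpan("saveOrder") { span -> ... }`, with attributes set on the `span` argument. Whether that is worth it depends on how many hand-written spans the codebase ends up with.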
Annotation-Based Instrumentation
@Aspect
@Component
class TracingAspect(private val tracer: Tracer) {
    @Around("@annotation(traced)")
    fun traceMethod(joinPoint: ProceedingJoinPoint, traced: Traced): Any? {
        val methodName = joinPoint.signature.name
        val className = joinPoint.target.javaClass.simpleName
        // Use the name given on the annotation, falling back to Class.method
        val spanName = traced.operationName.ifBlank { "$className.$methodName" }
        val span = tracer.spanBuilder(spanName)
            .setSpanKind(SpanKind.INTERNAL)
            .startSpan()
        return try {
            span.makeCurrent().use {
                joinPoint.proceed()
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}

@Target(AnnotationTarget.FUNCTION)
@Retention(AnnotationRetention.RUNTIME)
annotation class Traced(val operationName: String = "")
Usage example:
@Service
class PaymentService {
    @Traced
    fun processRefund(orderId: String, amount: BigDecimal) {
        // A Span is created automatically
    }
}
Context Propagation
Propagation via HTTP Headers
@Configuration
class RestTemplateConfig {
    @Bean
    fun restTemplate(openTelemetry: OpenTelemetry): RestTemplate {
        val restTemplate = RestTemplate()
        // Inject the current trace context into the outgoing request headers.
        // Auto-instrumentation covers RestTemplate too; this shows the mechanism explicitly.
        restTemplate.interceptors.add(ClientHttpRequestInterceptor { request, body, execution ->
            val context = Context.current()
            openTelemetry.propagators.textMapPropagator.inject(
                context,
                request.headers
            ) { carrier, key, value ->
                carrier?.set(key, value)
            }
            execution.execute(request, body)
        })
        return restTemplate
    }
}
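What the propagator writes into those headers depends on the configured format. Assuming the W3C Trace Context format, which is a common default in OpenTelemetry setups, the key header is `traceparent`. The sketch below builds and parses that header in plain Kotlin to show what travels between services. `TraceParent` and `parseTraceParent` are hypothetical names for illustration, and only version `00` of the format is handled.

```kotlin
// Hypothetical holder for the fields of a W3C `traceparent` header value.
data class TraceParent(val traceId: String, val parentSpanId: String, val sampled: Boolean) {
    // Renders version "00", both IDs, and the trace flags as a single header value.
    fun toHeader(): String = "00-$traceId-$parentSpanId-${if (sampled) "01" else "00"}"
}

// version "00" - 32 hex trace ID - 16 hex parent span ID - 2 hex trace flags
val traceParentPattern = Regex("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

// Returns null for malformed values and for all-zero IDs, which the format treats as invalid.
fun parseTraceParent(header: String): TraceParent? {
    val (traceId, spanId, flags) =
        traceParentPattern.matchEntire(header.trim())?.destructured ?: return null
    if (traceId.all { it == '0' } || spanId.all { it == '0' }) return null
    // The lowest bit of the flags byte is the "sampled" flag.
    val sampled = (flags.toInt(16) and 0x01) == 1
    return TraceParent(traceId, spanId, sampled)
}
```

The receiving service reads the trace ID and parent span ID from this value and starts its own span as a child, which is how spans from different processes end up in one trace.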
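Propagation via Kafka
@Configuration
class KafkaTracingConfig {
    @Bean
    fun kafkaTemplate(
        producerFactory: ProducerFactory<String, String>,
        openTelemetry: OpenTelemetry
    ): KafkaTemplate<String, String> {
        val template = KafkaTemplate(producerFactory)
        // ProducerInterceptor declares several methods, so it cannot be written as a lambda
        template.setProducerInterceptor(object : ProducerInterceptor<String, String> {
            override fun onSend(record: ProducerRecord<String, String>): ProducerRecord<String, String> {
                // Write the current trace context into the record headers
                openTelemetry.propagators.textMapPropagator.inject(
                    Context.current(),
                    record.headers()
                ) { headers, key, value ->
                    headers?.add(key, value.toByteArray())
                }
                return record
            }

            override fun onAcknowledgement(metadata: RecordMetadata?, exception: Exception?) {}
            override fun close() {}
            override fun configure(configs: MutableMap<String, *>?) {}
        })
        return template
    }
}

@Component
class OrderEventConsumer(
    private val tracer: Tracer,
    private val openTelemetry: OpenTelemetry
) {
    // TextMapGetter declares both keys() and get(), so it needs an object rather than a lambda
    private val headerGetter = object : TextMapGetter<MessageHeaders> {
        override fun keys(carrier: MessageHeaders): Iterable<String> = carrier.keys

        override fun get(carrier: MessageHeaders?, key: String): String? =
            when (val value = carrier?.get(key)) {
                // Kafka header values may arrive as raw bytes; decode them instead of calling toString()
                is ByteArray -> String(value, Charsets.UTF_8)
                else -> value?.toString()
            }
    }

    @KafkaListener(topics = ["order-events"])
    fun handleOrderEvent(
        @Payload payload: String,
        @Headers headers: MessageHeaders
    ) {
        // Extract the parent context
        val parentContext = openTelemetry.propagators.textMapPropagator
            .extract(Context.current(), headers, headerGetter)
        val span = tracer.spanBuilder("processOrderEvent")
            .setParent(parentContext)
            .setSpanKind(SpanKind.CONSUMER)
            .startSpan()
        try {
            span.makeCurrent().use {
                // Handle the event (processEvent is the application's own handler, not shown)
                processEvent(payload)
            }
        } finally {
            span.end()
        }
    }
}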
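The inject and extract halves are easier to verify together. The sketch below round-trips a trace context through a byte-valued header map using the W3C propagator from the OpenTelemetry API. The `MutableMap<String, ByteArray>` carrier is an assumption that stands in for Kafka record headers, and `byteHeaderSetter`, `byteHeaderGetter`, and `roundTripTraceId` are hypothetical names for illustration.

```kotlin
import io.opentelemetry.api.trace.Span
import io.opentelemetry.api.trace.SpanContext
import io.opentelemetry.api.trace.TraceFlags
import io.opentelemetry.api.trace.TraceState
import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator
import io.opentelemetry.context.Context
import io.opentelemetry.context.propagation.TextMapGetter
import io.opentelemetry.context.propagation.TextMapSetter

// Writes each propagation header into the map as UTF-8 bytes (stand-in for Kafka headers).
val byteHeaderSetter = TextMapSetter<MutableMap<String, ByteArray>> { carrier, key, value ->
    carrier?.put(key, value.toByteArray(Charsets.UTF_8))
}

// Reads headers back, decoding the bytes to text.
val byteHeaderGetter = object : TextMapGetter<MutableMap<String, ByteArray>> {
    override fun keys(carrier: MutableMap<String, ByteArray>): Iterable<String> = carrier.keys

    override fun get(carrier: MutableMap<String, ByteArray>?, key: String): String? =
        carrier?.get(key)?.toString(Charsets.UTF_8)
}

// Injects a known span context on the "producer" side, extracts it on the
// "consumer" side, and returns the trace ID the consumer would continue.
fun roundTripTraceId(traceId: String, spanId: String): String {
    val propagator = W3CTraceContextPropagator.getInstance()
    val producerSpan = Span.wrap(
        SpanContext.create(traceId, spanId, TraceFlags.getSampled(), TraceState.getDefault())
    )
    val headers = mutableMapOf<String, ByteArray>()
    propagator.inject(Context.root().with(producerSpan), headers, byteHeaderSetter)
    val extracted = propagator.extract(Context.root(), headers, byteHeaderGetter)
    return Span.fromContext(extracted).spanContext.traceId
}
```

If the getter decoded the bytes incorrectly, the extracted context would be invalid and the consumer span would start a new trace instead of joining the producer's, so a round-trip check like this is a cheap way to catch that class of bug.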
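Local Development Environment
Setting Up the Observability Stack with Docker Compose
version: '3.8'
services:
  # Jaeger - distributed tracing
  jaeger:
    image: jaegertracing/all-in-one:1.53
    ports:
      - "16686:16686" # UI
    # OTLP ports 4317/4318 are not published here because the collector below binds them on the host.
    # The collector reaches Jaeger at jaeger:4317 over the Compose network.
    # If you drop the collector, publish 4317 and 4318 on this service instead.
    environment:
      - COLLECTOR_OTLP_ENABLED=true
  # OpenTelemetry Collector (optional)
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.92.0
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317" # OTLP gRPC
      - "4318:4318" # OTLP HTTP
    depends_on:
      - jaeger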
OTel Collector Configuration
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  # The debug exporter replaces the deprecated logging exporter and its loglevel setting
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger, debug]
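Summary
Key points of OpenTelemetry:
| Item | Description |
|---|---|
| Auto-instrumentation | Instrument with the Java Agent, without code changes |
| Manual instrumentation | Detailed tracing of business logic |
| Context propagation | Connects traces across services |
| Vendor neutral | Data can be sent to a variety of backends |
The next post covers distributed tracing across multiple microservices.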
Series Introduction
This series covers how to build a cloud-native Observability stack needed in production environments.
- Part 1: OpenTelemetry Instrumentation (Current)
- Part 2: Distributed Tracing Across Microservices
- Part 3: Structured Logging with Correlation IDs
- Part 4: Metrics and Alerting with Prometheus/Grafana
- Part 5: Debugging Production Issues with Observability Data
The Black Box Problem in Microservices
A user reports “Payment isn’t working.”
You check the logs. API Gateway logs are fine. Order service is fine. Payment service… there’s an error log, but you can’t tell if it’s from that user’s request.
This is a critical weakness of microservices. As requests pass through multiple services, traceability disappears.
The Three Pillars of Observability
- Traces: The journey of a request through a distributed system
- Metrics: Numerical measurements of the system
- Logs: Records of events occurring in the system
OpenTelemetry is the standard that unifies all three.
Introduction to OpenTelemetry
OpenTelemetry (OTel) is a vendor-neutral open-source project for collecting and exporting traces, metrics, and logs.
Core Concepts
- Span: A unit of work (e.g., HTTP request handling, DB query)
- Trace: A tree structure of related Spans
- Context: Metadata that maintains relationships between Spans
- Exporter: Sends collected data to backends
Setting Up OpenTelemetry in Spring Boot 3.x
Adding Dependencies
// build.gradle.kts
dependencies {
    // Spring Boot Actuator
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    // OpenTelemetry Spring Boot Starter (Spring Boot 3.x+)
    implementation("io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter:2.1.0")
    // OTLP Exporter
    implementation("io.opentelemetry:opentelemetry-exporter-otlp:1.34.0")
    // Micrometer - OpenTelemetry Bridge
    implementation("io.micrometer:micrometer-tracing-bridge-otel:1.2.2")
}
Application Configuration
# application.yml
spring:
  application:
    name: order-service
otel:
  exporter:
    otlp:
      endpoint: http://localhost:4317
      protocol: grpc
  resource:
    attributes:
      service.name: order-service
      service.version: 1.0.0
      deployment.environment: production
management:
  tracing:
    sampling:
      probability: 1.0 # 0.1 (10%) recommended for production
Auto-Instrumentation
The OpenTelemetry Java Agent instruments the application automatically, with no code changes.
Using the Java Agent
# Download the Java Agent
wget https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar
# Run
# Agent 2.x defaults to http/protobuf, so gRPC is set explicitly to match port 4317
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=order-service \
  -Dotel.exporter.otlp.protocol=grpc \
  -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
  -jar order-service.jar
Manual Instrumentation
Use manual instrumentation when you need detailed observation of business logic.
Manual Span Creation
@Service
class OrderService(
    private val tracer: Tracer,
    private val orderRepository: OrderRepository
) {
    fun createOrder(request: CreateOrderRequest): Order {
        val span = tracer.spanBuilder("createOrder")
            .setSpanKind(SpanKind.INTERNAL)
            .setAttribute("order.customer_id", request.customerId)
            .startSpan()
        return try {
            span.makeCurrent().use {
                val order = saveOrder(request)
                span.setAttribute("order.id", order.id)
                span.setStatus(StatusCode.OK)
                order
            }
        } catch (e: Exception) {
            span.setStatus(StatusCode.ERROR, e.message ?: "Unknown error")
            span.recordException(e)
            throw e
        } finally {
            span.end()
        }
    }
}
Context Propagation
Propagation via HTTP Headers
@Configuration
class RestTemplateConfig {
    @Bean
    fun restTemplate(openTelemetry: OpenTelemetry): RestTemplate {
        val restTemplate = RestTemplate()
        restTemplate.interceptors.add(ClientHttpRequestInterceptor { request, body, execution ->
            val context = Context.current()
            openTelemetry.propagators.textMapPropagator.inject(
                context,
                request.headers
            ) { carrier, key, value ->
                carrier?.set(key, value)
            }
            execution.execute(request, body)
        })
        return restTemplate
    }
}
Summary
Key points of OpenTelemetry:
| Item | Description |
|---|---|
| Auto-instrumentation | Instrument without code changes using Java Agent |
| Manual instrumentation | Detailed tracking of business logic |
| Context propagation | Connect traces across services |
| Vendor neutral | Send data to various backends |
In the next post, we’ll cover distributed tracing across multiple microservices.