

This post draws on the book "Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications", and I wrote it after setting up the development environment myself.

Everything here is Docker-based, so I assume Docker is already installed and that you know the basics of using it.

Flink is often described as the next-generation big data analysis framework after Spark. It is a strong choice when you need to process streaming data with low latency and also need powerful state management.

For this post, I ran Apache Flink with Docker, which is relatively easy to install and set up. A Flink cluster consists of two kinds of processes:

  • Master Process

    -> Job Manager

    -> Flink’s master node: it manages the Workers that execute Tasks within Flink.

    -> Schedules Tasks to control when they run, and monitors their execution status.

    -> Handles recovery when problems occur.

  • Worker Process

    -> Task Manager

    -> Processes the Tasks assigned by the Master Process.

    -> Responsible for executing the Job application you wrote (a JAR file).

As explained above, run one Master Process, then run as many Worker Processes as your setup calls for. (If you have many jobs to run, add more workers.)

I use an M1 Mac, so I’ve listed commands for both Intel and M1. (Instructions for M1 were still hard to find.) On Windows, follow the Intel commands; on Linux, use the amd64 image. (Image tags may differ slightly.)
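
The image choice described above can also be scripted. The following is only a minimal sketch: `flink_image_for` and `FLINK_IMAGE` are hypothetical names introduced here for illustration, and the mapping simply mirrors the image names used in the commands in this post.

```shell
# Hypothetical helper (not part of Flink or Docker): choose the image name
# used in this post from the OS name and CPU architecture.
flink_image_for() {
  # $1 = OS name as printed by `uname -s`, $2 = architecture as printed by `uname -m`
  case "$2" in
    arm64|aarch64)
      echo "arm64v8/flink" ;;          # M1 Mac
    x86_64|amd64)
      if [ "$1" = "Linux" ]; then
        echo "amd64/flink"             # Linux
      else
        echo "flink"                   # Intel Mac (and Windows)
      fi ;;
    *)
      echo "flink" ;;                  # assumption: fall back to the default image
  esac
}

# Detect the current machine and report which image would be used.
FLINK_IMAGE="$(flink_image_for "$(uname -s)" "$(uname -m)")"
echo "Using image: $FLINK_IMAGE"
```

The resulting `$FLINK_IMAGE` value could then stand in for the image name in the `docker run` commands, so one script covers all three platforms.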

Intel Mac

  • Master Process (Job Manager)

      docker run -d --name flink-jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager -p 8081:8081 flink jobmanager
    
  • Worker Process (Task Manager)

      # task manager 1
      docker run -d --name flink-taskmanager-1 --link flink-jobmanager:jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager flink taskmanager
    
      # task manager 2
      docker run -d --name flink-taskmanager-2 --link flink-jobmanager:jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager flink taskmanager
    

M1 Mac

  • Master Process (Job Manager)

      docker run -d --name flink-jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager -p 8081:8081 arm64v8/flink jobmanager
    
  • Worker Process (Task Manager)

      # task manager 1
      docker run -d --name flink-taskmanager-1 --link flink-jobmanager:jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager arm64v8/flink taskmanager
    
      # task manager 2
      docker run -d --name flink-taskmanager-2 --link flink-jobmanager:jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager arm64v8/flink taskmanager
    

Linux

  • Master Process (Job Manager)

      docker run -d --name flink-jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager -p 8081:8081 amd64/flink jobmanager
    
  • Worker Process (Task Manager)

      # task manager 1
      docker run -d --name flink-taskmanager-1 --link flink-jobmanager:jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager amd64/flink taskmanager
    
      # task manager 2
      docker run -d --name flink-taskmanager-2 --link flink-jobmanager:jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager amd64/flink taskmanager
    

Result Screen

I ran three task managers in total (flink-taskmanager-1 through flink-taskmanager-3); the third uses the same task manager command with the name flink-taskmanager-3.

I use an M1 Mac, and the Flink version available for M1 through Docker appeared to be somewhat older. If you download Flink from the official Flink website instead of using Docker, it is version 1.13.0, so the screen layout is slightly different.
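
Since the task manager commands differ only in the container name, they can be generated in a loop. This is a rough sketch rather than a definitive script: `start_taskmanager`, `FLINK_IMAGE`, and `DRY_RUN` are names made up for this example, and by default it only prints the commands instead of running them.

```shell
# Names introduced for illustration only.
FLINK_IMAGE="${FLINK_IMAGE:-flink}"   # e.g. arm64v8/flink on an M1 Mac
DRY_RUN="${DRY_RUN:-1}"               # 1 = print the command, anything else = run it

# Print or run the docker command for task manager number $1.
start_taskmanager() {
  name="flink-taskmanager-$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "docker run -d --name $name --link flink-jobmanager:jobmanager -e JOB_MANAGER_RPC_ADDRESS=jobmanager $FLINK_IMAGE taskmanager"
  else
    docker run -d --name "$name" \
      --link flink-jobmanager:jobmanager \
      -e JOB_MANAGER_RPC_ADDRESS=jobmanager \
      "$FLINK_IMAGE" taskmanager
  fi
}

# Three task managers, matching the setup described in this post.
for i in 1 2 3; do
  start_taskmanager "$i"
done
```

Setting `DRY_RUN=0` before running the script would actually start the containers, assuming the job manager container is already up under the name flink-jobmanager.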
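
Because the web UI looks different between versions, it can be handy to check the cluster from the command line as well. The sketch below assumes the job manager's REST endpoint `/overview` is reachable on the published port 8081 and returns JSON containing a `taskmanagers` field; `taskmanager_count` and `FLINK_URL` are hypothetical names for this example, and the `sed` pattern is a simple text match, not a real JSON parser.

```shell
# Hypothetical helper: read JSON on stdin and print the number after "taskmanagers".
taskmanager_count() {
  sed -n 's/.*"taskmanagers":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Assumption: the job manager was started with -p 8081:8081 as in this post.
FLINK_URL="${FLINK_URL:-http://localhost:8081}"

if overview="$(curl -s --max-time 5 "$FLINK_URL/overview")" && [ -n "$overview" ]; then
  echo "Registered task managers: $(echo "$overview" | taskmanager_count)"
else
  echo "Job manager not reachable at $FLINK_URL"
fi
```

If the printed count is lower than the number of task manager containers you started, `docker logs` on one of those containers is a reasonable next place to look.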
