Setting Up a Secured (Kerberized) YARN Cluster

Overview

This post records how to apply Kerberos to YARN.

Configurations

yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>test-cluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop1.mysite.com</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop2.mysite.com</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>hadoop1.mysite.com:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>hadoop1.mysite.com:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>hadoop1.mysite.com:8050</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop1.mysite.com:8055</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>hadoop2.mysite.com:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>hadoop2.mysite.com:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>hadoop2.mysite.com:8050</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop2.mysite.com:8055</value>
    </property>
    <property>
        <name>hadoop.zk.address</name>
        <value>hadoop1.mysite.com:2181,hadoop2.mysite.com:2181,hadoop3.mysite.com:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>
<!-- ResourceManager security settings -->
    <property>
        <name>yarn.resourcemanager.keytab</name>
        <value>/etc/hdfs/conf/hdfs.keytab</value>
    </property>
    <property>
        <name>yarn.resourcemanager.principal</name>
        <value>yarn/_HOST@CHAOS.ORDER.COM</value>
    </property>
<!-- NodeManager security settings -->
    <property>
        <name>yarn.nodemanager.keytab</name>
        <value>/etc/hdfs/conf/hdfs.keytab</value>
    </property>
    <property>
        <name>yarn.nodemanager.principal</name>
        <value>yarn/_HOST@CHAOS.ORDER.COM</value>
    </property>
    <property>
        <name>yarn.nodemanager.container-executor.class</name>
        <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
    </property>
    <property>
        <name>yarn.nodemanager.linux-container-executor.group</name>
        <value>hadoop</value>
    </property>
</configuration>
mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.keytab</name>
        <value>/etc/hdfs/conf/hdfs.keytab</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.principal</name>
        <value>mapred/_HOST@CHAOS.ORDER.COM</value>
    </property>
    <property>
        <name>mapreduce.tasktracker.http.threads</name>
        <value>400</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.spnego-principal</name>
        <value>HTTP/_HOST@CHAOS.ORDER.COM</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.spnego-keytab-file</name>
        <value>/etc/hdfs/conf/hdfs.keytab</value>
    </property>
    <property>
        <name>mapred.child.java.opts</name>
        <value>-Xmx2048m</value>
    </property>
</configuration>
container-executor.cfg
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=zookeeper
min.user.id=1000
allowed.system.users=hbase
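
To make the effect of these keys concrete, here is a minimal sketch of how the user checks are commonly understood to behave: a banned user is always rejected, and a user whose UID is below min.user.id is rejected unless listed in allowed.system.users. The function names (in_list, may_run_containers) are hypothetical, and the real container-executor binary may apply additional checks not modeled here.

```shell
#!/usr/bin/env bash
# Values copied from the container-executor.cfg listing above.
MIN_USER_ID=1000
BANNED_USERS="zookeeper"
ALLOWED_SYSTEM_USERS="hbase"

# in_list NAME LIST: succeeds if NAME appears in the comma-separated LIST.
in_list() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# may_run_containers NAME UID (hypothetical helper, not part of Hadoop)
# Succeeds if the user would pass the three cfg checks sketched here.
may_run_containers() {
  local name="$1" uid="$2"
  # banned.users: rejected regardless of UID.
  if in_list "$name" "$BANNED_USERS"; then
    return 1
  fi
  # min.user.id: low UIDs are rejected unless whitelisted as system users.
  if [ "$uid" -lt "$MIN_USER_ID" ] && ! in_list "$name" "$ALLOWED_SYSTEM_USERS"; then
    return 1
  fi
  return 0
}
```

Under these assumptions, a regular account with UID 1001 passes, hbase passes even with a low UID, and zookeeper is rejected whatever its UID.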

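With the files configured as above, the YARN components can also be used securely. In exchange, ACL management becomes somewhat harder: to let a new user run MapReduce jobs, you must create the account with useradd on every Hadoop Linux server, and create a directory named after the user, owned by that user, under the User directory in HDFS.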
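
The onboarding steps can be sketched as a small dry-run script that only prints the commands, so they can be reviewed before running them by hand. Several things here are assumptions rather than part of the original setup: the host list (taken from the three hosts that appear in the configuration), the use of ssh and sudo, the /user path for the HDFS User directory, and the function name print_onboarding_cmds. The hdfs commands would need to be run by an HDFS superuser holding a valid Kerberos ticket.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the onboarding commands instead of running them.

# Assumption: these are all the Hadoop hosts; adjust to your cluster.
HADOOP_HOSTS="hadoop1.mysite.com hadoop2.mysite.com hadoop3.mysite.com"
# Assumption: per-user HDFS directories live under /user.
HDFS_USER_ROOT="/user"

# print_onboarding_cmds NAME (hypothetical helper)
print_onboarding_cmds() {
  local user="$1" host
  for host in $HADOOP_HOSTS; do
    # Local Linux account on every node; containers run as this user.
    echo "ssh ${host} sudo useradd ${user}"
  done
  # HDFS directory named after the user, owned by the user.
  echo "hdfs dfs -mkdir -p ${HDFS_USER_ROOT}/${user}"
  echo "hdfs dfs -chown ${user} ${HDFS_USER_ROOT}/${user}"
}

print_onboarding_cmds alice
```

Note that with min.user.id=1000 in container-executor.cfg, the account created by useradd needs a UID of at least 1000 to run containers.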