hadoop-hdfs-dev mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDDS-1648) Reduce Ozone docker image bloat
Date Wed, 05 Jun 2019 18:02:00 GMT
Eric Yang created HDDS-1648:
-------------------------------

             Summary: Reduce Ozone docker image bloat
                 Key: HDDS-1648
                 URL: https://issues.apache.org/jira/browse/HDDS-1648
             Project: Hadoop Distributed Data Store
          Issue Type: Sub-task
            Reporter: Eric Yang


The Docker image can be leaner if multiple build steps are grouped together and run by a single shell script, since every RUN instruction in a Dockerfile adds its own layer.
For example, all the install commands for hadoop-runner can be wrapped in one setup shell script:

{code}
#!/bin/bash

rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y sudo python2-pip wget nmap-ncat jq java-11-openjdk
pip install robotframework
wget -O /usr/local/bin/dumb-init https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64
chmod +x /usr/local/bin/dumb-init
mkdir -p /etc/security/keytabs && chmod -R a+wr /etc/security/keytabs 
wget -O /opt/byteman.jar https://repo.maven.apache.org/maven2/org/jboss/byteman/byteman/4.0.4/byteman-4.0.4.jar
chmod o+r /opt/byteman.jar
mkdir -p /opt/profiler && \
    cd /opt/profiler && \
    curl -L https://github.com/jvm-profiling-tools/async-profiler/releases/download/v1.5/async-profiler-1.5-linux-x64.tar.gz | tar xvz
yum install -y krb5-workstation
mkdir -p /etc/hadoop && mkdir -p /var/log/hadoop && chmod 1777 /etc/hadoop && chmod 1777 /var/log/hadoop
{code}

The Dockerfile is then simplified to:
{code}
FROM centos
ADD setup.sh /
RUN /setup.sh
ADD scripts /opt/
ADD scripts/krb5.conf /etc/
WORKDIR /opt/hadoop
ENV HADOOP_LOG_DIR=/var/log/hadoop
ENV HADOOP_CONF_DIR=/etc/hadoop
ENTRYPOINT ["/usr/local/bin/dumb-init", "--", "/opt/starter.sh"]
{code}

This arrangement can drastically improve the rebuild performance of the Docker image.  The
resulting image is 150MB smaller than the current hadoop-runner image on GitHub.  Having fewer
intermediate layers also reduces the number of layer references, which improves space usage.
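
One way to check the size and layer claims locally is sketched below. It is only an illustration, not part of the proposed change: the helper names are made up for this example, and it assumes both images are already present in the local Docker daemon. Note that `docker history` reports one entry per Dockerfile instruction, including metadata-only ones such as ENV, so the count is an approximation of the number of layers.

```shell
#!/bin/bash
# Sketch: print history-entry count and size for one or more local images.
# All function names here are hypothetical helpers for illustration.

# Convert a byte count to whole mebibytes (integer division).
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Number of history entries `docker history` reports for an image.
layer_count() {
    docker history -q "$1" | wc -l
}

# Image size in bytes as reported by `docker image inspect`.
image_bytes() {
    docker image inspect --format '{{.Size}}' "$1"
}

# Print a one-line summary for a single image.
report() {
    local image="$1"
    echo "$image: $(layer_count "$image") history entries, $(bytes_to_mib "$(image_bytes "$image")") MiB"
}

# Only touch Docker when image names are passed on the command line.
if [ "$#" -ge 1 ]; then
    for image in "$@"; do
        report "$image"
    done
fi
```

Saved as, say, image-stats.sh, it could be run with the tag of the existing image and the tag of the rebuilt one as arguments; both tags are whatever names the images carry locally.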

We can also have two scripts, one for installing binaries and another for configuring the
image.  This can reduce the build time even further if the third-party binaries rarely change.
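
A rough sketch of that split follows. It writes a hypothetical build context (install.sh, configure.sh and a matching Dockerfile) into a directory so the layout can be inspected. The file names and the way the commands are divided between the two scripts are assumptions made for this example; the remaining downloads from the setup script (byteman, async-profiler) would be added to install.sh in the same way. Because install.sh is added and run before configure.sh, editing only configure.sh should let Docker reuse the cached install layer.

```shell
#!/bin/bash
# Sketch: write a two-script build context into the given directory.
# install.sh / configure.sh and their contents are illustrative assumptions.
write_split_layout() {
    local dir="$1"
    mkdir -p "$dir"

    # Rarely changing step: third-party packages and binaries.
    cat > "$dir/install.sh" <<'EOF'
#!/bin/bash
set -e
rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y sudo python2-pip wget nmap-ncat jq java-11-openjdk
yum install -y krb5-workstation
pip install robotframework
wget -O /usr/local/bin/dumb-init https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64
chmod +x /usr/local/bin/dumb-init
EOF

    # More frequently changing step: directories and permissions.
    cat > "$dir/configure.sh" <<'EOF'
#!/bin/bash
set -e
mkdir -p /etc/security/keytabs && chmod -R a+wr /etc/security/keytabs
mkdir -p /etc/hadoop /var/log/hadoop
chmod 1777 /etc/hadoop /var/log/hadoop
EOF
    chmod +x "$dir/install.sh" "$dir/configure.sh"

    # Each script gets its own ADD/RUN pair, so each is cached separately.
    cat > "$dir/Dockerfile" <<'EOF'
FROM centos
ADD install.sh /
RUN /install.sh
ADD configure.sh /
RUN /configure.sh
ADD scripts /opt/
ADD scripts/krb5.conf /etc/
WORKDIR /opt/hadoop
ENV HADOOP_LOG_DIR=/var/log/hadoop
ENV HADOOP_CONF_DIR=/etc/hadoop
ENTRYPOINT ["/usr/local/bin/dumb-init", "--", "/opt/starter.sh"]
EOF
}

# Only write files when a target directory is passed on the command line.
if [ "$#" -ge 1 ]; then
    write_split_layout "$1"
fi
```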




