hadoop-yarn-dev mailing list archives

From Xiao Chen <x...@cloudera.com>
Subject Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0
Date Wed, 25 Jan 2017 18:53:59 GMT
Thanks to Andrew and the community for working out the alpha2 RC!

+1 (non-binding)

   - Built the source tarball
   - Tested on a pseudo-distributed cluster: basic HDFS operations and the
   sample pi job over an HDFS encryption zone work.
   - Sanity checked NN and KMS webui
   - Sanity checked NN/DN/KMS logs.


-Xiao

On Wed, Jan 25, 2017 at 9:41 AM, Zhihai Xu <zhihaixu2012@gmail.com> wrote:

> Thanks Andrew for creating the Hadoop 3.0.0-alpha2 RC0 release.
> +1 (non-binding)
>
> --Downloaded source and built from it.
> --Deployed on a pseudo-distributed cluster.
> --Ran sample MR jobs and tested with basic HDFS operations.
> --Did a sanity check for RM and NM UI.
>
> Best,
> zhihai
>
> On Wed, Jan 25, 2017 at 8:07 AM, Kuhu Shukla <kshukla@yahoo-inc.com.invalid> wrote:
>
> > +1 (non-binding)
> > * Built from source
> > * Deployed on a pseudo-distributed cluster (Mac)
> > * Ran wordcount and sleep jobs.
> >
> >
> >     On Wednesday, January 25, 2017 3:21 AM, Marton Elek <melek@hortonworks.com> wrote:
> >
> >
> >  Hi,
> >
> > I also did a quick smoketest with the provided 3.0.0-alpha2 binaries:
> >
> > TL;DR: It works well.
> >
> > Environment:
> >  * 5 hosts, docker based hadoop cluster, every component in a separate
> > container (5 datanode/5 nodemanager/...)
> >  * Components are:
> >   * Hdfs/Yarn cluster (upgraded 2.7.3 to 3.0.0-alpha2 using the binary
> > package for vote)
> >   * Zeppelin 0.6.2/0.7.0-RC2
> >   * Spark 2.0.2/2.1.0
> >   * HBase 1.2.4 + zookeeper
> >   * + additional docker containers for configuration management and
> > monitoring
> > * No HA, no kerberos, no wire encryption
> >
> >  * HDFS cluster upgraded successfully from 2.7.3 (with about 200G data)
> >  * Imported 100G data to HBase successfully
> >  * Started Spark jobs to process 1G of JSON from HDFS (using a
> > spark-master/slave cluster). It worked even with Zeppelin 0.6.2 +
> > Spark 2.0.2 (with the old hadoop client included). Obviously the old
> > version can't use the new Yarn cluster, as the token file format has
> > changed.
> >  * I upgraded my setup to use Zeppelin 0.7.0-RC2/Spark 2.1.0 (distribution
> > without hadoop)/hadoop 3.0.0-alpha2. It also worked well: processed the
> > same json files from HDFS with spark jobs (from zeppelin) using the yarn
> > cluster (master: yarn, deploy-mode: cluster)
> >  * Started spark jobs (with spark submit, master: yarn) to count records
> > from the hbase database: OK
> >  * Started example Mapreduce jobs from the distribution over yarn. It was
> > OK, but only with specific configuration (see below)
> >
> > So my overall impression is that it works very well (at least with my
> > 'smalldata')
> >
> > Some notes (none of them are blocking):
> >
> > 1. To run the example mapreduce jobs I defined HADOOP_MAPRED_HOME at the
> > command line:
> >
> > ./bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha2.jar \
> >   pi \
> >   -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" \
> >   -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" \
> >   10 10
> >
> > And in the yarn-site:
> >
> > yarn.nodemanager.env-whitelist: JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,MAPRED_HOME_DIR
> >
> > I don't know the exact reason for the change, but 2.7.3 was more
> > user-friendly, as the examples could be run without specific configuration.
> >
> > For the same reason I didn't start the hbase mapreduce job with the hbase
> > command line app. (There could be some option for hbase to define
> > MAPRED_HOME_DIR as well, but by default I got a ClassNotFoundException for
> > one of the MR classes.)
> >
> > 2. For the record: the logging and htrace classes are excluded from the
> > shaded hadoop client jar, so I added them manually one by one to spark
> > (spark 2.1.0 distribution without hadoop):
> >
> > RUN wget `cat url` -O spark.tar.gz && tar zxf spark.tar.gz && rm spark.tar.gz && mv spark* spark
> > RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-api-3.0.0-alpha2.jar /opt/spark/jars
> > RUN cp /opt/hadoop/share/hadoop/client/hadoop-client-runtime-3.0.0-alpha2.jar /opt/spark/jars
> > ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar /opt/spark/jars
> > ADD https://repo1.maven.org/maven2/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar /opt/spark/jars
> > ADD https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar /opt/spark/jars/
> > ADD https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar /opt/spark/jars
> >
> > With these jar files spark 2.1.0 works well with the alpha2 version of
> > HDFS and YARN.
> >
> > 3. The message "Upgrade in progress. Not yet finalized." didn't
> > disappear from the namenode webui, but the cluster works well.
> >
> > Most probably I missed a step, but it's a little bit confusing.
> >
> > (I checked the REST call: it is the jmx bean that reports that the upgrade
> > is not yet finalized; the code of the webpage seems to be ok.)
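The JMX check described above can be scripted. This is a minimal sketch, assuming the NameNodeInfo bean exposes an `UpgradeFinalized` attribute and that the NameNode web UI listens on port 9870 (the 3.x default); the host name `namenode` and both function names are placeholders, not part of Hadoop. After a non-rolling upgrade the flag normally stays false until an administrator runs `hdfs dfsadmin -finalizeUpgrade`, which may be the missing step.

```python
import json
from urllib.request import urlopen

# Bean queried by the sketch; assumed to carry the UpgradeFinalized attribute.
JMX_QUERY = "Hadoop:service=NameNode,name=NameNodeInfo"


def upgrade_finalized(jmx_payload):
    """Return the UpgradeFinalized flag from a parsed /jmx response.

    Returns None when no bean in the payload carries the attribute.
    """
    for bean in jmx_payload.get("beans", []):
        if "UpgradeFinalized" in bean:
            return bool(bean["UpgradeFinalized"])
    return None


def fetch_upgrade_finalized(host="namenode", port=9870):
    """Query the NameNode JMX servlet (host and port are assumptions)."""
    url = f"http://{host}:{port}/jmx?qry={JMX_QUERY}"
    with urlopen(url, timeout=10) as resp:
        return upgrade_finalized(json.load(resp))
```

Keeping the parsing separate from the HTTP call makes it possible to check the logic against a saved `/jmx` response without a running cluster.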
> >
> > Regards
> > Marton
> >
> > On Jan 25, 2017, at 8:38 AM, Yongjun Zhang <yjzhangal@apache.org> wrote:
> >
> > Thanks very much, Andrew, for the work here!
> >
> > +1 (binding).
> >
> > - Downloaded both binary and src tarballs
> > - Verified md5 checksum and signature for both
> > - Built from source tarball
> > - Deployed 2 pseudo clusters, one with the released tarball and the other
> >  with what I built from source, and did the following on both:
> >     - Ran basic HDFS operations, snapshots and distcp jobs
> >     - Ran pi job
> >     - Examined HDFS webui, YARN webui.
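The checksum step in the list above can be sketched as follows. The helper names are illustrative, and the caller supplies the file path and the published digest; nothing here reproduces the actual release values. This only compares an MD5 digest and does not replace verifying the GPG signature.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected):
    """Compare a file's MD5 with a published digest.

    Whitespace and letter case in `expected` are ignored, since published
    digests are sometimes printed in upper case with spaces between groups.
    """
    normalized = "".join(expected.split()).lower()
    return md5_hex(path) == normalized
```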
> >
> > Best,
> >
> > --Yongjun
> >
> >
> > On Tue, Jan 24, 2017 at 3:56 PM, Eric Badger <ebadger@yahoo-inc.com.invalid> wrote:
> >
> > +1 (non-binding)
> > - Verified signatures and md5
> > - Built from source
> > - Started single-node cluster on my mac
> > - Ran some sleep jobs
> >
> > Eric
> >
> >   On Tuesday, January 24, 2017 4:32 PM, Yufei Gu <flyrain000@gmail.com> wrote:
> >
> >
> > Hi Andrew,
> >
> > Thanks for working on this.
> >
> > +1 (Non-Binding)
> >
> > 1. Downloaded the binary and verified the md5.
> > 2. Deployed it on a 3-node cluster with 1 ResourceManager and 2
> > NodeManagers.
> > 3. Set YARN to use Fair Scheduler.
> > 4. Ran the MapReduce Pi job.
> > 5. Verified Hadoop version command output is correct.
> >
> > Best,
> >
> > Yufei
> >
> > On Tue, Jan 24, 2017 at 3:02 AM, Marton Elek <melek@hortonworks.com> wrote:
> >
> > minicluster is kind of weird on filesystems that don't support mixed
> > case, like OS X's default HFS+.
> >
> > $ jar tf hadoop-client-minicluster-3.0.0-alpha3-SNAPSHOT.jar | grep -i license
> > LICENSE.txt
> > license/
> > license/LICENSE
> > license/LICENSE.dom-documentation.txt
> > license/LICENSE.dom-software.txt
> > license/LICENSE.sax.txt
> > license/NOTICE
> > license/README.dom.txt
> > license/README.sax.txt
> > LICENSE
> > Grizzly_THIRDPARTYLICENSEREADME.txt
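In the listing above, `LICENSE` (a file) and `license/` (a directory) differ only in letter case, so they cannot both be unpacked on a case-insensitive filesystem. A minimal sketch for spotting such clashes in any jar follows; the function names are illustrative and not part of any Hadoop tooling, and only entries that appear in the archive listing are compared.

```python
import zipfile
from collections import defaultdict


def case_collisions(names):
    """Group archive paths that differ only by letter case.

    A trailing '/' on directory entries is ignored, so a file and a
    directory with the same case-folded name count as a collision.
    """
    groups = defaultdict(set)
    for name in names:
        path = name.rstrip("/")
        if path:
            groups[path.lower()].add(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


def jar_case_collisions(jar_path):
    """Run the same check against the entries of a jar (zip) file."""
    with zipfile.ZipFile(jar_path) as jar:
        return case_collisions(jar.namelist())
```

For the entries shown in the listing, `case_collisions` reports the single pair `LICENSE` / `license`.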
> >
> >
> > I added a patch to https://issues.apache.org/jira/browse/HADOOP-14018 to
> > add the missing META-INF/LICENSE.txt to the shaded files.
> >
> > Question: what should be done with the other LICENSE files in the
> > minicluster? Can we just exclude them (from a legal point of view)?
> >
> > Regards,
> > Marton
> >
