giraph-dev mailing list archives

From "Claudio Martella (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-850) Improve internal zookeeper launching
Date Wed, 19 Feb 2014 12:29:20 GMT


Claudio Martella commented on GIRAPH-850:

Patch looks good! I'm committing this one.

> Improve internal zookeeper launching
> ------------------------------------
>                 Key: GIRAPH-850
>                 URL:
>             Project: Giraph
>          Issue Type: Bug
>          Components: zookeeper
>            Reporter: Alexandre Fonseca
>             Fix For: 1.1.0
>         Attachments: GIRAPH-850-2.patch, GIRAPH-850.patch
> With the most up-to-date trunk, internal ZooKeeper launching only appears to work with Hadoop 1.x.x MR1.
> With Hadoop 2.x.x MR2, trying to run a job without specifying an external ZooKeeper location results in a failed job with the following in the logs:
> {code}
> 2014-02-12 17:30:30,281 INFO [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/lib/jvm/java-1.7.0-openjdk-, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /tmp/hadoop-yarn/staging/b.ajf/.staging/job_1392221733726_0002/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain,
> .ajf/nm-local-dir/usercache/b.ajf/appcache/application_1392221733726_0002/work/_bspZooKeeper/zoo.cfg] in directory /tmp/hadoop-b.ajf/nm-local-dir/usercache/b.ajf/appcache/application_1392221733726_0002/work/_bspZooKeeper
> (...)
> 2014-02-12 17:30:30,285 INFO [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to igraph-02.hi.inet:22181 with poll msecs =
> 2014-02-12 17:30:30,289 WARN [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
> Connection refused
> (...)
> 2014-02-12 17:30:30,413 INFO [org.apache.giraph.zk.ZooKeeperManager$StreamCollector] org.apache.giraph.zk.ZooKeeperManager$StreamCollector: readLines: Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain
> (...)
> {code}
> It is clearly unable to launch ZooKeeper because it can't find the necessary class on the classpath. Looking at the command it uses to launch ZooKeeper, we can see that it specifies a classpath of:
> {code}
> -cp, /tmp/hadoop-yarn/staging/b.ajf/.staging/job_1392221733726_0002/job.jar
> {code}
> which is an HDFS location.
> It seems that with Hadoop 2.x.x, the function Job.getJar() returns an HDFS path to the jar instead of the path to the local copy of the jar in the DistributedCache. Hadoop 1.x.x appears to return a correct path, as I didn't detect any problem there.
> The whole logic of finding the ZooKeeper classpath seems extremely convoluted to me (not to mention broken, as just shown, for both MR2 and YARN). Since the currently running Java process has to have the ZooKeeper classes in its classpath anyway (because some of the classes in Giraph refer to ZooKeeper classes), wouldn't it make more sense to have the child Java process that starts ZooKeeper simply inherit the classpath?
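
The classpath-inheritance idea from the description can be sketched roughly as follows. This is only an illustration of the general technique, not the code in the attached patches: the class name `InheritClasspathSketch` and the helper `buildCommand` are hypothetical, and the sketch assumes the parent JVM's `java.class.path` system property already contains the ZooKeeper jars (which may not hold if classes are loaded through a custom class loader).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: build a child JVM command that reuses the parent's classpath
// instead of deriving one from Job.getJar().
class InheritClasspathSketch {

    /** Returns the command line for a child JVM that inherits this JVM's classpath. */
    static List<String> buildCommand(String mainClass, String configPath) {
        // Use the same java binary that is running the current process.
        String javaBin = System.getProperty("java.home")
            + File.separator + "bin" + File.separator + "java";

        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.add("-Xmx512m");
        command.add("-cp");
        // Assumption: the parent's classpath already includes the ZooKeeper classes.
        command.add(System.getProperty("java.class.path"));
        command.add(mainClass);
        command.add(configPath);
        return command;
    }

    public static void main(String[] args) {
        List<String> command = buildCommand(
            "org.apache.zookeeper.server.quorum.QuorumPeerMain", "zoo.cfg");
        System.out.println(command);
        // To actually launch the child process one could use:
        // new ProcessBuilder(command).redirectErrorStream(true).start();
    }
}
```

Because the classpath comes from the running JVM, every entry is a local path that the JVM has already resolved, so the child process does not depend on what Job.getJar() returns.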

This message was sent by Atlassian JIRA
