giraph-user mailing list archives

From: Claudio Martella <claudio.marte...@gmail.com>
Subject: Re: FileNotFoundException: File _bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
Date: Wed, 04 Sep 2013 17:21:37 GMT
Giraph ships with ZooKeeper 3.3.3. Unless you point it at an existing
ZooKeeper through the giraph.zkServerList parameter, Giraph starts its own
instance with its own configuration, listening on port 22181.
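
A minimal sketch of how that choice might look when assembling the -D options
for a job. The function name giraph_zk_opts is hypothetical. The key for an
existing ensemble is written as giraph.zkList, which is how the Giraph options
reference appears to spell it, whereas this message calls it
giraph.zkServerList; treat the key as an assumption and check it against your
Giraph version.

```shell
# Print the -D option that selects which ZooKeeper a Giraph job uses.
# Usage: giraph_zk_opts [host:port[,host:port...]]
giraph_zk_opts() {
  # ASSUMPTION: "giraph.zkList" is the key for an existing ensemble;
  # the message above names it giraph.zkServerList.
  zk_list_key="giraph.zkList"
  if [ -n "${1:-}" ]; then
    # Reuse an existing ZooKeeper, e.g. the one HBase runs on port 2181.
    printf '%s\n' "-D${zk_list_key}=$1"
  else
    # Let Giraph start its own ZooKeeper; 22181 is the stated default,
    # so this only makes the default explicit.
    printf '%s\n' "-Dgiraph.zkServerPort=22181"
  fi
}
```

The printed option would go right after org.apache.giraph.GiraphRunner on the
hadoop jar command line, in the position where the command quoted below passes
-Dgiraph.zkManagerDirectory.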

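Whether a separately installed ZooKeeper can collide with the one Giraph
starts comes down to comparing two port numbers: clientPort in zoo.cfg and
Giraph's own port (22181, as stated above). A rough sketch of that check
follows; the function names zk_client_port and check_zk_port_clash are
hypothetical, and it only compares configured values, it does not probe
running processes.

```shell
# Print the clientPort configured in a ZooKeeper config file.
# Comment lines are skipped; the last assignment in the file is printed.
zk_client_port() {
  sed -n 's/^[[:space:]]*clientPort[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Compare an external ZooKeeper's port with the port Giraph would use.
# $1: path to zoo.cfg
# $2: Giraph's ZooKeeper port (optional, defaults to 22181)
check_zk_port_clash() {
  external=$(zk_client_port "$1")
  giraph_port=${2:-22181}
  if [ "$external" = "$giraph_port" ]; then
    echo "clash: both use port $external"
    return 1
  fi
  echo "ok: external=$external giraph=$giraph_port"
}
```

Run against the zoo.cfg quoted below, check_zk_port_clash
/etc/zookeeper/conf/zoo.cfg would compare 2181 with 22181 and report no clash.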

On Wed, Sep 4, 2013 at 7:11 PM, Ken Williams <zoo9000@hotmail.com> wrote:

> Hmmmmmmmm. Interesting.
>
> Is Giraph (1.0.0) supposed to come with its own version of ZooKeeper?
>
> The only version of ZooKeeper I have installed is the one that came with
> HBase, and the config file it uses, /etc/zookeeper/conf/zoo.cfg, specifies
> clientPort=2181. This is the only zoo.cfg file on my machine.
>
>
> [root@localhost]# cat /etc/zookeeper/conf/zoo.cfg
> ....
> maxClientCnxns=50
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> dataDir=/var/lib/zookeeper
> # the port at which the clients will connect
> clientPort=2181
> server.1=localhost:2888:3888
> [root@localhost Downloads]#
>
>
>
> ------------------------------
> From: claudio.martella@gmail.com
> Date: Wed, 4 Sep 2013 12:13:50 +0200
>
> Subject: Re: FileNotFoundException: File
> _bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
> To: user@giraph.apache.org
>
> That should in principle not be the case, as the zookeeper started by
> Giraph listens on a different port than the default. See
> parameter giraph.zkServerPort, which defaults to 22181.
>
>
> On Wed, Sep 4, 2013 at 11:40 AM, Ken Williams <zoo9000@hotmail.com> wrote:
>
> Hi Claudio,
>
> I think I have fixed the problem.
>
> HBase runs with its own copy of ZooKeeper, which listens on port 2181.
> So when I tried to start ZooKeeper for Giraph, it also tried to listen on
> port 2181, found the port already in use, and terminated, which is why
> Giraph failed. If I stop the HBase daemons (including its copy of
> ZooKeeper), then Giraph runs fine.
>
> Essentially, running ZooKeeper for Giraph conflicts with a ZooKeeper that
> is already running for HBase.
>
>    I will try the patch and get back to you.
>
>    Thanks for all your help,
>
> Ken
>
> ------------------------------
> From: claudio.martella@gmail.com
> Date: Tue, 3 Sep 2013 17:01:01 +0200
>
> Subject: Re: FileNotFoundException: File
> _bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
> To: user@giraph.apache.org
>
> Try with the attached patch applied to trunk, without the previously
> mentioned -D giraph.zkManagerDirectory option.
>
>
> On Tue, Sep 3, 2013 at 3:25 PM, Ken Williams <zoo9000@hotmail.com> wrote:
>
> Hi Claudio,
>
> I tried this but it made no difference. The map tasks still fail, there is
> still no output, and the log files still show an exception:
> FileNotFoundException: File /tmp/giraph/_zkServer does not exist.
>
> [root@localhost giraph]# hadoop jar /usr/local/giraph/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-2.0.0-alpha-jar-with-dependencies.jar \
>     org.apache.giraph.GiraphRunner \
>     -Dgiraph.zkManagerDirectory='/tmp/giraph/' \
>     org.apache.giraph.examples.SimpleShortestPathsVertex \
>     -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
>     -vip /user/root/input/tiny_graph.txt \
>     -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
>     -op /user/root/output/shortestpaths \
>     -w 1
> 13/09/03 14:19:58 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
> 13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format vertex index type is not known
> 13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format vertex value type is not known
> 13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format edge value type is not known
> 13/09/03 14:19:58 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
> 13/09/03 14:19:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 13/09/03 14:20:01 INFO mapred.JobClient: Running job: job_201308291126_0039
> 13/09/03 14:20:02 INFO mapred.JobClient:  map 0% reduce 0%
> 13/09/03 14:20:12 INFO mapred.JobClient: Job complete: job_201308291126_0039
> 13/09/03 14:20:12 INFO mapred.JobClient: Counters: 6
> 13/09/03 14:20:12 INFO mapred.JobClient:   Job Counters
> 13/09/03 14:20:12 INFO mapred.JobClient:     Failed map tasks=1
> 13/09/03 14:20:12 INFO mapred.JobClient:     Launched map tasks=2
> 13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=16327
> 13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
> 13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> 13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> [root@localhost giraph]#
>
>
> When I try to run Zookeeper it still gives me an 'Address already in use'
> exception.
>
> [root@localhost giraph]# /usr/lib/zookeeper/bin/zkServer.sh start-foreground
> JMX enabled by default
> Using config: /usr/lib/zookeeper/bin/../conf/zoo.cfg
> 2013-09-03 14:23:37,882 [myid:] - INFO  [main:QuorumPeerConfig@101] - Reading configuration from: /usr/lib/zookeeper/bin/../conf/zoo.cfg
> 2013-09-03 14:23:37,888 [myid:] - ERROR [main:QuorumPeerConfig@283] - Invalid configuration, only one server specified (ignoring)
> 2013-09-03 14:23:37,889 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
> 2013-09-03 14:23:37,889 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
> 2013-09-03 14:23:37,890 [myid:] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
> 2013-09-03 14:23:37,890 [myid:] - WARN  [main:QuorumPeerMain@118] - Either no config or no quorum defined in config, running  in standalone mode
> 2013-09-03 14:23:37,904 [myid:] - INFO  [main:QuorumPeerConfig@101] - Reading configuration from: /usr/lib/zookeeper/bin/../conf/zoo.cfg
> 2013-09-03 14:23:37,905 [myid:] - ERROR [main:QuorumPeerConfig@283] - Invalid configuration, only one server specified (ignoring)
> 2013-09-03 14:23:37,905 [myid:] - INFO  [main:ZooKeeperServerMain@100] - Starting server
> 2013-09-03 14:23:37,920 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.3-cdh4.1.1--1, built on 10/16/2012 17:34 GMT
> 2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=localhost.localdomain
> 2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.6.0_31
> 2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Sun Microsystems Inc.
> 2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/java/jdk1.6.0_31/jre
> 2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/lib/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.3-cdh4.1.1.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/usr/lib/zookeeper/bin/../conf:
> 2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/usr/java/jdk1.6.0_31/jre/lib/i386/client:/usr/java/jdk1.6.0_31/jre/lib/i386:/usr/java/jdk1.6.0_31/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
> 2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
> 2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
> 2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
> 2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=i386
> 2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=2.6.32-279.14.1.el6.i686
> 2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=root
> 2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/root
> 2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/usr/local/giraph-1.0.0
> 2013-09-03 14:23:37,934 [myid:] - INFO  [main:ZooKeeperServer@726] - tickTime set to 2000
> 2013-09-03 14:23:37,934 [myid:] - INFO  [main:ZooKeeperServer@735] - minSessionTimeout set to -1
> 2013-09-03 14:23:37,935 [myid:] - INFO  [main:ZooKeeperServer@744] - maxSessionTimeout set to -1
> 2013-09-03 14:23:37,970 [myid:] - INFO  [main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
> 2013-09-03 14:23:37,972 [myid:] - ERROR [main:ZooKeeperServerMain@68] - Unexpected exception, exiting abnormally
> java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
>     at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:100)
>     at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:115)
>     at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)
>     at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:121)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
> [root@localhost giraph]#
>
>
>       Thank you for any help,
>
> Ken
>
>
>
>
> ------------------------------
> From: claudio.martella@gmail.com
> Date: Tue, 3 Sep 2013 12:43:59 +0200
>
> Subject: Re: FileNotFoundException: File
> _bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
> To: user@giraph.apache.org
>
>
> Can you try defining the zookeeper manager directory from the command
> line? Like this: -D giraph.zkManagerDirectory=/path/in/hdfs/foobar
>
> You'll have to delete this directory by hand before each job. This is just
> to see if it solves the problem; then I would know how to fix it.
>
>
> On Tue, Sep 3, 2013 at 12:32 PM, Ken Williams <zoo9000@hotmail.com> wrote:
>
> Hi Pradeep,
>
> Yes, the zookeeper server is definitely running; I can connect to it with
> the command-line client:
>
> [root@localhost giraph]# zkCli.sh  -server 127.0.0.1:2181
> Connecting to 127.0.0.1:2181
> 2013-09-03 11:15:45,987 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.3-cdh4.1.1--1, built on 10/16/2012 17:34 GMT
> 2013-09-03 11:15:45,990 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=localhost.localdomain
> 2013-09-03 11:15:45,990 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.6.0_31
> ......
> WatchedEvent state:SyncConnected type:None path:null
> [zk: 127.0.0.1:2181(CONNECTED) 0] ls /
> [hbase, zookeeper]
> [zk: 127.0.0.1:2181(CONNECTED) 1]
>
>
> However, I am a bit confused.
> If I look in the zookeeper log-file I see this port 2181 'Address already
> in use' error,
>
> 2013-09-03 10:52:24,412 [myid:] - INFO  [main:ZooKeeperServer@735] - minSessionTimeout set to -1
> 2013-09-03 10:52:24,413 [myid:] - INFO  [main:ZooKeeperServer@744] - maxSessionTimeout set to -1
> 2013-09-03 10:52:24,436 [myid:] - INFO  [main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
> 2013-09-03 10:52:24,447 [myid:] - ERROR [main:ZooKeeperServerMain@68] - Unexpected exception, exiting abnormally
> java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
>     at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:100)
>     at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:115)
>     at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)
>
> The process listening on port 2181 is 2892, which turns out to be HBase.
>
> [root@localhost giraph]# fuser 2181/tcp
> 2181/tcp:             2892
> [root@localhost giraph]# ps aux | grep 2892
> hbase     2892  0.1  3.2 719592 119624 ?       Sl   Aug29   7:35
> /usr/java/jdk1.6.0_31/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx500m
> -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/var/log/hbase
> -Dhbase.log.file=hbase-hbase-master-localhost.localdomain.log
> -Dhbase.home.dir=/usr/lib/hbase/bin/..
> ......
>
> So I am not sure what my zookeeper client is connecting to.
> It seems to be connecting to a zookeeper server, but when I run 'ps' I
> cannot see a zookeeper server running.
> Here is my zoo.cfg file:
>
> maxClientCnxns=50
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> dataDir=/var/lib/zookeeper
> # the port at which the clients will connect
> clientPort=2181
> server.1=localhost:2888:3888
>
>     Thanks for any help,
>
> Ken
>
>
>
> --
>    Claudio Martella
>    claudio.martella@gmail.com
>



-- 
   Claudio Martella
   claudio.martella@gmail.com
