giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: PageRank OOM Exception
Date Fri, 18 Nov 2011 09:16:23 GMT
Thanks, we'll fix that.

Meanwhile use this patch to get trunk to build.

On Fri, Nov 18, 2011 at 9:28 AM, Yingyi Bu <buyingyi@gmail.com> wrote:
> Could anyone fix the trunk?  Two files are missing headers, so the build fails...
>
> Attached is the target/rat.txt from the failed build.
> I fixed them locally anyway...
> Thanks!
> Yingyi
> On Thu, Nov 17, 2011 at 11:53 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
>>
>> Avery,
>>      Thanks a lot for the help!!
>>      I'll sync the trunk and try with your suggested settings.
>> Best regards,
>> Yingyi
>> On Thu, Nov 17, 2011 at 11:47 PM, Avery Ching <aching@apache.org> wrote:
>>>
>>> Yingyi,
>>>
>>> Looks like you lost the connection to ZooKeeper.  You might want to sync
>>> with trunk.  GIRAPH-11 changed the settings to allow longer ZooKeeper
>>> timeouts.  Also, ordering of the vertices is no longer required and the load
>>> balancing should be better.  Looks like you might want to try to add some
>>> better GC options to reduce stop-the-world pauses (likely causing the
>>> timeouts).
>>>
>>> Here are some example settings you can try fiddling with as well; just
>>> add them to the other JVM settings you tried out earlier.  Let us know how
>>> it goes.
>>>
>>>  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:ParallelGCThreads=8
>>> -XX:+CMSIncrementalPacing -XX:+PrintGCDetails
>>> -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly
>>> -XX:+PrintTenuringDistribution
>>>
>>> Avery
>>>
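To illustrate the suggestion above, here is a minimal Python sketch that merges the quoted GC flags into an existing JVM option string. It assumes the task JVM options are passed through Hadoop's `mapred.child.java.opts` property, and it uses `-Xmx1g` as a placeholder heap size; the thread does not say which heap setting or mechanism was actually used, so both are assumptions. The helper name `child_java_opts` is hypothetical.

```python
# GC flags quoted from Avery's message above.
GC_FLAGS = [
    "-XX:+UseConcMarkSweepGC",
    "-XX:+CMSIncrementalMode",
    "-XX:ParallelGCThreads=8",
    "-XX:+CMSIncrementalPacing",
    "-XX:+PrintGCDetails",
    "-XX:CMSInitiatingOccupancyFraction=60",
    "-XX:+UseCMSInitiatingOccupancyOnly",
    "-XX:+PrintTenuringDistribution",
]


def child_java_opts(existing_opts, extra_flags=GC_FLAGS):
    """Append extra JVM flags to an option string, skipping exact duplicates."""
    opts = existing_opts.split()
    for flag in extra_flags:
        if flag not in opts:
            opts.append(flag)
    return " ".join(opts)


if __name__ == "__main__":
    # "-Xmx1g" is a placeholder for whatever JVM settings were tried earlier.
    merged = child_java_opts("-Xmx1g")
    # Assumed way of handing the options to the job (a generic -D override).
    print(f'-Dmapred.child.java.opts="{merged}"')
```

The duplicate check only compares whole flags, so two different values for the same option (for example two `-XX:ParallelGCThreads=` settings) would both be kept and need to be reconciled by hand.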
>>> On 11/17/11 11:24 PM, Yingyi Bu wrote:
>>>
>>> Hi Avery,
>>>     Thanks a lot for your help!!
>>>     I used your settings and got rid of the OOM!  However, after the job
>>> had run for 10 minutes, one worker failed, and a while later all mappers
>>> failed.  Attached below are mapper logs from two nodes.  It seems they
>>> cannot connect to ZooKeeper.  The workers run well until the highlighted
>>> exception.  Am I missing something in the job settings?
>>>     Thanks, again!!
>>> Best regards,
>>> Yingyi
>>>
>>>
>>> Mapper log on Node-1:
>>>  2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> getZooKeeperServerList: For task 0, got file 'zkServerList_asterix-010 0 '
>>> (polling period is 3000)
>>> 2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> getZooKeeperServerList: Found [asterix-010, 0] 2 hosts in filename
>>> 'zkServerList_asterix-010 0 '
>>> 2011-11-17 22:56:39,046 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Trying to delete old directory
>>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> generateZooKeeperConfigFile: Creating file
>>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg
>>> in
>>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>>> with base port 22181
>>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
>>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> generateZooKeeperConfigFile: Delete of zoo.cfg = false
>>> 2011-11-17 22:56:39,050 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Attempting to start ZooKeeper server with command
>>> [/mnt/data/sda/space/yingyi/tools/java/jre/bin/java, -Xmx256m,
>>> -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC,
>>> -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp,
>>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/job.jar,
>>> org.apache.zookeeper.server.quorum.QuorumPeerMain,
>>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg]
>>> in directory
>>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>>> 2011-11-17 22:56:39,056 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to
>>> asterix-010:22181 with poll msecs = 3000
>>> 2011-11-17 22:56:39,058 WARN org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Got ConnectException
>>> java.net.ConnectException: Connection refused
>>>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>         at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>>>         at
>>> java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>>>         at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>>>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>>         at java.net.Socket.connect(Socket.java:529)
>>>         at
>>> org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:612)
>>>         at
>>> org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:401)
>>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>>         at
>>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> 2011-11-17 22:56:42,062 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to
>>> asterix-010:22181 with poll msecs = 3000
>>> 2011-11-17 22:56:42,063 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Connected!
>>> 2011-11-17 22:56:42,064 INFO org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Creating my filestamp
>>> _bsp/_defaultZkManagerDir/job_201111172247_0003/_zkServer/asterix-010 0
>>> 2011-11-17 22:56:42,070 INFO org.apache.giraph.graph.GraphMapper: setup:
>>> Starting up BspServiceMaster (master thread)...
>>> 2011-11-17 22:56:42,080 INFO org.apache.giraph.graph.BspService:
>>> BspService: Connecting to ZooKeeper with job job_201111172247_0003, 0 on
>>> asterix-010:22181
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:host.name=asterix-010
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.version=1.6.0_21
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.vendor=Sun Microsystems Inc.
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre
>>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-
>>> 0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexe
>>> c/../share/hadoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/
>>> hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/
>>> jersey-server-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../
>>> share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work/tmp
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.compiler=<NA>
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:os.name=Linux
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:os.arch=amd64
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:os.version=2.6.18-194.26.1.el5
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:user.name=yingyib
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:user.home=/home/yingyib
>>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
>>> 2011-11-17 22:56:42,088 INFO org.apache.zookeeper.ZooKeeper: Initiating
>>> client connection, connectString=asterix-010:22181 sessionTimeout=60000
>>> watcher=org.apache.giraph.graph.BspServiceMaster@13a78071
>>> 2011-11-17 22:56:42,098 INFO org.apache.zookeeper.ClientCnxn: Opening
>>> socket connection to server asterix-010/10.0.0.10:22181
>>> 2011-11-17 22:56:42,099 INFO org.apache.zookeeper.ClientCnxn: Socket
>>> connection established to asterix-010/10.0.0.10:22181, initiating session
>>> 2011-11-17 22:56:42,123 INFO org.apache.zookeeper.ClientCnxn: Session
>>> establishment complete on server asterix-010/10.0.0.10:22181, sessionid =
>>> 0x133b57675b60000, negotiated timeout = 60000
>>> 2011-11-17 22:56:42,125 INFO org.apache.giraph.graph.BspService: process:
>>> Asynchronous connection complete.
>>> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper: map: No
>>> need to do anything when not a worker
>>> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper:
>>> cleanup: Starting for MASTER_ZOOKEEPER_ONLY
>>> 2011-11-17 22:56:42,197 INFO
>>> org.apache.giraph.graph.BspServiceMaster: becomeMaster: First child is
>>> '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000'
>>> and my bid is
>>> '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000'
>>> 2011-11-17 22:56:42,197 INFO org.apache.giraph.graph.BspServiceMaster:
>>> becomeMaster: I am now the master!
>>> 2011-11-17 22:56:42,208 INFO org.apache.giraph.graph.BspService: process:
>>> applicationAttemptChanged signaled
>>> 2011-11-17 22:56:42,216 WARN org.apache.giraph.graph.BspService: process:
>>> Unknown and unprocessed event
>>> (path=/_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir,
>>> type=NodeChildrenChanged, state=SyncConnected)
>>> 2011-11-17 22:56:45,130 INFO
>>> org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input paths to
>>> process : 10
>>> 2011-11-17 22:56:45,227 INFO org.apache.giraph.graph.BspServiceMaster:
>>> coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
>>> 2011-11-17 23:01:20,045 ERROR org.apache.zookeeper.ClientCnxn: Error
>>> while calling watcher
>>> java.lang.RuntimeException:
>>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
>>> NoNode for
>>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>>>         at
>>> org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:885)
>>>         at
>>> org.apache.giraph.graph.BspServiceMaster.checkHealthyWorkerFailure(BspServiceMaster.java:1946)
>>>         at
>>> org.apache.giraph.graph.BspServiceMaster.processEvent(BspServiceMaster.java:1976)
>>>         at
>>> org.apache.giraph.graph.BspService.process(BspService.java:1095)
>>>         at
>>> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
>>> KeeperErrorCode = NoNode for
>>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>>>         at
>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>>>         at
>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
>>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:950)
>>>         at
>>> org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:858)
>>>         ... 4 more
>>> 2011-11-17 23:01:22,009 INFO org.apache.giraph.graph.BspServiceMaster:
>>> coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
>>> 2011-11-17 23:11:27,357 WARN org.apache.giraph.zk.ZooKeeperManager:
>>> onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper
>>> process.
>>>
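One rough way to look for stalls in a log like the Node-1 log above is to scan for long silent stretches between consecutive timestamps. The Python sketch below does that, assuming the log4j-style `YYYY-MM-DD HH:MM:SS,mmm` timestamps seen in this log and using the 60-second negotiated ZooKeeper session timeout as the default threshold. The function name `find_gaps` is hypothetical, and a quiet stretch is only a hint: it may be a stop-the-world pause, or simply a process waiting on others.

```python
import re
from datetime import datetime

# Matches log4j-style timestamps such as "2011-11-17 22:56:45,227".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def find_gaps(log_text, threshold_s=60.0):
    """Return (previous, current, seconds) for consecutive timestamps that are
    more than threshold_s apart. Timestamps split across wrapped lines are
    not matched by the pattern and are therefore skipped."""
    stamps = [
        datetime.strptime(raw, "%Y-%m-%d %H:%M:%S,%f")
        for raw in TIMESTAMP.findall(log_text)
    ]
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta > threshold_s:
            gaps.append((prev, cur, delta))
    return gaps


# Three abbreviated lines taken from the Node-1 log above.
SAMPLE = """\
2011-11-17 22:56:45,227 INFO BspServiceMaster: coordinateSuperstep
2011-11-17 23:01:20,045 ERROR ClientCnxn: Error while calling watcher
2011-11-17 23:01:22,009 INFO BspServiceMaster: coordinateSuperstep
"""

if __name__ == "__main__":
    for prev, cur, delta in find_gaps(SAMPLE):
        print(f"silent for {delta:.3f}s between {prev} and {cur}")
```

Running the same scan over the worker logs and the GC output from `-XX:+PrintGCDetails` would give a better picture than the master log alone.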
>>> Mapper log on Node-2:
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:host.name=asterix-001
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.version=1.6.0_21
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.vendor=Sun Microsystems Inc.
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-
>>> 0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexe
>>> c/../share/hadoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/
>>> hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/
>>> jersey-server-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../
>>> share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work/tmp
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:java.compiler=<NA>
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:os.name=Linux
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:os.arch=amd64
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:os.version=2.6.18-194.26.1.el5
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:user.name=yingyib
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:user.home=/home/yingyib
>>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>>> environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work
>>> 2011-11-17 22:56:44,159 INFO org.apache.zookeeper.ZooKeeper: Initiating
>>> client connection, connectString=asterix-010:22181 sessionTimeout=60000
>>> watcher=org.apache.giraph.graph.BspServiceWorker@60ded0f0
>>> 2011-11-17 22:56:44,171 INFO org.apache.zookeeper.ClientCnxn: Opening
>>> socket connection to server asterix-010/10.0.0.10:22181
>>> 2011-11-17 22:56:44,173 INFO org.apache.zookeeper.ClientCnxn: Socket
>>> connection established to asterix-010/10.0.0.10:22181, initiating session
>>> 2011-11-17 22:56:44,178 INFO org.apache.zookeeper.ClientCnxn: Session
>>> establishment complete on server asterix-010/10.0.0.10:22181, sessionid =
>>> 0x133b57675b60007, negotiated timeout = 60000
>>> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.BspService: process:
>>> Asynchronous connection complete.
>>> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.GraphMapper: setup:
>>> Registering health of this worker...
>>> 2011-11-17 22:56:44,191 INFO org.apache.giraph.graph.BspService:
>>> getJobState: Job state already exists
>>> (/_hadoopBsp/job_201111172247_0003/_masterJobState)
>>> 2011-11-17 22:56:44,195 INFO org.apache.giraph.graph.BspService:
>>> getApplicationAttempt: Node
>>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
>>> 2011-11-17 22:56:44,198 INFO org.apache.giraph.graph.BspService:
>>> getApplicationAttempt: Node
>>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
>>> 2011-11-17 22:56:44,204 INFO org.apache.giraph.graph.BspServiceWorker:
>>> registerHealth: Created my health node for attempt=0, superstep=-1 with
>>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/asterix-001_8
>>> and hostnamePort = ["asterix-001",30008]
>>> 2011-11-17 22:56:45,177 INFO org.apache.giraph.graph.BspService: process:
>>> inputSplitsReadyChanged (input splits ready)
>>> 2011-11-17 22:56:45,192 WARN org.apache.giraph.graph.BspService: process:
>>> Unknown and unprocessed event
>>> (path=/_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2/_inputSplitReserved,
>>> type=NodeCreated, state=SyncConnected)
>>> 2011-11-17 22:56:45,192 INFO org.apache.giraph.graph.BspServiceWorker:
>>> reserveInputSplit: Reserved input split path
>>> /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2
>>> 2011-11-17 22:56:45,196 INFO org.apache.giraph.graph.BspServiceWorker:
>>> loadVertices: Reserved /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2
>>> from ZooKeeper and got input split
>>> 'hdfs://asterix-master:31888/webmap-tiny-sorted/part-00002:0+834285620'
>>> 2011-11-17 23:01:20,608 INFO org.apache.zookeeper.ClientCnxn: Client
>>> session timed out, have not heard from server in 59117ms for sessionid
>>> 0x133b57675b60007, closing socket connection and attempting reconnect
>>> 2011-11-17 23:02:06,630 ERROR org.apache.zookeeper.ClientCnxn: Error
>>> while calling watcher
>>> java.lang.RuntimeException: process: Disconnected from ZooKeeper, cannot
>>> recover.
>>>         at
>>> org.apache.giraph.graph.BspService.process(BspService.java:990)
>>>         at
>>> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
>>> 2011-11-17 23:02:35,793 INFO org.apache.zookeeper.ClientCnxn: Opening
>>> socket connection to server asterix-010/10.0.0.10:22181
>>> 2011-11-17 23:02:35,794 INFO org.apache.zookeeper.ClientCnxn: Socket
>>> connection established to asterix-010/10.0.0.10:22181, initiating session
>>> 2011-11-17 23:02:35,806 INFO org.apache.zookeeper.ClientCnxn: Unable to
>>> reconnect to ZooKeeper service, session 0x133b57675b60007 has expired,
>>> closing socket connection
>>> On Thu, Nov 17, 2011 at 9:46 PM, Avery Ching <aching@apache.org> wrote:
>>>>
>>>> Hi Yingyi,
>>>>
>>>> Here are some ideas you might want to try:
>>>>
>>>> 1)  Limit the thread stack size.
>>>>
>>>> 2)  You can set the heap available to the mapper JVM.
>>>>
>>>> For example, here's a setting to get 10 GB of heap and use a smaller
>>>> stack (64k) for the threads.
>>>>
>>>> -Dmapred.child.java.opts="-Xms10g -Xmx10g -Xss64k"
>>>>
>>>> Also, you might want to try using EdgeListVertex instead of Vertex
>>>> (i.e. GiraphJob.setVertexClass(EdgeListVertex.class)); it is quite a bit
>>>> smaller.
>>>>
>>>> Let us know if that helps you.  You should also check to see if your
>>>> Hadoop installation is using a 32-bit or 64-bit JVM.  If it's 32-bit you
>>>> will be limited in how much heap you can use.
>>>>
>>>> Avery
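
A minimal sketch of the 32-bit vs. 64-bit check suggested above, assuming a HotSpot-style JVM. The class name JvmHeapCheck and its helper methods are hypothetical, and the "sun.arch.data.model" property is vendor-specific and may be missing on other JVMs, in which case the sketch just reports "unknown".

```java
public class JvmHeapCheck {

    /** Maps a data-model property value ("32" or "64") to a label; anything else is unknown. */
    static String describeDataModel(String dataModel) {
        if ("32".equals(dataModel)) {
            return "32-bit";
        }
        if ("64".equals(dataModel)) {
            return "64-bit";
        }
        return "unknown";
    }

    /** Converts a byte count to whole mebibytes (rounded down). */
    static long bytesToMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Assumption: "sun.arch.data.model" is set by HotSpot-based JVMs only.
        String dataModel = System.getProperty("sun.arch.data.model");
        System.out.println("JVM data model: " + describeDataModel(dataModel));

        // os.arch is a standard property (the worker log in this thread shows amd64).
        System.out.println("os.arch: " + System.getProperty("os.arch"));

        // Upper bound on the heap this JVM will try to use, as influenced by -Xmx.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap (MiB): " + bytesToMiB(maxHeap));
    }
}
```

Running it with the same java binary and the same -Xmx value that the map tasks get should show whether the requested heap size is actually being honored.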
>>>>
>>>> On 11/17/11 9:38 PM, Yingyi Bu wrote:
>>>>
>>>> Hi,
>>>>     I'm running a Giraph PageRank job.  I tried with 8GB of input text data
>>>> over 10 nodes (each has 4 cores, 4 disks, and 12GB of physical memory), that
>>>> is, 800MB of input data per machine.  However, the Giraph job fails because of
>>>> high GC costs and an Out-of-Memory exception.
>>>>     Do I need to set anything special in the Hadoop configuration, for
>>>> example, the maximum heap size for the map task JVM?
>>>>     Thanks!!
>>>> Best regards,
>>>> Yingyi
>>>
>>>
>>
>
>



-- 
   Claudio Martella
   claudio.martella@gmail.com
