incubator-giraph-user mailing list archives

From Yingyi Bu <buyin...@gmail.com>
Subject Re: PageRank OOM Exception
Date Fri, 18 Nov 2011 08:28:50 GMT
Could anyone fix the trunk? Two files are missing license headers, so the build fails.

Attached is the target/rat.txt from the failed build.
I fixed them locally anyway...

Thanks!
Yingyi

On Thu, Nov 17, 2011 at 11:53 PM, Yingyi Bu <buyingyi@gmail.com> wrote:

> Avery,
>
>      Thanks a lot for your help!
>      I'll sync the trunk and try with your suggested settings.
>
> Best regards,
> Yingyi
>
> On Thu, Nov 17, 2011 at 11:47 PM, Avery Ching <aching@apache.org> wrote:
>
>>  Yingyi,
>>
>> Looks like you lost the connection to ZooKeeper.  You might want to sync
>> with trunk.  GIRAPH-11 changed the settings to allow longer ZooKeeper
>> timeouts.  Also, ordering of the vertices is no longer required and the
>> load balancing should be better.  Looks like you might want to try to add
>> some better GC options to reduce stop-the-world pauses (likely causing the
>> timeouts).
>>
>> Here are some example settings you can try fiddling with as well; just add
>> them to the other JVM settings you tried out earlier.  Let us know how it
>> goes.
>>
>>  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:ParallelGCThreads=8
>> -XX:+CMSIncrementalPacing -XX:+PrintGCDetails
>> -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:+PrintTenuringDistribution
>>
>> Avery
>>
>>
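One way to apply flags like the ones Avery lists above is to append them to the task JVM options, which Hadoop 0.20.x reads from the `mapred.child.java.opts` property. The sketch below only assembles that property value as a string. The helper name `build_child_opts` is hypothetical, and `-Xmx2g` is a placeholder assumption, since the thread does not show the JVM settings that were tried earlier.

```python
# GC flags quoted from the message above.
GC_FLAGS = [
    "-XX:+UseConcMarkSweepGC",
    "-XX:+CMSIncrementalMode",
    "-XX:ParallelGCThreads=8",
    "-XX:+CMSIncrementalPacing",
    "-XX:+PrintGCDetails",
    "-XX:CMSInitiatingOccupancyFraction=60",
    "-XX:+UseCMSInitiatingOccupancyOnly",
    "-XX:+PrintTenuringDistribution",
]


def build_child_opts(existing_opts, extra_flags=GC_FLAGS):
    """Append extra JVM flags to an existing option string, skipping duplicates."""
    opts = existing_opts.split()
    for flag in extra_flags:
        if flag not in opts:
            opts.append(flag)
    return " ".join(opts)


if __name__ == "__main__":
    # "-Xmx2g" is a placeholder for whatever heap setting was already in use.
    value = build_child_opts("-Xmx2g")
    print("mapred.child.java.opts=" + value)
```

The resulting value could then be set in the job configuration or in mapred-site.xml. How it is passed on the command line depends on how the job is launched, which the thread does not specify.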
>> On 11/17/11 11:24 PM, Yingyi Bu wrote:
>>
>> Hi Avery,
>>
>>      Thanks a lot for your help!!
>>     I used your settings and got rid of the OOM!  However, after the job
>> had run for 10 minutes, one worker failed, and a while later all the
>> mappers failed.  Attached below are mapper logs from two nodes.  It seems
>> they cannot connect to ZooKeeper.  The workers ran well until the
>> highlighted exception.  Am I missing something in the job settings?
>>     Thanks, again!!
>>
>>  Best regards,
>> Yingyi
>>
>>
>>
>>  Mapper log on Node-1:
>>  2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager:
>> getZooKeeperServerList: For task 0, got file 'zkServerList_asterix-010 0 '
>> (polling period is 3000)
>> 2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager:
>> getZooKeeperServerList: Found [asterix-010, 0] 2 hosts in filename
>> 'zkServerList_asterix-010 0 '
>> 2011-11-17 22:56:39,046 INFO org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Trying to delete old directory
>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager:
>> generateZooKeeperConfigFile: Creating file
>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg
>> in
>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>> with base port 22181
>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager:
>> generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
>> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager:
>> generateZooKeeperConfigFile: Delete of zoo.cfg = false
>> 2011-11-17 22:56:39,050 INFO org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Attempting to start ZooKeeper server with command
>> [/mnt/data/sda/space/yingyi/tools/java/jre/bin/java, -Xmx256m,
>> -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC,
>> -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp,
>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/job.jar,
>> org.apache.zookeeper.server.quorum.QuorumPeerMain,
>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg]
>> in directory
>> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
>> 2011-11-17 22:56:39,056 INFO org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to
>> asterix-010:22181 with poll msecs = 3000
>> 2011-11-17 22:56:39,058 WARN org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Got ConnectException
>> java.net.ConnectException: Connection refused
>>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>>         at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>>         at
>> java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>>         at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>         at java.net.Socket.connect(Socket.java:529)
>>         at
>> org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:612)
>>         at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:401)
>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> 2011-11-17 22:56:42,062 INFO org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to
>> asterix-010:22181 with poll msecs = 3000
>> 2011-11-17 22:56:42,063 INFO org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Connected!
>> 2011-11-17 22:56:42,064 INFO org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Creating my filestamp
>> _bsp/_defaultZkManagerDir/job_201111172247_0003/_zkServer/asterix-010 0
>> 2011-11-17 22:56:42,070 INFO org.apache.giraph.graph.GraphMapper: setup:
>> Starting up BspServiceMaster (master thread)...
>> 2011-11-17 22:56:42,080 INFO org.apache.giraph.graph.BspService:
>> BspService: Connecting to ZooKeeper with job job_201111172247_0003, 0 on
>> asterix-010:22181
>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:host.name=asterix-010
>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.version=1.6.0_21
>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.vendor=Sun Microsystems Inc.
>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre
>> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-
>> 0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexe
>> c/../share/hadoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/
>> hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/
>> jersey-server-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../
>> share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work/tmp
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.compiler=<NA>
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.name=Linux
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.arch=amd64
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.version=2.6.18-194.26.1.el5
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.name=yingyib
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.home=/home/yingyib
>> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
>> 2011-11-17 22:56:42,088 INFO org.apache.zookeeper.ZooKeeper: Initiating
>> client connection, connectString=asterix-010:22181 sessionTimeout=60000
>> watcher=org.apache.giraph.graph.BspServiceMaster@13a78071
>> 2011-11-17 22:56:42,098 INFO org.apache.zookeeper.ClientCnxn: Opening
>> socket connection to server asterix-010/10.0.0.10:22181
>> 2011-11-17 22:56:42,099 INFO org.apache.zookeeper.ClientCnxn: Socket
>> connection established to asterix-010/10.0.0.10:22181, initiating session
>> 2011-11-17 22:56:42,123 INFO org.apache.zookeeper.ClientCnxn: Session
>> establishment complete on server asterix-010/10.0.0.10:22181, sessionid
>> = 0x133b57675b60000, negotiated timeout = 60000
>> 2011-11-17 22:56:42,125 INFO org.apache.giraph.graph.BspService: process:
>> Asynchronous connection complete.
>> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper: map: No
>> need to do anything when not a worker
>> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper:
>> cleanup: Starting for MASTER_ZOOKEEPER_ONLY
>> 2011-11-17 22:56:42,197 INFO org.apache.giraph.graph.BspServiceMaster:
>> becomeMaster: First child is
>> '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000'
>> and my bid is
>> '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000'
>> 2011-11-17 22:56:42,197 INFO org.apache.giraph.graph.BspServiceMaster:
>> becomeMaster: I am now the master!
>> 2011-11-17 22:56:42,208 INFO org.apache.giraph.graph.BspService: process:
>> applicationAttemptChanged signaled
>> 2011-11-17 22:56:42,216 WARN org.apache.giraph.graph.BspService: process:
>> Unknown and unprocessed event
>> (path=/_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir,
>> type=NodeChildrenChanged, state=SyncConnected)
>> 2011-11-17 22:56:45,130 INFO
>> org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input paths to
>> process : 10
>> 2011-11-17 22:56:45,227 INFO org.apache.giraph.graph.BspServiceMaster:
>> coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
>> 2011-11-17 23:01:20,045 ERROR org.apache.zookeeper.ClientCnxn: Error
>> while calling watcher
>> java.lang.RuntimeException:
>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
>> NoNode for
>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>>         at
>> org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:885)
>>         at
>> org.apache.giraph.graph.BspServiceMaster.checkHealthyWorkerFailure(BspServiceMaster.java:1946)
>>         at
>> org.apache.giraph.graph.BspServiceMaster.processEvent(BspServiceMaster.java:1976)
>>         at
>> org.apache.giraph.graph.BspService.process(BspService.java:1095)
>>         at
>> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
>> KeeperErrorCode = NoNode for
>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>>         at
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>>         at
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:950)
>>         at
>> org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:858)
>>         ... 4 more
>> 2011-11-17 23:01:22,009 INFO org.apache.giraph.graph.BspServiceMaster:
>> coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
>> 2011-11-17 23:11:27,357 WARN org.apache.giraph.zk.ZooKeeperManager:
>> onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
>>
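The Node-1 log above goes quiet between 22:56:45 and 23:01:20, just before the NoNode error. A long silence does not by itself prove a GC pause (the output of -XX:+PrintGCDetails would be the direct evidence), but flagging such gaps can be a quick first pass over a long mapper log. The following is a minimal sketch, assuming log4j-style timestamps as in the excerpt; the function name `find_gaps` and the 60-second threshold are illustrative choices, not anything from Giraph or Hadoop.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of a log entry,
# e.g. "2011-11-17 22:56:45,227".
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def find_gaps(lines, threshold_seconds=60.0):
    """Return (previous, current, gap_seconds) for consecutive timestamped
    entries that are more than threshold_seconds apart."""
    gaps = []
    previous = None
    for line in lines:
        # Strip mail quoting such as ">> " before looking for a timestamp.
        match = TIMESTAMP.match(line.lstrip("> "))
        if not match:
            continue  # continuation line or stack-trace frame
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        current = current.replace(microsecond=int(match.group(2)) * 1000)
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap > threshold_seconds:
                gaps.append((previous, current, gap))
        previous = current
    return gaps


if __name__ == "__main__":
    sample = [
        ">> 2011-11-17 22:56:45,227 INFO org.apache.giraph.graph.BspServiceMaster:",
        ">> coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1",
        ">> 2011-11-17 23:01:20,045 ERROR org.apache.zookeeper.ClientCnxn: Error",
    ]
    for start, end, seconds in find_gaps(sample):
        print(f"{start} -> {end}: {seconds:.1f}s of silence")
```

Comparing the flagged intervals with the GC log for the same task would show whether a silence lines up with a long collection.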
>>
>>  Mapper log on Node-2:
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:host.name=asterix-001
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.version=1.6.0_21
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.vendor=Sun Microsystems Inc.
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-
>> 0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexe
>> c/../share/hadoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/
>> hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/
>> jersey-server-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../
>> share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
>>  2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work/tmp
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:java.compiler=<NA>
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.name=Linux
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.arch=amd64
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:os.version=2.6.18-194.26.1.el5
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.name=yingyib
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.home=/home/yingyib
>> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client
>> environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work
>> 2011-11-17 22:56:44,159 INFO org.apache.zookeeper.ZooKeeper: Initiating
>> client connection, connectString=asterix-010:22181 sessionTimeout=60000
>> watcher=org.apache.giraph.graph.BspServiceWorker@60ded0f0
>> 2011-11-17 22:56:44,171 INFO org.apache.zookeeper.ClientCnxn: Opening
>> socket connection to server asterix-010/10.0.0.10:22181
>> 2011-11-17 22:56:44,173 INFO org.apache.zookeeper.ClientCnxn: Socket
>> connection established to asterix-010/10.0.0.10:22181, initiating session
>> 2011-11-17 22:56:44,178 INFO org.apache.zookeeper.ClientCnxn: Session
>> establishment complete on server asterix-010/10.0.0.10:22181, sessionid
>> = 0x133b57675b60007, negotiated timeout = 60000
>> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.BspService: process:
>> Asynchronous connection complete.
>> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.GraphMapper: setup:
>> Registering health of this worker...
>> 2011-11-17 22:56:44,191 INFO org.apache.giraph.graph.BspService:
>> getJobState: Job state already exists
>> (/_hadoopBsp/job_201111172247_0003/_masterJobState)
>> 2011-11-17 22:56:44,195 INFO org.apache.giraph.graph.BspService:
>> getApplicationAttempt: Node
>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
>> 2011-11-17 22:56:44,198 INFO org.apache.giraph.graph.BspService:
>> getApplicationAttempt: Node
>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
>> 2011-11-17 22:56:44,204 INFO org.apache.giraph.graph.BspServiceWorker:
>> registerHealth: Created my health node for attempt=0, superstep=-1 with
>> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/asterix-001_8
>> and hostnamePort = ["asterix-001",30008]
>> 2011-11-17 22:56:45,177 INFO org.apache.giraph.graph.BspService: process:
>> inputSplitsReadyChanged (input splits ready)
>> 2011-11-17 22:56:45,192 WARN org.apache.giraph.graph.BspService: process:
>> Unknown and unprocessed event
>> (path=/_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2/_inputSplitReserved,
>> type=NodeCreated, state=SyncConnected)
>> 2011-11-17 22:56:45,192 INFO org.apache.giraph.graph.BspServiceWorker:
>> reserveInputSplit: Reserved input split path
>> /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2
>> 2011-11-17 22:56:45,196 INFO org.apache.giraph.graph.BspServiceWorker:
>> loadVertices: Reserved /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2
>> from ZooKeeper and got input split
>> 'hdfs://asterix-master:31888/webmap-tiny-sorted/part-00002:0+834285620'
>> 2011-11-17 23:01:20,608 INFO org.apache.zookeeper.ClientCnxn: Client
>> session timed out, have not heard from server in 59117ms for sessionid
>> 0x133b57675b60007, closing socket connection and attempting reconnect
>>  2011-11-17 23:02:06,630 ERROR org.apache.zookeeper.ClientCnxn: Error
>> while calling watcher
>> java.lang.RuntimeException: process: Disconnected from ZooKeeper, cannot
>> recover.
>>         at org.apache.giraph.graph.BspService.process(BspService.java:990)
>>         at
>> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
>> 2011-11-17 23:02:35,793 INFO org.apache.zookeeper.ClientCnxn: Opening
>> socket connection to server asterix-010/10.0.0.10:22181
>> 2011-11-17 23:02:35,794 INFO org.apache.zookeeper.ClientCnxn: Socket
>> connection established to asterix-010/10.0.0.10:22181, initiating session
>> 2011-11-17 23:02:35,806 INFO org.apache.zookeeper.ClientCnxn: Unable to
>> reconnect to ZooKeeper service, session 0x133b57675b60007 has expired,
>> closing socket connection
>>
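To see how long a worker went without hearing from ZooKeeper, the quoted log can be scanned for the "have not heard from server in ...ms" message and the gap compared with the negotiated session timeout (60000 ms in the log above). This is only a minimal sketch: the class name `ZkTimeoutScan` and the method `silentMillis` are made up for illustration, and the regular expression is based on the single message shown in this log, not on any documented ZooKeeper log format.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Giraph or ZooKeeper): extracts the
// "silent" interval from a ClientCnxn session-timeout log line.
public class ZkTimeoutScan {

    // Pattern assumed from the one log message quoted in this thread.
    private static final Pattern SILENT =
            Pattern.compile("have not heard from server in (\\d+)ms");

    // Returns the number of milliseconds reported in the line, if any.
    public static OptionalLong silentMillis(String line) {
        Matcher m = SILENT.matcher(line);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Sample text copied from the log above.
        String sample = "session timed out, have not heard from server "
                + "in 59117ms for sessionid 0x133b57675b60007";
        long negotiatedTimeoutMs = 60000L; // "negotiated timeout = 60000"

        silentMillis(sample).ifPresent(ms ->
                System.out.println("silent for " + ms + " ms of a "
                        + negotiatedTimeoutMs + " ms session timeout"));
    }
}
```

A gap this close to the full session timeout is consistent with the whole JVM being paused for a long time, for example by a stop-the-world GC, but the log alone does not prove the cause.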
>>  On Thu, Nov 17, 2011 at 9:46 PM, Avery Ching <aching@apache.org> wrote:
>>
>>>  Hi Yingyi,
>>>
>>> Here are some ideas you might want to try:
>>>
>>> 1)  Limit the thread stack size.
>>>
>>> 2)  You can set the heap available to the mapper JVM.
>>>
>>> For example, here's a setting to get 10 GB of heap and use a smaller
>>> stack (64k) for the threads.
>>>
>>> -Dmapred.child.java.opts="-Xms10g -Xmx10g -Xss64k"
>>>
>>> Also, you might want to try using the EdgeListVertex instead of Vertex
>>> (i.e. GiraphJob.setVertexClass(EdgeListVertex.class)), it is quite a bit
>>> smaller.
>>>
>>> Let us know if that helps you.  You should also check to see if your
>>> Hadoop installation is using a 32-bit or 64-bit JVM.  If it's 32-bit you
>>> will be limited in how much heap you can use.
>>>
>>> Avery
>>>
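One way to check both points (whether the JVM is 32-bit or 64-bit, and how much heap it actually got) is to print a few values from inside the JVM. The following is a minimal sketch, not something from Giraph or Hadoop: the class name `JvmCheck` is made up, and `sun.arch.data.model` is a HotSpot-specific property that other JVMs may not set, which is why the code falls back to "unknown". The worker log quoted earlier in this thread reports `os.arch=amd64`, which suggests a 64-bit JVM on that node.

```java
// Hypothetical helper: prints JVM architecture hints and the heap ceiling.
public class JvmCheck {

    // Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx).
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // "32" or "64" on HotSpot JVMs; "unknown" if the property is not set.
    public static String dataModel() {
        return System.getProperty("sun.arch.data.model", "unknown");
    }

    public static void main(String[] args) {
        System.out.println("os.arch             = " + System.getProperty("os.arch"));
        System.out.println("sun.arch.data.model = " + dataModel());
        System.out.println("max heap (MiB)      = " + maxHeapMiB());
    }
}
```

Running it with the same options as the map tasks (for example `java -Xms10g -Xmx10g -Xss64k JvmCheck`) should show whether the JVM accepts a 10 GB heap at all; a 32-bit JVM would normally refuse to start with that setting.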
>>>
>>> On 11/17/11 9:38 PM, Yingyi Bu wrote:
>>>
>>> Hi,
>>>
>>>     I'm running a Giraph PageRank job.  I tried with 8GB of input text
>>> data over 10 nodes (each has 4 cores, 4 disks, and 12GB of physical
>>> memory), that is 800MB of input data per machine.  However, the Giraph
>>> job fails because of high GC costs and an Out-of-Memory exception.
>>>      Do I need to set anything special in the Hadoop configuration, for
>>> example, the maximum heap size for the map task JVM?
>>>     Thanks!!
>>>
>>>  Best regards,
>>> Yingyi
>>>
>>>
>>>
>>
>>
>
