giraph-user mailing list archives

From Avery Ching <ach...@apache.org>
Subject Re: PageRank OOM Exception
Date Fri, 18 Nov 2011 07:47:16 GMT
Yingyi,

Looks like you lost the connection to ZooKeeper.  You might want to sync 
with trunk.  GIRAPH-11 changed the settings to allow longer ZooKeeper 
timeouts.  Also, ordering of the vertices is no longer required and the 
load balancing should be better.  You might also want to add some GC 
options to reduce stop-the-world pauses, which are likely what is 
causing the timeouts.
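
One way to check whether GC pauses are really behind the timeouts is to scan a task's GC log for pauses that reach the ZooKeeper session timeout (the logs below show a negotiated timeout of 60000 ms).  The following is only a rough sketch: it assumes GC logging is on via -XX:+PrintGCDetails, which prints a "[Times: user=... sys=..., real=... secs]" entry per collection, and the class name, method name, and sample lines are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcPauseScan {
    // Matches the wall-clock field of a "[Times: user=... sys=..., real=... secs]" entry.
    private static final Pattern REAL = Pattern.compile("real=(\\d+(?:\\.\\d+)?) secs");

    /** Returns the wall-clock GC times (in milliseconds) that reach or exceed limitMs. */
    static List<Long> longPauses(List<String> gcLogLines, long limitMs) {
        List<Long> hits = new ArrayList<Long>();
        for (String line : gcLogLines) {
            Matcher m = REAL.matcher(line);
            while (m.find()) {
                long ms = Math.round(Double.parseDouble(m.group(1)) * 1000.0);
                if (ms >= limitMs) {
                    hits.add(ms);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the job in this thread.
        List<String> sample = Arrays.asList(
            "[GC [ParNew: ...] [Times: user=0.05 sys=0.00, real=0.02 secs]",
            "[Full GC [CMS: ...] [Times: user=70.10 sys=0.40, real=72.50 secs]");
        // 60000 ms is the session timeout reported in the mapper logs.
        System.out.println(longPauses(sample, 60000L));
    }
}
```

Any hit in the output would be consistent with the session expiring during a collection.  Note that for concurrent CMS phases the "real" time is not all stop-the-world, so treat hits as a hint rather than proof.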

Here are some example settings you can try fiddling with as well; just 
add them to the other JVM settings you tried out earlier.  Let us know 
how it goes.

  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
-XX:ParallelGCThreads=8 -XX:+CMSIncrementalPacing -XX:+PrintGCDetails 
-XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly 
-XX:+PrintTenuringDistribution
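
As a sketch of what "add them to the other JVM settings" could look like, the snippet below appends the flags above to an existing option string and prints it as a value for mapred.child.java.opts, the Hadoop 0.20 property that holds the task JVM options.  The class and method names are made up, "-Xmx2g" is only a placeholder for whatever settings were used earlier, and how the property reaches the job (command line or configuration file) depends on how the job is launched.  These flags fit the Java 6 JVM shown in the logs; CMS was removed from the JDK in version 14.

```java
import java.util.Arrays;
import java.util.List;

final class ChildJvmOpts {
    // The GC flags from the message above, one per entry.
    static final List<String> GC_FLAGS = Arrays.asList(
        "-XX:+UseConcMarkSweepGC",
        "-XX:+CMSIncrementalMode",
        "-XX:ParallelGCThreads=8",
        "-XX:+CMSIncrementalPacing",
        "-XX:+PrintGCDetails",
        "-XX:CMSInitiatingOccupancyFraction=60",
        "-XX:+UseCMSInitiatingOccupancyOnly",
        "-XX:+PrintTenuringDistribution");

    /** Appends the GC flags to whatever JVM options are already in use. */
    static String withGcFlags(String existingOpts) {
        StringBuilder sb = new StringBuilder(existingOpts.trim());
        for (String flag : GC_FLAGS) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(flag);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "-Xmx2g" is a placeholder, not the heap size used in this thread.
        System.out.println("mapred.child.java.opts=" + withGcFlags("-Xmx2g"));
    }
}
```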

Avery

On 11/17/11 11:24 PM, Yingyi Bu wrote:
> Hi Avery,
>
>     Thanks a lot for your help!!
>     I used your settings and got rid of the OOM!   However, after 
> running the job for 10 minutes, one worker failed, and then after a 
> while, all mappers failed.  Attached below are mapper logs from two 
> nodes.  It seems they cannot connect to ZooKeeper.  The workers 
> ran well until the highlighted exception.  Am I missing something in the 
> job settings?
>     Thanks, again!!
>
> Best regards,
> Yingyi
>
>
> Mapper log on Node-1:
>  2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager: 
> getZooKeeperServerList: For task 0, got file 'zkServerList_asterix-010 
> 0 ' (polling period is 3000)
> 2011-11-17 22:56:39,044 INFO org.apache.giraph.zk.ZooKeeperManager: 
> getZooKeeperServerList: Found [asterix-010, 0] 2 hosts in filename 
> 'zkServerList_asterix-010 0 '
> 2011-11-17 22:56:39,046 INFO org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Trying to delete old directory 
> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager: 
> generateZooKeeperConfigFile: Creating file 
> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg 
> in 
> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper 
> with base port 22181
> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager: 
> generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
> 2011-11-17 22:56:39,049 INFO org.apache.giraph.zk.ZooKeeperManager: 
> generateZooKeeperConfigFile: Delete of zoo.cfg = false
> 2011-11-17 22:56:39,050 INFO org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Attempting to start ZooKeeper server with 
> command [/mnt/data/sda/space/yingyi/tools/java/jre/bin/java, -Xmx256m, 
> -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, 
> -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, 
> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/job.jar, 
> org.apache.zookeeper.server.quorum.QuorumPeerMain, 
> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper/zoo.cfg] 
> in directory 
> /mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/work/_bspZooKeeper
> 2011-11-17 22:56:39,056 INFO org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect 
> to asterix-010:22181 with poll msecs = 3000
> 2011-11-17 22:56:39,058 WARN org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Got ConnectException
> java.net.ConnectException: Connection refused
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>         at 
> java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>         at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>         at java.net.Socket.connect(Socket.java:529)
>         at 
> org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:612)
>         at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:401)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2011-11-17 22:56:42,062 INFO org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect 
> to asterix-010:22181 with poll msecs = 3000
> 2011-11-17 22:56:42,063 INFO org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Connected!
> 2011-11-17 22:56:42,064 INFO org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Creating my filestamp 
> _bsp/_defaultZkManagerDir/job_201111172247_0003/_zkServer/asterix-010 0
> 2011-11-17 22:56:42,070 INFO org.apache.giraph.graph.GraphMapper: 
> setup: Starting up BspServiceMaster (master thread)...
> 2011-11-17 22:56:42,080 INFO org.apache.giraph.graph.BspService: 
> BspService: Connecting to ZooKeeper with job job_201111172247_0003, 0 
> on asterix-010:22181
> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:host.name=asterix-010
> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.version=1.6.0_21
> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.vendor=Sun Microsystems Inc.
> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre
> 2011-11-17 22:56:42,086 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/h
adoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-server-1.8.jar:/mnt/data/
sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work/tmp
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.compiler=<NA>
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:os.name=Linux
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:os.arch=amd64
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:os.version=2.6.18-194.26.1.el5
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:user.name=yingyib
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:user.home=/home/yingyib
> 2011-11-17 22:56:42,087 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000000_0/work
> 2011-11-17 22:56:42,088 INFO org.apache.zookeeper.ZooKeeper: 
> Initiating client connection, connectString=asterix-010:22181 
> sessionTimeout=60000 
> watcher=org.apache.giraph.graph.BspServiceMaster@13a78071
> 2011-11-17 22:56:42,098 INFO org.apache.zookeeper.ClientCnxn: Opening 
> socket connection to server asterix-010/10.0.0.10:22181
> 2011-11-17 22:56:42,099 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to asterix-010/10.0.0.10:22181, initiating session
> 2011-11-17 22:56:42,123 INFO org.apache.zookeeper.ClientCnxn: Session 
> establishment complete on server asterix-010/10.0.0.10:22181, 
> sessionid = 0x133b57675b60000, negotiated timeout = 60000
> 2011-11-17 22:56:42,125 INFO org.apache.giraph.graph.BspService: 
> process: Asynchronous connection complete.
> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper: map: 
> No need to do anything when not a worker
> 2011-11-17 22:56:42,126 INFO org.apache.giraph.graph.GraphMapper: 
> cleanup: Starting for MASTER_ZOOKEEPER_ONLY
> 2011-11-17 22:56:42,197 INFO org.apache.giraph.graph.BspServiceMaster: 
> becomeMaster: First child is 
> '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000' 
> and my bid is 
> '/_hadoopBsp/job_201111172247_0003/_masterElectionDir/asterix-010_00000000000'
> 2011-11-17 22:56:42,197 INFO org.apache.giraph.graph.BspServiceMaster: 
> becomeMaster: I am now the master!
> 2011-11-17 22:56:42,208 INFO org.apache.giraph.graph.BspService: 
> process: applicationAttemptChanged signaled
> 2011-11-17 22:56:42,216 WARN org.apache.giraph.graph.BspService: 
> process: Unknown and unprocessed event 
> (path=/_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir, 
> type=NodeChildrenChanged, state=SyncConnected)
> 2011-11-17 22:56:45,130 INFO 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input 
> paths to process : 10
> 2011-11-17 22:56:45,227 INFO org.apache.giraph.graph.BspServiceMaster: 
> coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
> 2011-11-17 23:01:20,045 ERROR org.apache.zookeeper.ClientCnxn: Error 
> while calling watcher
> java.lang.RuntimeException: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode 
> = NoNode for 
> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>         at 
> org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:885)
>         at 
> org.apache.giraph.graph.BspServiceMaster.checkHealthyWorkerFailure(BspServiceMaster.java:1946)
>         at 
> org.apache.giraph.graph.BspServiceMaster.processEvent(BspServiceMaster.java:1976)
>         at 
> org.apache.giraph.graph.BspService.process(BspService.java:1095)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
> KeeperErrorCode = NoNode for 
> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_vertexRangeAssignments
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:950)
>         at 
> org.apache.giraph.graph.BspService.getVertexRangeMap(BspService.java:858)
>         ... 4 more
> 2011-11-17 23:01:22,009 INFO org.apache.giraph.graph.BspServiceMaster: 
> coordinateSuperstep: 0 out of 10 chosen workers finished on superstep -1
> 2011-11-17 23:11:27,357 WARN org.apache.giraph.zk.ZooKeeperManager: 
> onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
>
>
> Mapper log on Node-2:
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:host.name=asterix-001
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.version=1.6.0_21
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.vendor=Sun Microsystems Inc.
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.home=/mnt/data/sda/space/yingyi/tools/java/jre
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.class.path=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars/classes:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/jars:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../conf:/mnt/data/sda/space/yingyi/tools/java/lib/tools.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/test/classes:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../build/tools:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/h
adoop/lib/commons-digester-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-server-1.8.jar:/mnt/data/
sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-api-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.library.path=/mnt/data/sda/space/yingyi/hadoop-0.20.205.0/libexec/../lib:/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.io.tmpdir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work/tmp
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:java.compiler=<NA>
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:os.name=Linux
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:os.arch=amd64
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:os.version=2.6.18-194.26.1.el5
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:user.name=yingyib
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:user.home=/home/yingyib
> 2011-11-17 22:56:44,158 INFO org.apache.zookeeper.ZooKeeper: Client 
> environment:user.dir=/mnt/data/sda/space/yingyi/hdfsdata_giraph/mapred/local/taskTracker/yingyib/jobcache/job_201111172247_0003/attempt_201111172247_0003_m_000008_0/work
> 2011-11-17 22:56:44,159 INFO org.apache.zookeeper.ZooKeeper: 
> Initiating client connection, connectString=asterix-010:22181 
> sessionTimeout=60000 
> watcher=org.apache.giraph.graph.BspServiceWorker@60ded0f0
> 2011-11-17 22:56:44,171 INFO org.apache.zookeeper.ClientCnxn: Opening 
> socket connection to server asterix-010/10.0.0.10:22181
> 2011-11-17 22:56:44,173 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to asterix-010/10.0.0.10:22181, initiating session
> 2011-11-17 22:56:44,178 INFO org.apache.zookeeper.ClientCnxn: Session 
> establishment complete on server asterix-010/10.0.0.10:22181, 
> sessionid = 0x133b57675b60007, negotiated timeout = 60000
> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.BspService: 
> process: Asynchronous connection complete.
> 2011-11-17 22:56:44,180 INFO org.apache.giraph.graph.GraphMapper: 
> setup: Registering health of this worker...
> 2011-11-17 22:56:44,191 INFO org.apache.giraph.graph.BspService: 
> getJobState: Job state already exists 
> (/_hadoopBsp/job_201111172247_0003/_masterJobState)
> 2011-11-17 22:56:44,195 INFO org.apache.giraph.graph.BspService: 
> getApplicationAttempt: Node 
> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
> 2011-11-17 22:56:44,198 INFO org.apache.giraph.graph.BspService: 
> getApplicationAttempt: Node 
> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir already exists!
> 2011-11-17 22:56:44,204 INFO org.apache.giraph.graph.BspServiceWorker: 
> registerHealth: Created my health node for attempt=0, superstep=-1 
> with 
> /_hadoopBsp/job_201111172247_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/asterix-001_8 
> and hostnamePort = ["asterix-001",30008]
> 2011-11-17 22:56:45,177 INFO org.apache.giraph.graph.BspService: 
> process: inputSplitsReadyChanged (input splits ready)
> 2011-11-17 22:56:45,192 WARN org.apache.giraph.graph.BspService: 
> process: Unknown and unprocessed event 
> (path=/_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2/_inputSplitReserved, 
> type=NodeCreated, state=SyncConnected)
> 2011-11-17 22:56:45,192 INFO org.apache.giraph.graph.BspServiceWorker: 
> reserveInputSplit: Reserved input split path 
> /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2
> 2011-11-17 22:56:45,196 INFO org.apache.giraph.graph.BspServiceWorker: 
> loadVertices: Reserved 
> /_hadoopBsp/job_201111172247_0003/_inputSplitsDir/2 from ZooKeeper and 
> got input split 
> 'hdfs://asterix-master:31888/webmap-tiny-sorted/part-00002:0+834285620'
> 2011-11-17 23:01:20,608 INFO org.apache.zookeeper.ClientCnxn: Client 
> session timed out, have not heard from server in 59117ms for sessionid 
> 0x133b57675b60007, closing socket connection and attempting reconnect
> 2011-11-17 23:02:06,630 ERROR org.apache.zookeeper.ClientCnxn: Error 
> while calling watcher
> java.lang.RuntimeException: process: Disconnected from ZooKeeper, 
> cannot recover.
>         at org.apache.giraph.graph.BspService.process(BspService.java:990)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:488)
> 2011-11-17 23:02:35,793 INFO org.apache.zookeeper.ClientCnxn: Opening 
> socket connection to server asterix-010/10.0.0.10:22181
> 2011-11-17 23:02:35,794 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to asterix-010/10.0.0.10:22181, initiating session
> 2011-11-17 23:02:35,806 INFO org.apache.zookeeper.ClientCnxn: Unable 
> to reconnect to ZooKeeper service, session 0x133b57675b60007 has 
> expired, closing socket connection
>
> On Thu, Nov 17, 2011 at 9:46 PM, Avery Ching <aching@apache.org> wrote:
>
>     Hi Yingyi,
>
>     Here are some ideas you might want to try:
>
>     1)  Limit the thread stack size.
>
>     2)  Set the heap available to the mapper JVM.
>
>     For example, here's a setting to get 10 GB of heap and use a
>     smaller stack (64k) for the threads.
>
>     -Dmapred.child.java.opts="-Xms10g -Xmx10g -Xss64k"
>
>     Also, you might want to try using EdgeListVertex instead of
>     Vertex (i.e. GiraphJob.setVertexClass(EdgeListVertex.class)); it
>     is quite a bit smaller.
>
>     Let us know if that helps you.  You should also check to see if
>     your Hadoop installation is using a 32-bit or 64-bit JVM.  If it's
>     32-bit you will be limited in how much heap you can use.
>
>     Avery
>
>
>     On 11/17/11 9:38 PM, Yingyi Bu wrote:
>>     Hi,
>>
>>         I'm running a Giraph PageRank job.  I tried with 8GB of input
>>     text data over 10 nodes (each has 4 cores, 4 disks, and 12GB
>>     physical memory), that is 800MB of input data per machine.
>>     However, the Giraph job fails because of high GC costs and an
>>     Out-of-Memory exception.
>>         Do I need to set anything special in the Hadoop configuration,
>>     for example, the maximum heap size for the map task JVM?
>>         Thanks!!
>>
>>     Best regards,
>>     Yingyi
>
>
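
The quoted advice about checking for a 32-bit or 64-bit JVM can be tested with a few lines of Java. This is a minimal sketch and not part of Giraph or Hadoop: the class name JvmHeapCheck and its methods are made up for illustration, and the "sun.arch.data.model" property is HotSpot-specific, so it may be missing on other JVMs (the code then falls back to a rough guess from "os.arch"). To see what a mapper actually gets, it should be run with the same java binary and -Xmx value as the task JVM.

```java
public class JvmHeapCheck {

    /**
     * Returns "32" or "64" when the JVM reports its data model,
     * otherwise a rough guess based on os.arch ("64" or "unknown").
     */
    public static String dataModel() {
        // HotSpot-specific property; assumed to be absent on some JVMs.
        String model = System.getProperty("sun.arch.data.model");
        if (model != null) {
            return model;
        }
        // Fallback heuristic: architectures such as amd64 or x86_64 contain "64".
        String arch = System.getProperty("os.arch", "");
        return arch.contains("64") ? "64" : "unknown";
    }

    /** Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx). */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("JVM data model: " + dataModel());
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

For example, "java -Xmx10g JvmHeapCheck" on a 64-bit JVM should report a max heap close to 10240 MiB, while a 32-bit JVM will typically refuse to start with a heap that large.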

