Yes, before conversion they were also sorted by vertex ID.
0      1      1
0      4      1
0      5      1
1      0      1
1      2      1
1      3      1
2      1      1
2      6      1
2      7      1
3      8      1

That was the format before conversion, with each edge written as [source destination weight].
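For reference, the conversion from that edge list to the JSON vertex format can be sketched roughly as below. This is only a minimal sketch, not the script I actually used: the function name `edges_to_giraph_json` and the choice to emit an empty edge list for vertices that only appear as destinations are assumptions made for illustration.

```python
import json
from collections import defaultdict


def edges_to_giraph_json(lines, default_value=0):
    """Turn 'source destination weight' lines into JSON vertex lines.

    Each output line has the shape [vertex id, vertex value, [[dst id, weight], ...]].
    """
    adjacency = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        src, dst = int(parts[0]), int(parts[1])
        weight = float(parts[2])
        if weight.is_integer():
            weight = int(weight)  # keep "1" rather than "1.0" in the output
        adjacency[src].append([dst, weight])
        adjacency[dst]  # assumption: destination-only vertices still get a line
    for vertex_id in sorted(adjacency):
        record = [vertex_id, default_value, adjacency[vertex_id]]
        yield json.dumps(record, separators=(",", ":"))


if __name__ == "__main__":
    sample = ["0 1 1", "0 4 1", "0 5 1", "1 0 1"]
    for vertex_line in edges_to_giraph_json(sample):
        print(vertex_line)
```

The whole adjacency map is held in memory, which is fine for a file of this size but would need a streaming approach for much larger inputs.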
I'm using a medium-tier EC2 instance, c3.2xlarge. It has 8 vCPUs and 15 GiB of memory according to Amazon's website.

Any suggestions on tweaking Xmx? I tried doing that without any luck: the job just didn't run the map phase and then failed.
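As a rough starting point for picking a value, the arithmetic can be sketched as follows. This is a back-of-the-envelope sketch only: the 3 GiB reserved for the OS and the Hadoop daemons is a guess, and the helper names `per_task_heap_mb` and `child_java_opts` are made up for illustration. The two concurrent tasks correspond to the two map tasks the job creates.

```python
def per_task_heap_mb(total_mem_gib, concurrent_tasks, reserved_gib=3.0):
    """Split the memory left after a reserve evenly across the task JVMs."""
    if concurrent_tasks < 1:
        raise ValueError("need at least one task")
    usable_mb = (total_mem_gib - reserved_gib) * 1024
    if usable_mb <= 0:
        raise ValueError("reserve is larger than total memory")
    return int(usable_mb // concurrent_tasks)


def child_java_opts(heap_mb):
    """Format the value that would go into mapred.child.java.opts."""
    return f"-Xmx{heap_mb}m"


if __name__ == "__main__":
    # c3.2xlarge: 15 GiB total, two map tasks (master/ZooKeeper and worker).
    heap = per_task_heap_mb(15, 2)
    print(child_java_opts(heap))
```

With these assumed numbers the result is 6144 MB per task, which sits between the 4096m and 8192m values suggested elsewhere in this thread.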

On Thu, Jul 3, 2014 at 1:57 PM, Young Han wrote:
From the other thread...

Yeah, your input format looks correct. Did you have the graph sorted by source vertex IDs before conversion? (I'm not sure if duplicate entries with the same source ID matter, but just in case.)

They're all out-of-memory errors, so I think Xmx is the culprit. What type of EC2 instances are you using? You probably want to use something larger than t1.micro or m1.small.

Young

On Thu, Jul 3, 2014 at 4:53 PM, Bryan Rowe wrote:
First of all, I started this email thread with my old email from Yahoo, which was a mistake because it kept sending out duplicates. Sorry for the inconvenience, but I'll continue it using this email thread from now on.
I originally posted this:
Hello,

Giraph: release-1.0.0-RC3

In short, when I use large graphs with the Shortest Paths example, it fails. But when I use the small graph provided in the Quick Start guide, it succeeds.
I converted all of my large graphs into the format shown in the Quick Start guide to simplify things.
I'm using a one-node setup.

Here is the command I'm using:
org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex
-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
-vip /user/ubuntu/input/CA.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat
-op /user/ubuntu/output/shortestpaths
-w 1

(all on one line)

CA.txt is a large graph file: 96,026,228 bytes

The job fails in 10mins, 46sec.

Two Map tasks are created when run.
The first one, task_201407021636_0006_m_000000, is KILLED. syslog:
===========================================================================
```2014-07-02 17:01:34,757 INFO org.apache.h=
2014-07-02 17:01:34,945 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:35,127 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:35,201 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:35,204 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_20140702=
0
2014-07-02 17:01:35,219 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:35,220 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201407021636_0006/zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,229 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /home/ubuntu/hdfstmp/mapred
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg in =
636_0006/work/_bspZooKeeper with base port 22181
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
2014-07-02 17:01:35,235 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
2014-07-02 17:01:35,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/lib/jvm/java-7-oracle/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPaus=
bcache/job_201407021636_0006/jars/job.jar, org.apache.zookeeper.server.quor=
cache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg] in directory /home/=
06/work/_bspZooKeeper
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:35,241 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
pl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701)
kManager.java:357)
8)
at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
on.java:1059)
2014-07-02 17:01:38,242 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181!
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_zkServer/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper...
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
2014-07-02 17:01:40,260 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 0 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client
r/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/m=
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environ=
021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environ=
jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work/tmp
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environ=
he/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,272 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@4deb9df0
2014-07-02 17:01:40,284 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:40,296 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,336 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0000, negotiated timeout = 600000
2014-07-02 17:01:42,337 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
2014-07-02 17:01:45,033 INFO org.apache.giraph.bsp.BspService: getJobState:=
e)
2014-07-02 17:01:45,038 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000' and my bid is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000'
2014-07-02 17:01:45,044 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir a=
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,089 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:45,111 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:45,116 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:45,119 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master!
2014-07-02 17:01:45,126 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir a=
2014-07-02 17:01:45,140 INFO org.apache.giraph.io.formats.GiraphFileInputFormat: Total input paths to process : 1
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: generateVertexInputSplits: Got 2 input splits for 1 input threads
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with =
2014-07-02 17:01:45,163 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
2014-07-02 17:01:45,173 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,187 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,188 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
2014-07-02 17:01:45,350 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:06:45,347 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:06:45,349 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:12:07,201 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2014-07-02 17:12:07,202 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
2014-07-02 17:12:07,518 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x146f806378d0000, likely server has closed socket, closing socket connection and attempting reconnect
2014-07-02 17:12:07,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
===========================================================================

```

The second one, task_201407021636_0006_m_000001, goes to the FAILED state. syslog:
===========================================================================
```2014-07-02 17:01:38,016 INFO org.apache.hadoo=
2014-07-02 17:01:38,203 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:38,379 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:40,280 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:40,281 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:40,317 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:40,320 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_20140702=
1
2014-07-02 17:01:42,109 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:42,313 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:42,316 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:42,316 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
2014-07-02 17:01:42,330 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 1 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client
r/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/m=
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environ=
021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environ=
jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work/tmp
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environ=
he/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,339 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@54a5733f
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,812 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0001, negotiated timeout = 600000
2014-07-02 17:01:42,814 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,819 INFO org.apache.giraph.comm.netty.NettyWorkerServer: createMessageStoreFactory: Using ByteArrayMessagesPerVertexStore since there is no combiner
2014-07-02 17:01:42,881 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:42,904 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:42,907 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:42,914 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,068 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
2014-07-02 17:01:45,076 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_=
_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 and workerInfo= Worker(hostname=ec2-54-186-5-159.us-west-2.compute=
2014-07-02 17:01:45,183 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:01:45,339 INFO org.apache.giraph.bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Master(hostname=ec2-54-186-5-159.us-west-2.comput=
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201407021636_0006/_=
2014-07-02 17:01:45,346 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,354 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,359 INFO org.apache.giraph.worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 2 total splits.
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxVerticesPerTransfer = 10000
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxEdgesPerTransfer = 80000
2014-07-02 17:01:45,372 INFO org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0, overall roughly 0.0% input splits reserved
2014-07-02 17:01:45,373 INFO org.apache.giraph.worker.InputSplitsCallable: =
Dir/0 from ZooKeeper and got input split 'hdfs://ec2-54-186-5-159.us-west-2.compute.amazonaws.com:54310/user/ubuntu/input/CA.txt:0+67108864'
2014-07-02 17:01:46,615 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 250000 vertices at 200412.01085983627 vertices/sec 697447 edges at 560085.1663971642 edges/sec Memory (free/total/max) = 112.94M / 182.50M / 182.50M
2014-07-02 17:01:47,440 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 500000 vertices at 241131.36490688394 vertices/sec 1419367 edges at 685221.3493060106 edges/sec Memory (free/total/max) = 45.07M / 187.50M / 187.50M
2014-07-02 17:01:51,322 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 750000 vertices at 125921.72750649283 vertices/sec 2149814 edges at 361077.323111284 edges/sec Memory (free/total/max) = 16.73M / 189.50M / 189.50M
2014-07-02 17:01:55,205 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 on superstep -1
itializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-07-02 17:01:55,716 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2014-07-02 17:01:55,717 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName ubuntu for UID 1000 from the native implementation
2014-07-02 17:01:55,718 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
on.java:1059)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:151)
at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:111)
at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:73)
at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:192)
ker.java:276)
.java:323)
at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:506)
230)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92)
... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:271)
at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:143)
... 15 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.ConcurrentHashMap$Segment.rehash(ConcurrentHashMap.java:501)
at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:460)
at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1130)
n.java:87)
titionStore.java:71)
at org.apache.giraph.comm.requests.SendVertexRequest.doRequest(SendVertexRequest.java:81)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:470)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendPartitionRequest(NettyWorkerClientRequestProcessor.java:203)
at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:267)
xInputSplitsCallable.java:140)
Callable.java:220)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161)
at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58)
at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
va:1145)
ava:615)
p for the task```
===========================================================================
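The heap figures in the worker log above can be pulled out with a short script to see how quickly memory fills up while the input split loads. This is just a sketch: it assumes each log entry sits on a single line with the mail wrapping removed, and the names `MEMORY_RE` and `heap_usage` are made up for illustration.

```python
import re

# Matches the readVertexInputSplit progress lines, e.g.
# "... Loaded 250000 vertices at ... 697447 edges at ... Memory (free/total/max) = 112.94M / 182.50M / 182.50M"
MEMORY_RE = re.compile(
    r"Loaded (\d+) vertices .*? (\d+) edges .*?"
    r"Memory \(free/total/max\) = ([\d.]+)M / ([\d.]+)M / ([\d.]+)M"
)


def heap_usage(log_text):
    """Return (vertices, edges, used_mb, max_mb) for each progress line."""
    rows = []
    for match in MEMORY_RE.finditer(log_text):
        vertices, edges = int(match.group(1)), int(match.group(2))
        free_mb, total_mb, max_mb = (float(match.group(i)) for i in (3, 4, 5))
        rows.append((vertices, edges, round(total_mb - free_mb, 2), max_mb))
    return rows


if __name__ == "__main__":
    import sys

    # Usage: python heap_usage.py < syslog
    for vertices, edges, used_mb, max_mb in heap_usage(sys.stdin.read()):
        print(f"{vertices:>9} vertices {edges:>9} edges {used_mb:8.2f}M used of {max_mb:.2f}M")
```

On the three progress lines in this log, used heap climbs to within roughly 17M of the maximum by 750,000 vertices, shortly before the OutOfMemoryError.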

I've tried increasing the Java heap space in hadoop/conf/mapred-site.xml by adding this:
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>

But that just caused the entire job to fail from the start.

Before using this version of Giraph, I used 1.0.0 and 1.1.0-RC0, and those versions gave me more, and different, errors to debug that relate to problems with Hadoop itself. So the Giraph version I'm currently using seems to be the best for me because these errors seem more manageable.

What can I do to fix this error? I thought Giraph was built for large-scale graph processing, so I suppose someone testing large graphs has encountered this problem before. I searched through the mailing list archives and couldn't find anything, though.

I can provide more information if you need it. Thanks a lot.

Bryan Rowe

Then Young Han replied:

10 minutes seems way too long to load 91 MB from HDFS. Are you sure your graph's format is correct? For the JSON input formats, each line of your file should be:

[vertex id, vertex value, [[dst id, edge weight], [dst id, edge weight], ..., [dst id, edge weight]]]

You can set the vertex value for every vertex to 0 (SSSP will overwrite that value in the first superstep). If your graph doesn't have edge weights, you can just set them to 1.
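A quick way to sanity-check a file against that layout is sketched below. It is only an illustration of the shape described above, not what Giraph's own parser does, and the function name `check_vertex_line` is hypothetical.

```python
import json


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return type(value) in (int, float)


def check_vertex_line(line):
    """Return None if the line looks like [id, value, [[dst, weight], ...]], else a reason."""
    try:
        record = json.loads(line)
    except ValueError as exc:
        return f"not valid JSON: {exc}"
    if not isinstance(record, list) or len(record) != 3:
        return "expected [vertex id, vertex value, edges]"
    vertex_id, value, edges = record
    if type(vertex_id) is not int:
        return "vertex id should be an integer"
    if not _is_number(value):
        return "vertex value should be a number"
    if not isinstance(edges, list):
        return "edges should be a list"
    for edge in edges:
        if not isinstance(edge, list) or len(edge) != 2:
            return f"edge {edge!r} should be [dst id, edge weight]"
        if type(edge[0]) is not int or not _is_number(edge[1]):
            return f"edge {edge!r} should hold an integer id and a numeric weight"
    return None


if __name__ == "__main__":
    import sys

    # Usage: python check_graph.py CA.txt
    with open(sys.argv[1]) as handle:
        for number, text in enumerate(handle, start=1):
            problem = check_vertex_line(text)
            if problem:
                print(f"line {number}: {problem}")
```

Running it over the input file prints only the lines that deviate from the layout, so no output means every line has the expected shape.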

Also, have you tried a larger Xmx value? E.g., 4096m or 8192m.

Young

Then I replied:
Hi Young,

I believe my graph has the correct format.
Here are the first 41 lines (vertices 0 through 40) of the graph I'm using:
[0,0,[[1,1],[4,1],[5,1]]]
[1,0,[[0,1],[2,1],[3,1]]]
[2,0,[[1,1],[6,1],[7,1]]]
[3,0,[[8,1],[1,1],[9,1]]]
[4,0,[[0,1],[10,1],[11,1]]]
[5,0,[[0,1]]]
[6,0,[[2,1],[12,1]]]
[7,0,[[2,1],[12,1],[13,1]]]
[8,0,[[11,1],[3,1],[60,1]]]
[9,0,[[3,1]]]
[10,0,[[35,1],[4,1],[38,1]]]
[11,0,[[8,1],[59,1],[4,1]]]
[12,0,[[41,1],[6,1],[7,1]]]
[13,0,[[89,1],[90,1],[91,1],[7,1]]]
[14,0,[[18,1],[19,1],[15,1]]]
[15,0,[[16,1],[17,1],[14,1]]]
[16,0,[[20,1],[21,1],[22,1],[15,1]]]
[17,0,[[24,1],[23,1],[21,1],[15,1]]]
[18,0,[[24,1],[14,1]]]
[19,0,[[25,1],[22,1],[14,1]]]
[20,0,[[16,1],[25,1],[26,1]]]
[21,0,[[16,1],[17,1],[30,1]]]
[22,0,[[16,1],[19,1]]]
[23,0,[[17,1],[105,1]]]
[24,0,[[17,1],[18,1],[58,1]]]
[25,0,[[27,1],[19,1],[20,1]]]
[26,0,[[28,1],[27,1],[20,1]]]
[27,0,[[25,1],[26,1],[29,1]]]
[28,0,[[26,1],[29,1],[30,1]]]
[29,0,[[27,1],[28,1],[31,1]]]
[30,0,[[32,1],[33,1],[28,1],[21,1]]]
[31,0,[[34,1],[29,1]]]
[32,0,[[105,1],[30,1],[39,1]]]
[33,0,[[30,1]]]
[34,0,[[38,1],[31,1]]]
[35,0,[[10,1],[36,1],[37,1]]]
[36,0,[[40,1],[35,1],[39,1]]]
[37,0,[[41,1],[35,1]]]
[38,0,[[10,1],[34,1]]]
[39,0,[[32,1],[58,1],[36,1],[119,1]]]
[40,0,[[90,1],[36,1]]]

Also, sorry about sending the email twice. My email client messed up.

Thanks,
Bryan

I've tried a larger graph. It ran for 2 hours, then failed with what I believe are different error messages.

Bryan
