giraph-user mailing list archives

From Suijian Zhou <suijian.z...@gmail.com>
Subject Re: loading graph stuck.
Date Fri, 04 Apr 2014 15:58:00 GMT
Hi all,
  Thanks a lot. The problem was finally solved by changing the input format
from Double to Int, which saves memory. The new problem is that the program
aborts in superstep 2:
14/04/04 10:45:39 INFO job.JobProgressTracker: Data from 5 workers -
Compute superstep 2: 0 out of 3029732 vertices computed; 0 out of 40
partitions computed; min free memory on worker 5 - 378.62MB, average
546.85MB
14/04/04 10:45:44 INFO job.JobProgressTracker: Data from 7 workers -
Compute superstep 2: 0 out of 4241624 vertices computed; 0 out of 56
partitions computed; min free memory on worker 1 - 273.06MB, average
393.52MB
14/04/04 10:45:49 INFO job.JobProgressTracker: Data from 8 workers -
Compute superstep 2: 0 out of 4847571 vertices computed; 0 out of 64
partitions computed; min free memory on worker 1 - 273.06MB, average
356.06MB
14/04/04 10:45:54 INFO job.JobProgressTracker: Data from 8 workers -
Compute superstep 2: 0 out of 4847571 vertices computed; 0 out of 64
partitions computed; min free memory on worker 5 - 249.72MB, average
339.95MB
14/04/04 10:45:56 INFO zookeeper.ClientCnxn: Unable to read additional data
from server sessionid 0x1452d699cb30009, likely server has closed socket,
closing socket connection and attempting reconnect
14/04/04 10:45:58 INFO zookeeper.ClientCnxn: Opening socket connection to
server compute-0-22.local/10.1.255.232:22181. Will not attempt to
authenticate using SASL (unknown error)
14/04/04 10:45:58 WARN zookeeper.ClientCnxn: Session 0x1452d699cb30009 for
server null, unexpected error, closing socket connection and attempting
reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

On each rerun of the program, a different compute node reports the same
error ("Unable to read additional data from server sessionid...").

The code that runs in superstep 2 is:
  if (getSuperstep() == 2) {
    for (IntWritable message: messages) {
        for (Edge<IntWritable, IntWritable> edge: vertex.getEdges()) {
           sendMessage(edge.getTargetVertexId(), message);
           //int abc=0;
        }
    }
  }

I checked that if I replace the line "sendMessage(edge.getTargetVertexId(),
message);" with a meaningless line such as "int abc=0;", the program
finishes successfully. This looks like a ZooKeeper problem, but ZooKeeper
seems to come bundled with Giraph, as I did not install it separately. Any hints?
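
One possible reading, stated as an assumption and not a confirmed diagnosis:
the loop above forwards every incoming message along every out-edge, so a
vertex sends (incoming messages) x (out-degree) messages, and the free memory
reported in the log keeps dropping before the ZooKeeper connection is lost.
The standalone sketch below only illustrates that arithmetic. The class name
FanOutEstimate, its methods, and the toy numbers are hypothetical and are not
part of Giraph.

```java
/** Rough estimate of how many messages the superstep 2 loop sends. */
public class FanOutEstimate {

    /**
     * Each vertex forwards every incoming message along every out-edge,
     * so vertex v sends incomingCounts[v] * outDegrees[v] messages.
     */
    public static long messagesSent(int[] incomingCounts, int[] outDegrees) {
        if (incomingCounts.length != outDegrees.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        long total = 0;
        for (int v = 0; v < incomingCounts.length; v++) {
            // Cast to long first so large products do not overflow int.
            total += (long) incomingCounts[v] * outDegrees[v];
        }
        return total;
    }

    /**
     * Lower bound on payload size: an IntWritable value serializes to
     * 4 bytes; per-message and buffer overhead is ignored here.
     */
    public static long payloadBytes(long messages) {
        return messages * Integer.BYTES;
    }

    public static void main(String[] args) {
        // Hypothetical toy graph with 4 vertices, not data from the real job.
        int[] incoming = {3, 0, 10, 2};
        int[] outDegree = {2, 5, 10, 1};
        long sent = messagesSent(incoming, outDegree);
        System.out.println("messages sent: " + sent);
        System.out.println("payload bytes (lower bound): " + payloadBytes(sent));
    }
}
```

With real per-vertex counts from superstep 1, the same calculation would show
whether the message volume in superstep 2 plausibly exceeds the worker heap.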

  Best Regards,
  Suijian




2014-04-03 5:00 GMT-05:00 Lukas Nalezenec <lukas.nalezenec@firma.seznam.cz>:

>  Hi,
> Try finding the master and check what it is doing in the JobTracker.
>
> Lukas
>
>
> On 2.4.2014 23:58, Suijian Zhou wrote:
>
>  Hi,
>    Why does the Giraph program get stuck when loading the input graph (the
> graph is 500MB, not so big)? No matter what number of workers I try (from
> -w 2 to -w 30), or how I set the -Xmx parameter of
> mapred.child.java.opts or mapred.tasktracker.map.tasks.maximum, the program
> always gets stuck there (after loading ~95% of the whole graph). Are there
> any possible reasons (or other parameters to tune)? The input data itself and
> the Giraph program are correct, as tested on a very small portion of the graph.
> Each node has 16GB of RAM.
>
> 14/04/02 16:43:59 INFO job.JobProgressTracker: Data from 2 workers -
> Loading data: 4258452 vertices loaded, 86 vertex input splits loaded; 0
> edges loaded, 0 edge input splits loaded; min free memory on worker 2 -
> 128.71MB, average 162.83MB
> 14/04/02 16:44:04 INFO job.JobProgressTracker: Data from 2 workers -
> Loading data: 4258452 vertices loaded, 86 vertex input splits loaded; 0
> edges loaded, 0 edge input splits loaded; min free memory on worker 2 -
> 123.49MB, average 160.22MB
> 14/04/02 16:44:09 INFO job.JobProgressTracker: Data from 2 workers -
> Loading data: 4258452 vertices loaded, 86 vertex input splits loaded; 0
> edges loaded, 0 edge input splits loaded; min free memory on worker 2 -
> 123.49MB, average 159.53MB
> 14/04/02 16:44:14 INFO job.JobProgressTracker: Data from 2 workers -
> Loading data: 4258452 vertices loaded, 86 vertex input splits loaded; 0
> edges loaded, 0 edge input splits loaded; min free memory on worker 2 -
> 123.49MB, average 159.53MB
>
>
>    Best Regards,
>    Suijian
>
>
>
