giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From chadi jaber <chadijaber...@hotmail.com>
Subject RE: Problem with Giraph (solved)
Date Fri, 17 Jan 2014 07:52:00 GMT
Hello everybody,
I don't know if somebody is interested by the solution for this problem but i am posting it
anyway.The problem is in the input format class TextVertexInputFormat<IntWritable, NullWritable,
NullWritable> returning Null for Vertex value. I think that the giraph job  when having
nb workers greater than 1 partition the vertices as soon as it's read by the inputformat.
This partitioning involves sending vertex request to other workers entailing serialization/deserialization
processes. With the vertex value to null the serialization/deserialization processes failed.
The solution was to change the inputformat class to TextVertexInputFormat<IntWritable,
DoubleWritable, NullWritable> and giving a zero value to the vertice.
That's all.Anyway thanks to Maja and Lukas for their helpChadi

From: chadijaber986@hotmail.com
To: user@giraph.apache.org; lukas.nalezenec@firma.seznam.cz; majakabiljo@fb.com
Subject: RE: Problem with Giraph (please help me)
Date: Mon, 13 Jan 2014 10:59:43 +0100




Hello again;Thanks very much for your answers Maja and Lukas.@Maja:I haven't chosen any special
OutEdges class . the default class is used ByteArrayEdges. @Lukas:I have used your observer
class. Nothing is logged when the number of workers is greater than 1.I suppose that the problem
is even before any superstep execution.So i tried to change the Vertex ids to IntWritable
but nothing changed the same exception appears in one of the workers logs (No exceptions in
other workers or master).I think that problem occurs when data is read from input data as
suggested by the exception 
2014-01-13 10:31:03,030 INFO org.apache.giraph.worker.BspServiceWorker: setup: Finally loaded
a total of (v=0, e=0)2014-01-13 10:31:03,095 INFO org.apache.giraph.comm.netty.handler.RequestDecoder:
decode: Server window metrics MBytes/sec sent = 0, MBytes/sec received = 0, MBytesSent = 0,
MBytesReceived = 0, ave sent req MBytes = 0, ave received req MBytes = 0, secs waited = 0.862014-01-13
10:31:03,103 WARN org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught:
Channel failed with remote address /---.--.--.--:37778java.io.EOFException	at org.jboss.netty.buffer.ChannelBufferInputStream.checkAvailable(ChannelBufferInputStream.java:231)
at org.jboss.netty.buffer.ChannelBufferInputStream.readInt(ChannelBufferInputStream.java:174)
at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:172)	at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:480)
at org.apache.giraph.utils.WritableUtils.readVertexFromDataInput(WritableUtils.java:511)	at
org.apache.giraph.partition.SimplePartition.readFields(SimplePartition.java:126)	at org.apache.giraph.comm.requests.SendVertexRequest.readFieldsRequest(SendVertexRequest.java:66)
at org.apache.giraph.comm.requests.WritableRequest.readFields(WritableRequest.java:120)	at
org.apache.giraph.comm.netty.handler.RequestDecoder.decode(RequestDecoder.java:92)	at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:72)
at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:69)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
As I explained in my email. I am doing the graph construction during the first steps and to
bootstrap the giraph job i supply a file with only 0;0 the class used to read the input splits
is really simple:import com.airbus.sviert.merge.MergeVertex;import com.google.common.collect.ImmutableList;
import org.apache.giraph.edge.Edge;import org.apache.giraph.io.formats.TextVertexInputFormat;import
org.apache.hadoop.io.IntWritable;import org.apache.hadoop.io.NullWritable;import org.apache.hadoop.io.Text;import
org.apache.hadoop.mapreduce.InputSplit;import org.apache.hadoop.mapreduce.TaskAttemptContext;
import java.io.IOException;

public class NodesKeyOnlyVertexInputFormat extends TextVertexInputFormat<IntWritable, NullWritable,
NullWritable> {	@Override	public TextVertexReader createVertexReader(			InputSplit split,
TaskAttemptContext context) throws IOException {		return new NodesKeyOnlyVertexReader();	}
	/**	 * Reader for this InputFormat.	 */	public class NodesKeyOnlyVertexReader extends	TextVertexReaderFromEachLineProcessed<String>
{		/** Cached vertex id */		private IntWritable id;
		@Override		protected String preprocessLine(Text line) throws IOException {			String[] tab=line.toString().split(";");
		id = MergeVertex.cvtPointToLong(Integer.parseInt(tab[0]),Integer.parseInt(tab[1]));			return
line.toString();		}
		@Override		protected IntWritable getId(String line) throws IOException {			return id;		}
		@Override		protected NullWritable getValue(String line) throws IOException {			return NullWritable.get();
	}
		@Override		protected Iterable<Edge<IntWritable, NullWritable>> getEdges(String
line)				throws IOException {			return ImmutableList.of();		}	}}
Do you think that it can be from this kind of manipulations?

Thanks in advance,Best Regards,
Chadi
From: majakabiljo@fb.com
To: user@giraph.apache.org; lukas.nalezenec@firma.seznam.cz
Subject: Re: Problem with Giraph (please help me)
Date: Thu, 9 Jan 2014 18:30:53 +0000






Hi Chadi, 



That does seem like a serialization issue. Which OutEdges class are you using, is it something
you implemented? 



Regards,
Maja





From: chadi jaber <chadijaber986@hotmail.com>

Reply-To: "user@giraph.apache.org" <user@giraph.apache.org>

Date: Thursday, January 9, 2014 2:08 AM

To: Lukas Nalezenec <lukas.nalezenec@firma.seznam.cz>, "user@giraph.apache.org" <user@giraph.apache.org>

Subject: RE: Problem with Giraph (please help me)







Hello Lukas
I have enclosed in my previous emails the exception. It seems to be a serialization issue
(This occurs only when workers > 1)




...
2013-12-31 16:27:33,494 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses:
Successfully added 4 connections, (4 total
 connected) 0 failed, 0 failures total.
2013-12-31 16:27:33,501 INFO org.apache.giraph.worker.BspServiceWorker: loadInputSplits: Using
1 thread(s), originally 1 threads(s)
 for 1 total splits.
2013-12-31 16:27:33,508 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxVerticesPerTransfer = 10000
2013-12-31 16:27:33,508 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxEdgesPerTransfer = 80000
2013-12-31 16:27:33,524 INFO org.apache.giraph.worker.InputSplitsCallable: call: Loaded 0
input splits in 0.020270009 secs, (v=0,
 e=0) 0.0 vertices/sec, 0.0 edges/sec
2013-12-31 16:27:33,527 INFO org.apache.giraph.comm.netty.NettyClient: waitAllRequests: Finished
all requests. MBytes/sec sent = 0,
 MBytes/sec received = 0, MBytesSent = 0, MBytesReceived = 0, ave sent req MBytes = 0, ave
received req MBytes = 0, secs waited = 0.656
2013-12-31 16:27:33,527 INFO org.apache.giraph.worker.BspServiceWorker: setup: Finally loaded
a total of (v=0, e=0)
2013-12-31 16:27:33,598 INFO org.apache.giraph.comm.netty.handler.RequestDecoder: decode:
Server window metrics MBytes/sec sent =
 0, MBytes/sec received = 0, MBytesSent = 0, MBytesReceived = 0, ave sent req MBytes = 0,
ave received req MBytes = 0, secs waited = 0.816
2013-12-31 16:27:33,605 WARN org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught:
Channel failed with remote
 address /
java.io.EOFException
at org.jboss.netty.buffer.ChannelBufferInputStream.checkAvailable(ChannelBufferInputStream.java:231)
at org.jboss.netty.buffer.ChannelBufferInputStream.readInt(ChannelBufferInputStream.java:174)
at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:172)
at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:480)
at org.apache.giraph.utils.WritableUtils.readVertexFromDataInput(WritableUtils.java:511)
at org.apache.giraph.partition.SimplePartition.readFields(SimplePartition.java:126)
at org.apache.giraph.comm.requests.SendVertexRequest.readFieldsRequest(SendVertexRequest.java:66)
at org.apache.giraph.comm.requests.WritableRequest.readFields(WritableRequest.java:120)
at org.apache.giraph.comm.netty.handler.RequestDecoder.decode(RequestDecoder.java:92)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:72)
at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:69)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)


the code for my vertex compute function :


public class MergeVertex extends
Vertex<LongWritable,DoubleWritable, DoubleWritable, NodeMessage> {


...


/***
 * Convert a Vertex Id from its LongWritable format
 to Point format (2 Element Array Format)
 * @param lng LongWritable Format of the VertexId
 * @return Alignment point Array
 */
public static int[] cvtLongToPoint(LongWritable
 lng){
int[] point={0,0};


point[0]=(int) (lng.get()/1000);
point[1]=(int) (lng.get()% 1000);


return point;
}


@Override
public void compute(Iterable<NodeMessage> messages)
 throws IOException {


int currentId[]= cvtLongToPoint(getId());


if (getSuperstep()==0) {


//NodeValue nv=new NodeValue();
setValue(new DoubleWritable(0d));
}




_signallength=getContext().getConfiguration().getInt("SignalLength",0);




if((getSuperstep() < _signallength && getId().get()!=0L)
 || (getSuperstep()== 0 && getId().get()==0L)){


LongWritable dstId=new LongWritable();


//Nodes which are on Graph "Spine" //Remaining
 Edges Construction
if(currentId[0]== currentId[1]){


//right Side
for (int i=currentId[1]+1;i<_signallength;i++){
dstId=cvtPointToLong(currentId[0]+1,i);
addVertexRequest(dstId,new DoubleWritable(Double.MAX_VALUE));
addEdgeRequest(getId(),EdgeFactory.create(dstId,
 new DoubleWritable(computeCost(getId(),dstId))));
}


//Left Side
for (int i=currentId[0]+2;i<_signallength;i++){
dstId=cvtPointToLong(i,currentId[1]+1);
addVertexRequest(dstId,new DoubleWritable(Double.MAX_VALUE));
addEdgeRequest(getId(),EdgeFactory.create(dstId,
 new DoubleWritable(computeCost(getId(),dstId))));
}


//Nodes which are not on Graph "Spine" //Remaining
 Edges Construction


}else{


//right Side
if(currentId[0]+1<_signallength){
for (int i=currentId[1]+1;i<_signallength;i++){
dstId=cvtPointToLong(currentId[0]+1,i);
addEdgeRequest(getId(),EdgeFactory.create(dstId,
 new DoubleWritable(computeCost(getId(),dstId))));
}
}


//Left Side
if(currentId[1]+1<_signallength){
for (int i=currentId[0]+2;i<_signallength;i++){
dstId=cvtPointToLong(i,currentId[1]+1);
addEdgeRequest(getId(),EdgeFactory.create(dstId,
 new DoubleWritable(computeCost(getId(),dstId))));
}
}


}


//No need to other vertex than source to be active
if(getId().get() != 0L){
voteToHalt();
}


}else if (getSuperstep() >= _signallength && getSuperstep()
 < 2*_signallength){


double minDist;
long minSource=0L;




if(getId().get() == 0L){
minDist=0;
}else{
minDist=Double.MAX_VALUE;
}


for(NodeMessage message : messages){
if(minDist > message.get()){
minDist=message.get();
minSource=message.getSourceID();
}
}




if (minDist < getValue().get()){
setValue(new DoubleWritable(minDist));



for (Edge<LongWritable, DoubleWritable> edge :
 getEdges()) {
double distance = minDist + edge.getValue().get();
sendMessage(edge.getTargetVertexId(),
new NodeMessage(distance,getId().get()));
}
}






//Only last Node is active
if(currentId[0] != _signallength-1 || currentId[1]
 != _signallength-1){
voteToHalt();
}






}else if(getSuperstep() >= 2*_signallength){
voteToHalt();
}


}
 
If you need more details please don't hesitate.


Thanks in advance,
Chadi




Date: Thu, 9 Jan 2014 10:49:54 +0100

From: lukas.nalezenec@firma.seznam.cz

To: chadijaber986@hotmail.com

Subject: Re: Problem with Giraph (please help me)



Hi, 

Find the mapper running on the remote address and check what happened. Maybe there will be
exception.

Lukas



On 9.1.2014 09:38, chadi jaber wrote:


exceptionCaught: Channel failed with remote address /---------







 		 	   		   		 	   		  
Mime
View raw message