giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Martin Neumann <mneum...@spotify.com>
Subject Re: Different num supersteps
Date Thu, 27 Feb 2014 10:44:21 GMT
The data I have as input is not in a Graph-Format so I use an
EdgeInputFormat to create a Graph. Its also deterministic so the same Graph
should be build with the same input.

Each line in the input is a set of connected vertices.
I create edges in a way that they form a star around the vertex with the
lowest ID. Every vertex has an edge to that vertex and it has an edge to
all other vertices. See the code further down.

I will see if I can sort the output and then do a diff to see if they have
the same results. Is there a simple way to make sure that the same graph
was build? (The full dataset has around a billion vertices so I can't
visualize it)



private class ConCompInReader extends TextEdgeReader {
        private Text[] nodeArray = null;
        private int otherNodeId = -1;
        NullWritable edgeValue = NullWritable.get();
        // flag used to create edges in both directions
        private boolean flipped = true;
        private boolean done = true;

        @Override
        public final Text getCurrentSourceId() throws IOException,
                InterruptedException {
            if (flipped)
                return nodeArray[otherNodeId];
            return nodeArray[0];
        }

        @Override
        public final Edge<Text, NullWritable> getCurrentEdge()
                throws IOException, InterruptedException {
            if (flipped)
                return EdgeFactory.create(nodeArray[0], edgeValue);
            return EdgeFactory.create(nodeArray[otherNodeId], edgeValue);
        }

        @Override
        public final boolean nextEdge() throws IOException,
                InterruptedException {

            // flipp direction of the edge
            if(!flipped){
                flipped = true;
            }

            // line not yet done and already created the flipped version
            else if(!done){
                otherNodeId++;
                flipped = false;
                if (otherNodeId >= nodeArray.length){
                    done = true;
                }
            }

            // TODO this version ignores singels switch with version below
if needed
            // if done with this line read the next one if there is one
            // if current line is consumed it checks if there is a next one
            if (done) {
                boolean hasNext = getRecordReader().nextKeyValue();
                while (done && hasNext) {

                    String[] stringNodeArray =
getRecordReader().getCurrentValue().toString().split(
                            "\\s+");

                    // found line with more than one URL
                    if (stringNodeArray.length >= 2) {
                        otherNodeId = 1;
                        done = false;
                        flipped = false;
                        Arrays.sort(stringNodeArray);
                        nodeArray = new Text[stringNodeArray.length];
                        for (int i = 0; i < stringNodeArray.length; i++) {
                            nodeArray[i] = new Text(stringNodeArray[i]);
                        }
                    }
                    // if not ignore line and read a new one
                    else {
                        hasNext = getRecordReader().nextKeyValue();
                    }
                }
            }

            // TODO version that does not ignore singels
            /*if (done && getRecordReader().nextKeyValue()) {
                arrayNode = getRecordReader().getCurrentValue();
                String[] stringNodeArray =
arrayNode.toString().split("\\s+");

                otherNodeId = 1;
                done = false;
                flipped = false;
                Arrays.sort(stringNodeArray);
                nodeArray = new Text[stringNodeArray.length];
                for (int i = 0; i < stringNodeArray.length; i++) {
                    nodeArray[i] = new Text(stringNodeArray[i]);
                }

            }*/

            return !done;
        }
    }


On Thu, Feb 27, 2014 at 10:57 AM, Sebastian Schelter <ssc@apache.org> wrote:

> Hi Martin,
>
> You are right, this should not happen, your code looks correct. One way to
> check the output would be to simply count the number vertices per component
> and see if that number stays stable.
>
> Do you supply all vertices in your input data or are some vertices created
> during the computation? Maybe there's a bug/race condition with the vertex
> creation (just wildly guessing)
>
> Best,
> Sebastian
>
>
>
> On 02/27/2014 10:48 AM, Martin Neumann wrote:
>
>> Hej,
>>
>> I have modified the connected component example to fit my input data. I
>> expect it to be deterministic.
>>
>> But when I run it multiple times it takes a different number of Super
>> steps. This only happens on the complete dataset and not on my small test
>> dataset.
>>   (So I cannot check the output for correctness in a simple way)
>>
>> Has anyone an Idea how this could happen?
>>
>> cheers Martin
>>
>>
>>
>>
>> In case its useful here the computation class:
>>
>> @Override
>>      public void compute(Vertex<Text, Text, NullWritable> vertex,
>>              Iterable<Text> inmessage) throws IOException {
>>          boolean changed = false;
>>
>> // first superstep && setup
>>          if (getSuperstep() == 0) {
>>              //initialize value
>>              vertex.setValue(vertex.getId());
>>              //cheating by checking the neighbors ID's (cuts down 1
>> iteration)
>>              for (Edge<Text, NullWritable> e : vertex.getEdges()) {
>>                  Text candidate = e.getTargetVertexId();
>>                  if (candidate.compareTo(vertex.getValue()) < 0) {
>>                      changed = true;
>>                      vertex.setValue(candidate);
>>                  }
>>              }
>>          }
>>
>>          // other superstep
>>          else {
>>              // read all messages and compare with own state
>>              for (Text message : inmessage) {
>>                  if (message.compareTo(vertex.getValue()) < 0) {
>>                      changed = true;
>>                      vertex.setValue(message);
>>                  }
>>              }
>>          }
>>
>>          // if state has changed send a message to all neighbors
>>          if (changed && getSuperstep() < limiter) {
>>              sendMessageToAllEdges(vertex, vertex.getValue());
>>          }
>>
>>          vertex.voteToHalt();
>>      }
>>
>>
>

Mime
View raw message