giraph-user mailing list archives

From Avery Ching <ach...@apache.org>
Subject Re: Giraph input format restrictions
Date Sun, 19 Feb 2012 22:13:34 GMT
Sorry about the old documentation.  I just updated the shortest paths
example.  Before the major changes to graph distribution, vertex ids
were required to be sorted.  That is no longer the case: you can input
vertices in any order.  The only restriction is that vertex ids must
be unique (no duplicate vertices).  If there are duplicates, an exception
is thrown, since duplicates are probably unintended and most likely
indicate an error in the input.  This could be relaxed in the future if
need be, but we would need to decide how to handle duplicates.
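
Since duplicate vertex ids cause an exception at load time, it can help to
check an input file before submitting a job. The following is a minimal
sketch in Python, not part of Giraph. It assumes a whitespace-separated
text format in which the first token on each line is the vertex id; the
function name find_duplicate_ids is hypothetical.

```python
from collections import Counter


def find_duplicate_ids(lines):
    """Return vertex ids that start more than one non-blank line.

    Assumption: each non-blank line describes one vertex and its first
    whitespace-separated token is the vertex id.
    """
    counts = Counter(line.split()[0] for line in lines if line.strip())
    # Counter keeps first-seen order, so duplicates are reported in input order.
    return [vertex_id for vertex_id, n in counts.items() if n > 1]


if __name__ == "__main__":
    sample = ["a b c d", "b c a", "c d", "a e"]  # "a" appears twice
    print(find_duplicate_ids(sample))
```

An empty result means every vertex id is unique, in whatever order the
lines appear.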

Thanks for all the great questions!

Avery

On 2/19/12 11:25 AM, yavuz gokirmak wrote:
> Hi,
>
> In the Shortest Paths Example it is written that "Currently there is one
> restriction on the VertexInputFormat that is not obvious. The vertices
> must be sorted.". I didn't understand the reason for this restriction.
> Why should the vertices be ordered?
>
> Secondly, as I understand it, we have to transform our initial data into
> a form in which each line corresponds to a vertex (with its edges and
> values, if any) in the graph.
>
> For example, I have data in which each row corresponds to an edge
> between two vertices
> format1:
> a b
> a c
> a d
> b c
> b a
> c d
>
> Do I have to convert this file into a format similar to the one below in
> order to use it with Giraph algorithms?
> format2:
> a b c d
> b c a
> c d
>
> thanks..
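
The conversion from format1 to format2 asked about above can be sketched
as a small preprocessing step. This is an illustrative Python sketch, not
a Giraph API; the function name edges_to_adjacency is hypothetical.
Grouping edges by source vertex yields exactly one line per source vertex,
which also keeps vertex ids unique. Whether a vertex with no outgoing
edges (such as d here) needs its own line depends on the input format in
use, so the sketch leaves it out, as format2 does.

```python
def edges_to_adjacency(edge_lines):
    """Group "source target" edge lines into one line per source vertex.

    Assumption: each non-blank line holds exactly two whitespace-separated
    tokens. Vertices that only appear as targets get no line of their own.
    """
    adjacency = {}  # dicts preserve insertion order, so output order is stable
    for line in edge_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            raise ValueError(f"expected 'source target', got: {line!r}")
        source, target = parts
        adjacency.setdefault(source, []).append(target)
    return [" ".join([source] + targets) for source, targets in adjacency.items()]


if __name__ == "__main__":
    format1 = ["a b", "a c", "a d", "b c", "b a", "c d"]
    for out_line in edges_to_adjacency(format1):
        print(out_line)
```

Run on the six edges of format1, this prints the three lines of format2.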

