incubator-giraph-user mailing list archives

From Benjamin Heitmann <benjamin.heitm...@deri.org>
Subject Re: Question about TextInputFormat pattern for parsing e.g. RDF
Date Mon, 12 Mar 2012 20:24:54 GMT

On 12 Mar 2012, at 20:04, Avery Ching wrote:

> My suggestion would be the following:
>
> Run a MR job to join all your RDFs on the vertex key, and you can convert them
> to an easy format to parse with a custom VertexInputFormat of your choice.
> If these are one-way relationships, you need not create the target vertex.
> If they are undirected relationships, when you are processing your RDFs in
> the MR job, add a directed relationship to both vertices.


Avery, thanks for the feedback. 

I had not thought about using Map-Reduce in that way, but I guess that's a very good idea.
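
To check that I understand the suggestion, here is a rough in-memory sketch of the join step in Python. It is not Giraph or Hadoop code: the function names, the sample triples and the tab-separated output layout are all my own assumptions, and a real job would do the grouping in the reduce phase.

```python
from collections import defaultdict

def join_triples_by_vertex(triples, undirected=False):
    """Group (subject, predicate, object) triples into one edge list per vertex.

    Hypothetical helper: stands in for the MR join on the vertex key.
    """
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        # Directed edge from the subject to the object.
        adjacency[subject].append((obj, predicate))
        if undirected:
            # For undirected relationships, add the reverse directed edge too.
            adjacency[obj].append((subject, predicate))
    return dict(adjacency)

def to_lines(adjacency):
    """Emit one line per vertex: id, then tab-separated target|predicate pairs.

    The layout is an assumed example of an easy-to-parse format.
    """
    lines = []
    for vertex in sorted(adjacency):
        edges = [f"{target}|{predicate}" for target, predicate in adjacency[vertex]]
        lines.append("\t".join([vertex] + edges))
    return lines

# Made-up sample data, for illustration only.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "likes", "carol"),
    ("bob", "knows", "carol"),
]

directed = join_triples_by_vertex(triples)
both_ways = join_triples_by_vertex(triples, undirected=True)
```

With one-way relationships, "carol" never gets a record of her own, since she only appears as a target. With undirected=True she gets one, holding the two reverse edges.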


However, besides the amount of pre-processing required for using Giraph/Hadoop, the transient
nature of the Giraph graph is also an issue. The scenario I am thinking of is that for each run
of my algorithm, just 1% or less of the data changes.
So 99% stays the same every time, yet it needs to be loaded again for each run.
That won't be a problem if the computation of the algorithm itself takes a lot longer than
loading the graph data.
However, that might not always be the case.
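
To put rough numbers on this, a back-of-the-envelope sketch in Python. The timings are invented, and the second function assumes that load time scales linearly with the amount of data, which may well not hold in practice.

```python
def load_fraction(load_seconds, compute_seconds):
    """Share of the total run time that goes into loading the graph."""
    return load_seconds / (load_seconds + compute_seconds)

def redundant_load_seconds(load_seconds, unchanged_share):
    """Load time spent on data that did not change since the last run.

    Assumes load time is proportional to data size (an assumption, not a measurement).
    """
    return load_seconds * unchanged_share

# Invented example timings, in seconds.
long_compute = load_fraction(load_seconds=60, compute_seconds=540)
short_compute = load_fraction(load_seconds=60, compute_seconds=60)
wasted = redundant_load_seconds(load_seconds=60, unchanged_share=0.99)
```

In the first case loading is a tenth of the run and hardly matters. In the second it is half of the run, and almost all of that half is spent re-reading unchanged data.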

So right now I am trying to get a feeling for that trade-off, and for the different alternatives
for solving the main research problem ;)


Thanks again for the reply, cheers, Benjamin. 