giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: Multiple node types in Giraph and doing a selective M/R over one of them
Date Mon, 28 Jan 2013 19:49:27 GMT
One more general point is whether Giraph is the better tool for your
problem. From my understanding, MapReduce is probably the way to go.

On Monday, January 28, 2013, Eli Reisman wrote:

> I agree, something like this is possible using the vertex value. Giraph
> now has native support for multigraphs, but before we had that support,
> I described a kind of "cheat" for processing multigraphs. I think you
> could use a variation of that same cheat (it's on the site's Confluence
> wiki) to do what you're describing, even though the problem you described
> does not involve a multigraph. Essentially, you can get clever about what
> sort of Writable you use for the vertex value type, and/or what the
> values it holds can represent in your dataset.
>
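A minimal sketch of the vertex-value idea above, assuming the node type is simply the name of the table a row came from. `TypedRowValue` and its fields are hypothetical names, not part of Giraph. The class deliberately has no Hadoop dependency so it runs on its own; its `write` and `readFields` methods mirror the two methods of Hadoop's `Writable` interface, so adding `implements Writable` is the expected step before using something like it as a vertex value type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical vertex value that tags each vertex with the table it came from. */
final class TypedRowValue {
    private String tableName = "";
    private String payload = "";

    /** No-arg constructor, needed so an instance can be created before readFields. */
    TypedRowValue() {}

    TypedRowValue(String tableName, String payload) {
        this.tableName = tableName;
        this.payload = payload;
    }

    String getTableName() { return tableName; }

    String getPayload() { return payload; }

    /** Same shape as Writable.write: serialize every field in a fixed order. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(tableName);
        out.writeUTF(payload);
    }

    /** Same shape as Writable.readFields: read the fields back in the same order. */
    void readFields(DataInput in) throws IOException {
        tableName = in.readUTF();
        payload = in.readUTF();
    }

    /** Serializes and deserializes a value to check that both fields survive. */
    static TypedRowValue roundTrip(TypedRowValue value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.write(new DataOutputStream(bytes));
            TypedRowValue copy = new TypedRowValue();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A compute method could then branch on `getTableName()` to act only on vertices of one type.
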
> Alternatively, on the off chance that the row keys do not repeat across
> the tables, the row key itself can be a Writable vertex ID, as long as
> each one is unique. The only repetition would then be that rows with
> their own unique row keys contain values that mark out-edges to other
> row keys in the tables, possibly many times over, since any row key
> could have lots of other rows "pointing" an out-edge at it. Treating
> each row key as a unique vertex ID then just turns this into a vanilla
> graph. However, if the row keys are not unique across all your tables,
> this oversimplifies the problem and you really are stuck with the
> vertex value option above.
>
> My point: Giraph has vertex value, vertex ID, out-edge (to another
> vertex ID), and message data types. As long as each one meets the
> properties a graph requires of it, and each is a Writable, you can
> often model the problem in one of several ways that Giraph can support.
>
> One last thought: assuming the graph does not mutate during processing,
> you could also write a custom input format that evaluates each row as it
> builds it into a graph vertex data structure, and chooses only row keys
> that are of a certain classification in your use case to make into graph
> data for that job run, simply skipping the other rows as it reads them.
> Again, this "solution" depends on the nature of your problem. Just
> something to play with.
>
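The row-skipping idea above can be sketched without any Giraph classes. This is only an illustration of the filtering logic, not a real Giraph input format: `Row`, `SelectiveLoader`, and `loadVertices` are hypothetical names, and the rows are assumed to be already read into memory with a classification attached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one table row: key, classification, referenced row keys. */
final class Row {
    final String key;
    final String type;
    final List<String> referencedKeys;

    Row(String key, String type, String... referencedKeys) {
        this.key = key;
        this.type = type;
        this.referencedKeys = Arrays.asList(referencedKeys);
    }
}

final class SelectiveLoader {
    private SelectiveLoader() {}

    /**
     * Builds a map of vertex ID to out-edge targets, keeping only rows of the
     * wanted type. Rows of any other type are skipped as they are read.
     */
    static Map<String, List<String>> loadVertices(List<Row> rows, String wantedType) {
        Map<String, List<String>> vertices = new LinkedHashMap<>();
        for (Row row : rows) {
            if (!wantedType.equals(row.type)) {
                continue; // not part of this job run
            }
            vertices.put(row.key, new ArrayList<>(row.referencedKeys));
        }
        return vertices;
    }
}
```

Note that the kept vertices may still hold out-edges to row keys that were skipped, so a job built this way would have to tolerate or drop those edges.
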
> Good luck with your use case!
>
> On Mon, Jan 28, 2013 at 7:09 AM, Claudio Martella
> <claudio.martella@gmail.com> wrote:
>
>> Giraph does not support multipartite graphs in a natural way. But you can
>> try to model your different sets through the vertex value. You can then
>> propagate it (by composing it with the ID?) to the neighbors, and obtain
>> your join.
>>
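One possible reading of the suggestion above, simulated in plain Java rather than with Giraph's API: each vertex of a chosen type sends its type composed with its ID to its out-neighbors, and each neighbor ends up with the list of typed IDs that point at it. `TypedVertex`, `TypePropagation`, and the `type:id` message format are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical vertex: an ID, a type tag (held in the vertex value), and out-edges. */
final class TypedVertex {
    final String id;
    final String type;
    final List<String> outEdges;

    TypedVertex(String id, String type, List<String> outEdges) {
        this.id = id;
        this.type = type;
        this.outEdges = outEdges;
    }
}

final class TypePropagation {
    private TypePropagation() {}

    /**
     * Simulates one message-passing step: every vertex of sourceType sends
     * "type:id" to each of its out-neighbors. Returns target ID to messages received.
     */
    static Map<String, List<String>> propagate(List<TypedVertex> vertices, String sourceType) {
        Map<String, List<String>> inbox = new LinkedHashMap<>();
        for (TypedVertex vertex : vertices) {
            if (!sourceType.equals(vertex.type)) {
                continue; // vertices of other types stay silent in this step
            }
            String message = vertex.type + ":" + vertex.id;
            for (String target : vertex.outEdges) {
                inbox.computeIfAbsent(target, k -> new ArrayList<>()).add(message);
            }
        }
        return inbox;
    }
}
```

Each entry of the returned map pairs a target row key with the typed source keys that reference it, which is the raw material for the join.
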
>>
>> On Mon, Jan 28, 2013 at 2:52 PM, David Koch <ogdude@googlemail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> In Giraph, is it possible to have different node types in a graph and
>>> have a Map/Reduce iterate only over nodes of a given type and their
>>> direct successors?
>>>
>>> If that sounds a bit cryptic, here is some more about our use case:
>>> We have different HBase tables which we want to "pseudo-join" in
>>> Map/Reduce computations. The node types I mentioned above correspond to
>>> the respective row-key types used in each of those tables. Edges arise
>>> because the KeyValues in each table can contain row-key values found in
>>> one of the other tables.
>>>
>>> The graph would describe these relations. In a Map/Reduce I then want to
>>> be able to iterate over all nodes of a given type while also having access
>>> to a node's successor nodes in the same Mapper instance or better yet the
>>> same map() call. One would then carry out additional Gets to retrieve the
>>> data from the tables thus doing a fairly crude join.
>>>
>>> The graph is likely to change, so it would be nice if it could be updated
>>> incrementally.
>>>
>>> Does all this sound like something that would be possible with Giraph?
>>>
>>> Thank you,
>>>
>>> /David
>>>
>>>
>>>
>>>
>>
>>
>> --
>>    Claudio Martella
>>    claudio.martella@gmail.com
>>
>
>

-- 
   Claudio Martella
   claudio.martella@gmail.com
