giraph-dev mailing list archives

From Eli Reisman
Subject Re: GiraphFileInputFormat questions
Date Mon, 11 Feb 2013 18:49:51 GMT
It's great to hear that, because it's what I'm torn about too: how do you avoid
duplicating Hadoop code while keeping Giraph's framework ties loosely coupled as
I add YARN code? I'm really trying to avoid a munge flag, but at this point
I'm getting stuck because the YARN setup code won't compile with all of our
Hadoop profiles anyway. So now I'm just trying to minimize the number of
"munge points" in the Giraph code. This will make the glue much cleaner!

In the end, it sounds like I will be able to avoid duplicating the input
split code, since you already have it done there. But it sounds like I must
still duplicate from Hadoop the code that actually feeds the record readers
and commits output, since we no longer have Hadoop or a GraphMapper with In
and Out params to give it to us. That's still less than I thought I would
have to duplicate/deal with. Yay!
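
To make "feeds the record readers and commits output" concrete, here is a rough sketch of the loop I mean. It is not Giraph or Hadoop code: SimpleRecordReader and SimpleCommitter are hypothetical stand-ins that only mirror the method names of Hadoop's RecordReader and OutputCommitter (the real ones also take a TaskAttemptContext), and ManualTaskDriver is a made-up name for whatever would replace the mapper's run loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in for org.apache.hadoop.mapreduce.RecordReader.
interface SimpleRecordReader<K, V> {
  void initialize();
  boolean nextKeyValue();
  K getCurrentKey();
  V getCurrentValue();
  void close();
}

// Hypothetical stand-in for org.apache.hadoop.mapreduce.OutputCommitter.
interface SimpleCommitter {
  void setupTask();
  void commitTask();
  void abortTask();
}

// Toy reader over an in-memory list; the key is the record's index.
final class ListRecordReader implements SimpleRecordReader<Integer, String> {
  private final List<String> records;
  private int position = -1;

  ListRecordReader(List<String> records) {
    this.records = records;
  }

  public void initialize() {
    position = -1;
  }

  public boolean nextKeyValue() {
    if (position + 1 >= records.size()) {
      return false;
    }
    position++;
    return true;
  }

  public Integer getCurrentKey() {
    return position;
  }

  public String getCurrentValue() {
    return records.get(position);
  }

  public void close() {
  }
}

// Toy committer that just records which lifecycle calls were made.
final class RecordingCommitter implements SimpleCommitter {
  final List<String> calls = new ArrayList<>();

  public void setupTask() {
    calls.add("setupTask");
  }

  public void commitTask() {
    calls.add("commitTask");
  }

  public void abortTask() {
    calls.add("abortTask");
  }
}

// The part a framework-free task has to own: drain the reader, then commit or abort.
final class ManualTaskDriver {
  static <K, V> int run(SimpleRecordReader<K, V> reader,
                        SimpleCommitter committer,
                        BiConsumer<K, V> sink) {
    committer.setupTask();
    int count = 0;
    try {
      reader.initialize();
      while (reader.nextKeyValue()) {
        sink.accept(reader.getCurrentKey(), reader.getCurrentValue());
        count++;
      }
      committer.commitTask();
      return count;
    } catch (RuntimeException e) {
      // Any failure while reading or committing aborts the task's output.
      committer.abortTask();
      throw e;
    } finally {
      reader.close();
    }
  }

  public static void main(String[] args) {
    int n = run(new ListRecordReader(List.of("v1", "v2")),
                new RecordingCommitter(),
                (k, v) -> System.out.println(k + " -> " + v));
    System.out.println("read " + n + " records");
  }
}
```

The point is only that the reader lifecycle (initialize, iterate, close) and the commit/abort decision are what the task has to own once no mapper is handing them over; split generation stays out of this loop.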

On Fri, Feb 8, 2013 at 3:06 PM, Alessandro Presta wrote:

> Hi Eli,
> Yes, GiraphFileInputFormat deals with input splitting in all cases. Note
> that most of the logic is the same as in current Hadoop, and we extend
> Hadoop's FileInputFormat.
> I wish there were a way to avoid any code duplication, but this is messing
> with implementation-specific code that is mostly private.
> Alessandro
> On 2/8/13 2:58 PM, "Eli Reisman" wrote:
> >Hey (maybe @Alessandro, don't know...) I have been looking at the
> >GiraphFileInputFormat. Am I crazy, or with the advent of edge- or
> >vertex-based input files, do we now always generate our own input splits,
> >from scratch, without Hadoop being involved? And if so, is this defaulted
> >to "on" no matter what, or only when we have dual edge-vertex input
> >information to process? If so, it's one less thing I will have to implement
> >for the YARN implementation.
> >
> >Thanks, looking forward to hearing back,
> >
> >Eli
