giraph-dev mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: [jira] [Commented] (GIRAPH-524) Giraph can receive input from vertex or edge-centric data sets; its output is graph data, not "vertices"
Date Tue, 19 Feb 2013 17:15:31 GMT
I think, as Nitay pointed out, that the Vertex prefix on OutputFormat
describes the content of a *record* (so it holds for both
SequenceFile and Text), just as it does in the
InputFormats.
As useless as it may look now, if the output of a Giraph job is fed
into an M/R job, this specification suddenly becomes very important.
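
To make the record-level point concrete, here is a minimal sketch in plain Java. It is not the Giraph API: ToyVertex, vertexRecords and edgeRecords are hypothetical names, and the tab-separated layout is only an assumption for illustration. It shows how the same graph gives a downstream job a different record shape depending on whether the output is written per vertex or per edge.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not the Giraph API.
class RecordShapeSketch {

    /** A toy vertex: id, value, and weighted out-edges (target id -> weight). */
    static final class ToyVertex {
        final String id;
        final double value;
        final Map<String, Double> outEdges = new LinkedHashMap<>();

        ToyVertex(String id, double value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Vertex-centric output: one tab-separated record per vertex. */
    static List<String> vertexRecords(List<ToyVertex> graph) {
        List<String> lines = new ArrayList<>();
        for (ToyVertex v : graph) {
            lines.add(v.id + "\t" + v.value);
        }
        return lines;
    }

    /** Edge-centric output: one tab-separated record per edge. */
    static List<String> edgeRecords(List<ToyVertex> graph) {
        List<String> lines = new ArrayList<>();
        for (ToyVertex v : graph) {
            for (Map.Entry<String, Double> e : v.outEdges.entrySet()) {
                lines.add(v.id + "\t" + e.getKey() + "\t" + e.getValue());
            }
        }
        return lines;
    }

    /** A three-vertex sample graph used only for demonstration. */
    static List<ToyVertex> sampleGraph() {
        ToyVertex a = new ToyVertex("a", 1.0);
        a.outEdges.put("b", 0.5);
        a.outEdges.put("c", 2.0);
        ToyVertex b = new ToyVertex("b", 3.0);
        b.outEdges.put("c", 1.5);
        ToyVertex c = new ToyVertex("c", 0.0);
        return List.of(a, b, c);
    }

    public static void main(String[] args) {
        List<ToyVertex> graph = sampleGraph();
        vertexRecords(graph).forEach(System.out::println);
        edgeRecords(graph).forEach(System.out::println);
    }
}
```

A consumer that parses two fields per line would break on the edge records, which is why a name that states the record content helps whoever reads the files next.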


On Tue, Feb 19, 2013 at 5:55 PM, Eli Reisman <apache.mailbox@gmail.com> wrote:

> I was thinking of something more "output format" centric and less vertex
> centric in the name. Someone could output edge weights or vertex values; in
> the end it's still a list of data points to the end user (or to whoever
> uses the data in the next stage of a MapReduce job), so why
> not call them things like "VertexValueSequenceFileOutputFormat" (with a
> base of SequenceFileOutputFormat) or "EdgeWeightTextOutputFormat" (with a
> base of TextOutputFormat)?
>
> Anyway, this is one of many ideas. I was just looking at the IO formats the
> other day and thought this was worth discussing. I'm very open to ideas.
>
> The main gist of my concern: the "Vertex" IO format names were originally
> meant to help those new to BSP get into the spirit of "think like a vertex",
> but for us that now really only applies to writing applications. The input
> formats have reassigned the Vertex and Edge labels to mean something else,
> yet the "think like a vertex" names remain on the output side. That's my
> main concern. Any improvement is acceptable to me!
>
>
>
> On Sun, Feb 17, 2013 at 2:17 PM, Nitay Joffe (JIRA) <jira@apache.org> wrote:
>
> >
> >     [ https://issues.apache.org/jira/browse/GIRAPH-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580301#comment-13580301 ]
> >
> > Nitay Joffe commented on GIRAPH-524:
> > ------------------------------------
> >
> > I agree we should support other outputs, for example some edge output.
> > The existing VertexOutputFormat does have a vertex-centric thing to it,
> > so I'm not sure we should rename it to GraphXX. VertexOutputFormat
> > has a createVertexWriter which iterates over each vertex and allows you to
> > do whatever you want. We should also have an EdgeOutputFormat that
> > iterates over edges and allows output that way. Anything else we should
> > support?
> >
> > > Giraph can receive input from vertex or edge-centric data sets; its output is graph data, not "vertices"
> > > --------------------------------------------------------------------------------------------------------
> > >
> > >                 Key: GIRAPH-524
> > >                 URL: https://issues.apache.org/jira/browse/GIRAPH-524
> > >             Project: Giraph
> > >          Issue Type: Bug
> > >          Components: graph
> > >            Reporter: Eli Reisman
> > >            Priority: Minor
> > >             Fix For: 0.2.0
> > >
> > >
> > > It is silly to have any of our output format names tied to the "vertex"
> > > when in fact we are just outputting graph data. The output format names
> > > should reflect the formatting of the output, and perhaps which elements
> > > of the graph data you want in the output.
> > > Let's change those names? Then they get shorter too, as a bonus.
> >
> > --
> > This message is automatically generated by JIRA.
> > If you think it was sent incorrectly, please contact your JIRA
> > administrators
> > For more information on JIRA, see: http://www.atlassian.com/software/jira
> >
>



-- 
   Claudio Martella
   claudio.martella@gmail.com
