giraph-user mailing list archives

From "Schweiger, Tom" <thschwei...@ebay.com>
Subject RE: How do I output only a subset of a graph?
Date Mon, 25 Aug 2014 18:57:18 GMT

I think you answered your own question: "Or am I supposed to write a VertexOutputFormat implementation
that generates no output for the vertices that have no data?" The answer is yes.

But don't be put off; it is actually a very simple class to override. Here is an example
for something like what you describe:


package com.ebay.foo.bar.giraph.io.formats;

import org.apache.giraph.graph.Vertex;
import org.apache.giraph.io.formats.TextVertexOutputFormat;
import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

import java.io.IOException;

public class ExampleOutputFormat extends
        TextVertexOutputFormat<Text, Text, BooleanWritable> {

    public class ExampleWriter extends TextVertexWriter {

        @Override
        public void writeVertex(
                Vertex<Text, Text, BooleanWritable> vertex)
                throws IOException, InterruptedException {
            // Write only vertices whose value is non-empty; skip the rest.
            if (!vertex.getValue().toString().isEmpty()) {
                getRecordWriter().write(vertex.getId(), vertex.getValue());
            }
        }
    }

    @Override
    public TextVertexWriter createVertexWriter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        return new ExampleWriter();
    }

}
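
If you want to try the skip-empty rule without a Giraph job, here is a minimal standalone sketch of the same filter. It is only an illustration: the class name FilterSketch, the method formatNonEmpty, and the sample vertex ids are hypothetical, plain Strings stand in for Giraph's Text values, and the tab between id and value is an assumption about how the record writer separates key and value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the writer above: plain Strings replace Giraph's Text types.
class FilterSketch {

    // Returns one "id<TAB>value" line per vertex whose value is non-empty.
    // The tab separator is an assumption, not something the Giraph class guarantees here.
    static List<String> formatNonEmpty(Map<String, String> vertexValues) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : vertexValues.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                lines.add(entry.getKey() + "\t" + entry.getValue());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample data: two vertices carry a value, two do not.
        Map<String, String> vertexValues = new LinkedHashMap<>();
        vertexValues.put("source", "");
        vertexValues.put("target1", "0.42");
        vertexValues.put("relay", "");
        vertexValues.put("target2", "0.17");
        for (String line : formatNonEmpty(vertexValues)) {
            System.out.println(line);
        }
    }
}
```

With the sample data, only target1 and target2 are printed; the two vertices with empty values produce no line at all, which is the behavior the writer's if-check is meant to give you.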



Thomas A J Schweiger
Sr. Software Architect
GDI-Inc Data Services-Seattle

Office: (425) 586-2669
email: thschweiger@ebay.com
________________________________
From: matthewcornell@gmail.com [matthewcornell@gmail.com] on behalf of Matthew Cornell [matt@matthewcornell.org]
Sent: Monday, August 25, 2014 11:38 AM
To: user
Subject: How do I output only a subset of a graph?

Hi Folks. I have a graph computation that starts with a subset of vertices of a certain type
and propagates information through the graph to a set of target vertices, which are also a subset
of the graph. I want to output only information from those particular vertices, but I don't
see a way to do this in the various VertexOutputFormat subclasses, which all seem oriented
to outputting something for every vertex in the graph. How do I do this? E.g., are there hooks
for the output phase where I can filter output? Or am I supposed to write a VertexOutputFormat
implementation that generates no output for the vertices that have no data? Thanks in advance.

--
Matthew Cornell | matt@matthewcornell.org | 413-626-3621
| 34 Dickinson Street, Amherst MA 01002 | matthewcornell.org
