giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: How to retrieve and display the values aggregated by the aggregators?
Date Wed, 31 Jul 2013 16:46:41 GMT
Hi Kyle,

good catch. ALWAYS should be set to 1. Want to write a patch to fix this?
Try setting the property on the command line by putting
-D giraph.textAggregatorWriter.frequency=-1 right after the GiraphRunner
class name.
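
To see why the two constants need distinct values, here is a minimal, self-contained sketch. It is not the TextAggregatorWriter source: the class name, the shouldWrite method, and the value 0 for NEVER are assumptions made for illustration. Only AT_THE_END = -1 and the proposed ALWAYS = 1 come from this thread.

```java
/** Illustrative only: not Giraph's TextAggregatorWriter source. */
final class FrequencySketch {
    // Values discussed in this thread: AT_THE_END is -1, ALWAYS should be 1.
    static final int AT_THE_END = -1;
    static final int ALWAYS = 1;
    // Assumption: NEVER is the default; the value 0 is a guess for this sketch.
    static final int NEVER = 0;

    /** Hypothetical rule: write every `frequency` supersteps, or once at the end. */
    static boolean shouldWrite(int frequency, long superstep, boolean lastSuperstep) {
        if (frequency == NEVER) {
            return false;
        }
        if (frequency == AT_THE_END) {
            return lastSuperstep;
        }
        // Positive frequencies write on every matching superstep.
        return frequency > 0 && superstep % frequency == 0;
    }

    public static void main(String[] args) {
        for (long s = 0; s < 3; s++) {
            System.out.println("superstep " + s
                + " ALWAYS=" + shouldWrite(ALWAYS, s, false)
                + " AT_THE_END=" + shouldWrite(AT_THE_END, s, false));
        }
        System.out.println("final AT_THE_END=" + shouldWrite(AT_THE_END, 3, true));
    }
}
```

If ALWAYS shared the value -1 with AT_THE_END, a rule like this could not tell the two settings apart, which is the collision Kyle spotted.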

Hope this helps.

Best,
Claudio


On Wed, Jul 24, 2013 at 10:31 PM, Kyle Orlando <kyle.r.orlando@gmail.com> wrote:

> Hi Claudio,
>
> So I checked out TextAggregatorWriter and was, initially, still a bit
> confused on how to use it to write to a text file. That's when I
> noticed that, in org.apache.giraph.utils.ConfigurationUtils, there is
> an option "aw", which corresponds to an AggregatorWriterClass.  I
> tried this out when running the SimplePageRankComputation program
> using my data as input by specifying this as an option:
>
> -aw org.apache.giraph.aggregators.TextAggregatorWriter.
>
> Here's the full command:
>
> hadoop jar /home/hduser/Documents/combined.jar
> org.apache.giraph.GiraphRunner
> org.apache.giraph.examples.SimplePageRankComputation -eif
> StackExchangeParsee.StackExchangeLongFloatTextEdgeInput -vif
> StackExchangeParsee.StackExchangeLongDoubleTextVertexValueInput -eip
> /in/gaming_edges.txt -vip /in/gaming_vertices.txt -of
> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -aw
> org.apache.giraph.aggregators.TextAggregatorWriter -op /outPR -w 2 -mc
> org.apache.giraph.examples.SimplePageRankComputation\$SimplePageRankMasterCompute
>
> According to TextAggregatorWriter, it, by default, writes to a file
> called "aggregatorValues". I checked my HDFS, and did not see that
> particular file.  That's when I noticed that there is a configuration
> "giraph.textAggregatorWriter.frequency", and that by default the
> frequency is set to "NEVER", which means that nothing is ever
> created/written to a file for the aggregators.  The other two
> frequencies are "AT_THE_END" and "ALWAYS", which strangely both
> correspond to the same integer: -1. Could someone explain why this is
> so?
>
> Ignoring the above uncertainty, I surmised that the property
> "giraph.textAggregatorWriter.frequency" was to be added to my
> giraph-site.xml. I wanted the "AT_THE_END" frequency, which
> corresponds to the value of -1. Here's the contents of my
> giraph-site.xml file:
>
> <configuration>
>   <property>
>     <name>giraph.textAggregatorWriter.frequency</name>
>     <value>-1</value>
>   </property>
> </configuration>
>
> I ran the SimplePageRankComputation program again (using the verbose
> "hadoop jar" command above), and still, I couldn't find
> "aggregatorValues" on my HDFS.
>
> Could someone help me out, or at the very least rectify any
> misconceptions and uncertainties that I have?
>
>
>
> On Wed, Jul 24, 2013 at 12:25 PM, Claudio Martella
> <claudio.martella@gmail.com> wrote:
> > Hi Kyle,
> >
> > you can check out the AggregatorWriter interface which allows you to do
> > that. As a matter of fact there is already a class that implements what
> > you need (org.apache.giraph.aggregators.TextAggregatorWriter).
> >
> > Hope it helps.
> >
> >
> > On Wed, Jul 24, 2013 at 5:19 PM, Kyle Orlando <kyle.r.orlando@gmail.com>
> > wrote:
> >>
> >> Hello,
> >>
> >> I am new to Giraph and was just wondering how one could retrieve and
> >> display certain global values/statistics that the aggregators keep
> >> track of.  What classes and methods would I use, and would this be
> >> done in a class that extends VertexOutputFormat, or would it be done
> >> elsewhere?
> >>
> >> As an example, in the provided SimplePageRankComputation in
> >> org.apache.giraph.examples, there are three aggregators: sum, min, and
> >> max.  I would like to display all of their final values (after the
> >> final superstep) in some way, such as writing them to a text file.
> >>
> >> --
> >> Kyle Orlando
> >> Computer Engineering Major
> >> University of Maryland
> >
> >
> >
> >
> > --
> >    Claudio Martella
> >    claudio.martella@gmail.com
>
>
>
> --
> Kyle Orlando
> Computer Engineering Major
> University of Maryland
>



-- 
   Claudio Martella
   claudio.martella@gmail.com
