giraph-user mailing list archives

From Kyle Orlando <kyle.r.orla...@gmail.com>
Subject Re: How to retrieve and display the values aggregated by the aggregators?
Date Wed, 24 Jul 2013 20:31:47 GMT

Hi Claudio,

So I checked out TextAggregatorWriter and was initially still a bit
confused about how to use it to write to a text file. Then I
noticed that org.apache.giraph.utils.ConfigurationUtils has
an option "aw", which corresponds to an AggregatorWriterClass. I
tried it out when running the SimplePageRankComputation program
on my own data by passing this option:

-aw org.apache.giraph.aggregators.TextAggregatorWriter

Here's the full command:

hadoop jar /home/hduser/Documents/combined.jar
org.apache.giraph.GiraphRunner
org.apache.giraph.examples.SimplePageRankComputation -eif
StackExchangeParsee.StackExchangeLongFloatTextEdgeInput -vif
StackExchangeParsee.StackExchangeLongDoubleTextVertexValueInput -eip
/in/gaming_edges.txt -vip /in/gaming_vertices.txt -of
org.apache.giraph.io.formats.IdWithValueTextOutputFormat -aw
org.apache.giraph.aggregators.TextAggregatorWriter -op /outPR -w 2 -mc
org.apache.giraph.examples.SimplePageRankComputation\$SimplePageRankMasterCompute

According to TextAggregatorWriter, it writes by default to a file
called "aggregatorValues". I checked my HDFS and did not see that
file. Then I noticed that there is a configuration property
"giraph.textAggregatorWriter.frequency", and that its default
value is "NEVER", which means that nothing is ever written to a file
for the aggregators. The other two frequencies are "AT_THE_END" and
"ALWAYS", which strangely both correspond to the same integer: -1.
Could someone explain why this is so?
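
To make the concern concrete, here is a minimal standalone sketch. It is
not Giraph's code: the class FrequencySketch and the method namesFor are
hypothetical, and the constant values are simply the ones I read above.
It shows that if two named constants share the same integer, a lookup by
value cannot tell them apart.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Giraph source.
class FrequencySketch {
    // Values as described in this message (assumed, not verified here).
    static final int NEVER = 0;
    static final int AT_THE_END = -1;
    static final int ALWAYS = -1;

    /** Returns the name of every constant whose value equals the argument. */
    static List<String> namesFor(int value) {
        List<String> names = new ArrayList<>();
        if (value == NEVER) {
            names.add("NEVER");
        }
        if (value == AT_THE_END) {
            names.add("AT_THE_END");
        }
        if (value == ALWAYS) {
            names.add("ALWAYS");
        }
        return names;
    }

    public static void main(String[] args) {
        // -1 matches two names, so the two settings are indistinguishable.
        System.out.println(namesFor(-1)); // prints [AT_THE_END, ALWAYS]
        System.out.println(namesFor(0));  // prints [NEVER]
    }
}
```

If the real constants do share a value, then setting the property to -1
cannot by itself select one behavior over the other.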

Setting that uncertainty aside, I surmised that the property
"giraph.textAggregatorWriter.frequency" should be added to my
giraph-site.xml. I wanted the "AT_THE_END" frequency, which
corresponds to the value -1. Here are the contents of my
giraph-site.xml file:

<configuration>
  <property>
    <name>giraph.textAggregatorWriter.frequency</name>
    <value>-1</value>
  </property>
</configuration>

I ran the SimplePageRankComputation program again (using the same long
"hadoop jar" command as above), and I still couldn't find
"aggregatorValues" on my HDFS.

Could someone help me out, or at the very least correct any
misconceptions I have and clear up these uncertainties?



On Wed, Jul 24, 2013 at 12:25 PM, Claudio Martella
<claudio.martella@gmail.com> wrote:
> Hi Kyle,
>
> you can check out the AggregatorWriter interface which allows you to do
> that. As a matter of fact there is already a class that implements what you
> need (org.apache.giraph.aggregators.TextAggregatorWriter).
>
> Hope it helps.
>
>
> On Wed, Jul 24, 2013 at 5:19 PM, Kyle Orlando <kyle.r.orlando@gmail.com>
> wrote:
>>
>> Hello,
>>
>> I am new to Giraph and was just wondering how one could retrieve and
>> display certain global values/statistics that the aggregators keep
>> track of.  What classes and methods would I use, and would this be
>> done in a class that extends VertexOutputFormat, or would it be done
>> elsewhere?
>>
>> As an example, in the provided SimplePageRankComputation in
>> org.apache.giraph.examples, there are three aggregators: sum, min, and
>> max.  I would like to display all of their final values (after the
>> final superstep) in some way, such as writing them to a text file.
>>
>> --
>> Kyle Orlando
>> Computer Engineering Major
>> University of Maryland
>
>
>
>
> --
>    Claudio Martella
>    claudio.martella@gmail.com



-- 
Kyle Orlando
Computer Engineering Major
University of Maryland
