hama-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Possible Aggregator Problem
Date Wed, 24 Apr 2013 00:41:31 GMT
I found the ticket on JIRA - https://issues.apache.org/jira/browse/HAMA-659

And it seems to be fixed already.

Which version of Hama are you using? And can you still find the bug in TRUNK[1]?

1. http://svn.apache.org/repos/asf/hama/trunk/graph/src/main/java/org/apache/hama/graph/AggregationRunner.java

On Tue, Apr 23, 2013 at 9:41 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
> Could anyone tell me whether I'm correct about the possible problem I
> posted about and followed up on in my previous two emails?
>
>
> On Wed, Apr 17, 2013 at 5:08 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>
>> Additionally, I found this in the mail archives:
>>
>> http://mail-archives.apache.org/mod_mbox/hama-user/201210.mbox/%3CCAJ-=ys=W8F5W4aduV+=+yfsvh41xSa22-wNqQRKapadZD+QBag@mail.gmail.com%3E
>> This covers my point exactly. Is calling two different aggregate functions
>> in a row still considered a bug?
>>
>>
>> On Wed, Apr 17, 2013 at 2:35 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>>
>>> Hi Thomas,
>>>
>>> Then I guess I did not explain myself clearly.
>>> What you describe is indeed how I expect the AverageAggregator to work,
>>> but when I use the AverageAggregator in my own PageRank implementation it
>>> does not return the average of all absolute differences but just the
>>> average of all vertex values.
>>>
>>> The (very) small example graph I use has only five vertices, where the sum
>>> of all vertex values is always 1.0.
>>> When I use the AverageAggregator, it always returns 0.2 when I call
>>> the getLastAggregatedValue method.
>>> It shouldn't do that, right?
>>>
>>>
>>> On Wed, Apr 17, 2013 at 1:18 PM, Thomas Jungblut <thomas.jungblut@gmail.com> wrote:
>>>
>>>> Hi Steven,
>>>>
>>>> The AverageAggregator is used to determine the average of all absolute
>>>> differences between the old and the new pagerank of every vertex.
>>>> This behavior is documented in the javadoc of the given classes, and it
>>>> suffices to track whether the pagerank values have converged.
>>>>
>>>> What you describe is a perfectly valid way to track the pagerank
>>>> difference throughout all supersteps. But this is not how (imho) the
>>>> AverageAggregator should behave, so you have to write your own.
>>>>
>>>>
>>>> 2013/4/17 Steven van Beelen <smcvbeelen@gmail.com>
>>>>
>>>> > The values in my case are the DoubleWritable values that each vertex has
>>>> > and that the aggregators aggregate on.
>>>> > My tests showed that, when the aggregator was set to AverageAggregator,
>>>> > the average of all the vertex values from the past compute step was
>>>> > returned.
>>>> > Actually, AverageAggregator should return the average difference of all
>>>> > the old-new value pairs of every vertex instead of the mean.
>>>> > The average difference is then used to check whether convergence is
>>>> > reached, which is relevant for all tasks, of course.
>>>> >
>>>> > Hence, the convergence point, for which the Aggregator is used, will
>>>> > not be reached.
>>>> > As a result, the algorithm will just run the maximum number of
>>>> > iterations set (30 iterations in the PageRank example) in every case.
>>>> > I experienced the same with my own PageRank implementation.
>>>> >
>>>> > I think it has something to do with the finalizeAggregation step taken.
>>>> > Next to that, both the 'aggregate(VERTEX vertex, M value)' and
>>>> > 'aggregate(VERTEX vertex, M oldValue, M newValue)' methods are called
>>>> > every time, where one would think only the second (with old/new values)
>>>> > would suffice.
>>>> > Because of this, the global variable 'absoluteDifference' in the
>>>> > 'AbsDiffAggregator' class is overwritten by the first aggregate.
>>>> > Additionally, when I made my own aggregation class in the same
>>>> > fashion as AbsDiffAggregator and AverageAggregator, but left out the
>>>> > 'aggregate(VERTEX vertex, M value)' method, my output turned out to be
>>>> > 0.0000 every time.
>>>> >
>>>> > I hope I made myself clear.
>>>> > Regards
>>>> >
>>>> >
>>>> > On Wed, Apr 17, 2013 at 11:57 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>>>> >
>>>> > > Thanks for your report.
>>>> > >
>>>> > > What do you mean by 'all the values'? Please give me more details
>>>> > > about your problem.
>>>> > >
>>>> > > I didn't look closely at the 'dangling links & aggregators' part of the
>>>> > > PageRank example, but I think there's no bug. Aggregators are just used
>>>> > > for global communication. For example, finding the max value[1] can be
>>>> > > done in only one iteration using MaxValueAggregator.
>>>> > >
>>>> > > 1. http://cdn.dejanseo.com.au/wp-content/uploads/2011/06/supersteps.png
>>>> > >
>>>> > > On Wed, Apr 17, 2013 at 6:27 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
>>>> > > > Hello,
>>>> > > >
>>>> > > > I'm creating my own PageRank in Hama for testing, and I think I
>>>> > > > found a problem with the AverageAggregator. I'm not sure if it is
>>>> > > > me or the AverageAggregator class in general, but I believe it just
>>>> > > > returns the mean of all the values instead of the average difference
>>>> > > > between the old and new values, as intended.
>>>> > > >
>>>> > > > For testing, I created my own AbsDiffAggregator and AverageAggregator
>>>> > > > classes, using FloatWritable instead of DoubleWritable. The same
>>>> > > > problem still occurred: I got a mean of all the values in the graph
>>>> > > > instead of an average difference.
>>>> > > >
>>>> > > > Could someone tell me if I'm doing something wrong or what I should
>>>> > > > provide to better explain my problem?
>>>> > > >
>>>> > > > Regards,
>>>> > > > Steven van Beelen, Vrije Universiteit of Amsterdam
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > > Best Regards, Edward J. Yoon
>>>> > > @eddieyoon
>>>> > >
>>>> >
>>>>
>>>
>>>
>>



--
Best Regards, Edward J. Yoon
@eddieyoon
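
For readers following the thread, the two quantities being confused can be sketched in plain Java. This is only an illustration of the arithmetic, not Hama's aggregator code: the class AggregatorSketch, its two methods, and the sample rank values are hypothetical and do not use the Hama API. It assumes a five-vertex graph whose values sum to 1.0, as in Steven's example.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; none of these names come from Hama.
final class AggregatorSketch {

    /** Plain mean of the current vertex values (what Steven reports seeing). */
    static double meanOfValues(double[] values) {
        return Arrays.stream(values).sum() / values.length;
    }

    /** Mean of |old - new| over all vertices (the convergence measure Thomas describes). */
    static double meanAbsDifference(double[] oldValues, double[] newValues) {
        if (oldValues.length != newValues.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < oldValues.length; i++) {
            sum += Math.abs(oldValues[i] - newValues[i]);
        }
        return sum / oldValues.length;
    }

    public static void main(String[] args) {
        // Assumed sample data: five vertices, each array sums to 1.0.
        double[] oldRanks = {0.20, 0.20, 0.20, 0.20, 0.20};
        double[] newRanks = {0.25, 0.15, 0.20, 0.30, 0.10};

        System.out.println("mean of values:       " + meanOfValues(newRanks));
        System.out.println("mean abs. difference: " + meanAbsDifference(oldRanks, newRanks));
    }
}
```

Whenever the five values sum to 1.0, the plain mean stays at about 0.2 regardless of how much the ranks moved, so it cannot signal convergence. The mean absolute difference shrinks toward 0 as the ranks stabilize.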
