giraph-dev mailing list archives

From "Maja Kabiljo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-266) Average aggregators don't calculate real average
Date Tue, 24 Jul 2012 11:16:33 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421326#comment-13421326 ]

Maja Kabiljo commented on GIRAPH-266:
-------------------------------------

The current aggregator design doesn't really support functions like average; it only supports
binary operations. Here is what we can do:
1. Remove all average aggregators. Users can still get an average by using two sum aggregators,
one for the sum of the values and one for their count.
2. Create new classes, like DoubleAverageWritable, with a getAverage() method. DoubleAverageAggregator
would then extend Aggregator<DoubleAverageWritable>. Users would aggregate
DoubleAverageWritable(value, 1) and call getAverage() on the aggregated value to get the average.
3. If we want to support this and potentially other complicated aggregators, we could
also add a method to combine two aggregators and make aggregators Writable. The master would
then receive whole aggregator objects from the workers and combine them to get the final result.
In that case the input of an aggregator (the values we aggregate) and its output (the aggregated
value) could also be of different types. This may, however, make aggregators more complicated
than anyone will need.
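
To make option 2 concrete, here is a minimal sketch of a sum-and-count value whose combine step is a plain binary operation. The class name AverageValue and its methods are hypothetical stand-ins, not existing Giraph classes; a real DoubleAverageWritable would also need to implement Hadoop's Writable so it can be sent from workers to the master.

```java
// Hypothetical stand-in for the proposed DoubleAverageWritable.
// Serialization (Hadoop's Writable) is deliberately left out of this sketch.
final class AverageValue {
    private final double sum;
    private final long count;

    AverageValue(double sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    /** Wraps a single input value, i.e. the (value, 1) pair from option 2. */
    static AverageValue of(double value) {
        return new AverageValue(value, 1);
    }

    /** Binary, associative combine step: sums and counts are added separately. */
    AverageValue combine(AverageValue other) {
        return new AverageValue(sum + other.sum, count + other.count);
    }

    /** The division happens only once, on the fully aggregated value. */
    double getAverage() {
        // Returning 0.0 for an empty aggregate is an arbitrary choice for this sketch.
        return count == 0 ? 0.0 : sum / count;
    }
}
```

Because only sums and counts travel between workers and the master, the result does not depend on how the values were split across workers.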

What do you think?
                
> Average aggregators don't calculate real average
> ------------------------------------------------
>
>                 Key: GIRAPH-266
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-266
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> Average aggregators calculate an average on each worker, and the workers' averages are then
> sent to the master, where we actually calculate the average of averages.
> For example, if one worker aggregates the value 2, another the values 4 and 6, and a third
> the values 8, 8 and 8, we'll get a result of 5 instead of 6.
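
The arithmetic in the example from the issue description can be checked with a short sketch. The class and method names below are hypothetical and only reproduce the two computations being compared; they are not Giraph code.

```java
import java.util.List;

// Hypothetical demo class: compares the average of per-worker averages
// with the average over all values.
final class AverageOfAveragesDemo {
    /** Plain mean of one worker's values. */
    static double mean(List<Double> values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    /** What the issue describes: each worker averages, then the master averages those. */
    static double averageOfAverages(List<List<Double>> workers) {
        double sum = 0;
        for (List<Double> worker : workers) {
            sum += mean(worker);
        }
        return sum / workers.size();
    }

    /** The real average: total sum divided by total count. */
    static double trueAverage(List<List<Double>> workers) {
        double sum = 0;
        long count = 0;
        for (List<Double> worker : workers) {
            for (double v : worker) {
                sum += v;
                count++;
            }
        }
        return sum / count;
    }
}
```

With the workers holding [2], [4, 6] and [8, 8, 8], averageOfAverages gives (2 + 5 + 8) / 3 = 5, while trueAverage gives 36 / 6 = 6.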

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira