Date: Tue, 24 Jul 2012 11:16:33 +0000 (UTC)
From: "Maja Kabiljo (JIRA)"
To: giraph-dev@incubator.apache.org
Reply-To: dev@giraph.apache.org
Message-ID: <243141479.95180.1343128594603.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <472686660.95173.1343128474583.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (GIRAPH-266) Average aggregators don't calculate real average

    [ https://issues.apache.org/jira/browse/GIRAPH-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421326#comment-13421326 ]

Maja Kabiljo commented on GIRAPH-266:
-------------------------------------

The current idea of aggregators doesn't really support functions
like average; it supports binary operations. What we can do:

1. Remove all average aggregators. Users can still get an average by using two sum aggregators.

2. Create new classes, like DoubleAverageWritable, with a getAverage() method. Then DoubleAverageAggregator extends Aggregator, and the user calls getAverage() on the aggregated value to get the average. The user then has to aggregate DoubleAverageWritable(value, 1).

3. If we want to support this and potentially some other complicated aggregators, we could also add a method to combine two aggregators, and make aggregators Writable. The master would get whole aggregator objects from the workers and combine them to get the final result. In that case the input of an aggregator (the values we aggregate) and its output (the aggregated value) could also be of different types. This may make aggregators more complicated than anyone will need, though.

What do you think?

> Average aggregators don't calculate real average
> ------------------------------------------------
>
>                 Key: GIRAPH-266
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-266
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> Average aggregators calculate the average on the workers, and then the workers' averages are sent to the master, where we actually calculate the average of averages.
> For example, if one worker aggregates the value 2, another the values 4 and 6, and a third the values 8, 8 and 8, we get a result of 5 instead of 6.
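
To make the example from the issue description concrete, here is a minimal standalone sketch in plain Java. It is not Giraph code: the class names AverageAggregationSketch and SumCount are hypothetical, and no Writable or Aggregator types are used. It only illustrates why averaging the workers' averages gives 5, and how carrying a (sum, count) pair, roughly in the spirit of option 2, gives 6 using nothing but a binary combine step.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names exist in Giraph. */
final class AverageAggregationSketch {

    /** Partial aggregate that carries the running sum and the number of values. */
    static final class SumCount {
        final double sum;
        final long count;

        SumCount(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        /** Binary, associative combine: add sums and counts component-wise. */
        SumCount combine(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }

        /** Average of everything combined so far; 0.0 if nothing was aggregated. */
        double getAverage() {
            return count == 0 ? 0.0 : sum / count;
        }
    }

    /** Behaviour described in the issue: workers average locally, master averages those. */
    static double averageOfAverages(List<double[]> workers) {
        double total = 0.0;
        for (double[] values : workers) {
            total += Arrays.stream(values).average().orElse(0.0);
        }
        return workers.isEmpty() ? 0.0 : total / workers.size();
    }

    /** Workers send (sum, count); the master combines the pairs and divides once. */
    static double trueAverage(List<double[]> workers) {
        SumCount global = new SumCount(0.0, 0);
        for (double[] values : workers) {
            SumCount local = new SumCount(0.0, 0);
            for (double v : values) {
                local = local.combine(new SumCount(v, 1)); // aggregate (value, 1)
            }
            global = global.combine(local);
        }
        return global.getAverage();
    }

    public static void main(String[] args) {
        // The three workers from the issue description.
        List<double[]> workers = Arrays.asList(
            new double[] {2},
            new double[] {4, 6},
            new double[] {8, 8, 8});
        System.out.println(averageOfAverages(workers)); // prints 5.0
        System.out.println(trueAverage(workers));       // prints 6.0
    }
}
```

Because combine() is associative, it does not matter how the values are split across workers, which is what keeps this within the binary-operation model of the current aggregators.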