giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jakob Homan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-192) Move aggregators to a seperate sub-package
Date Mon, 18 Jun 2012 22:08:42 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396304#comment-13396304
] 

Jakob Homan commented on GIRAPH-192:
------------------------------------

Hey Jan.  Had a look at the patch.  A few questions:
* What's the use case for the overwrite aggregator? It's nondeterministic in operation (particularly
in a distributed environment), so I'm having trouble seeing why it's needed?
* Rather than separate Int/Long and Float/Double aggregators, perhaps we can just have long
and float, respectively, and rely on the greater size/precision? Each aggregator will take
twice as much memory, but there should be relatively few of them so this may not be so bad.
Thoughts?
* It seems like it should be possible to parameterize these classes on pairs of both the Java
type and corresponding Writable version (ie <Long, LongWritable>).  Not great, but might
lead to less code.  Is this viable?
* The new aggregators will require unit tests and we may as well take the opportunity to add
unit tests for the existing ones (if not, at least add a newbie JIRA for adding them).
* What happens when we overflow on, for instance, product aggregator with an int?  This should
be tested as well.
                
> Move aggregators to a seperate sub-package
> ------------------------------------------
>
>                 Key: GIRAPH-192
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-192
>             Project: Giraph
>          Issue Type: Improvement
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Jan van der Lugt
>            Assignee: Jan van der Lugt
>            Priority: Minor
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-192.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Since aggregators will be re-used throughout many projects and algorithms, it makes sense
to implement the most common ones in a separate sub-package. This will reduce the time required
for users when they implement their projects based on Giraph, because the required aggregators
are already in place. I implemented the following ones:
> for int/long/float/double: min, max, product, sum, overwrite
> for boolean: and, or, overwrite
> Most of them speak for themselves, except for the overwrite one. This aggregator simply
overwrites the stored value when a new value is aggregated. This is useful when one node is
in some way a master node (for example a source node in an routing algorithm), and this node
wants to broadcast a value to all other nodes.
> Attached is a patch against trunk implementing the aggregators and patching some existing
files so they use the .aggregators package instead of the .examples one.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message