giraph-dev mailing list archives

From "Jakob Homan (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-192) Move aggregators to a separate sub-package
Date Mon, 18 Jun 2012 22:08:42 GMT


Jakob Homan commented on GIRAPH-192:

Hey Jan.  Had a look at the patch.  A few questions:
* What's the use case for the overwrite aggregator? It's nondeterministic in operation (particularly
in a distributed environment), so I'm having trouble seeing why it's needed.
* Rather than separate Int/Long and Float/Double aggregators, perhaps we can just have long
and float, respectively, and rely on the greater size/precision? Each aggregator will take
twice as much memory, but there should be relatively few of them so this may not be so bad.
* It seems like it should be possible to parameterize these classes on pairs of both the Java
type and the corresponding Writable version (i.e., <Long, LongWritable>).  Not great, but might
lead to less code.  Is this viable?
* The new aggregators will require unit tests and we may as well take the opportunity to add
unit tests for the existing ones (if not, at least add a newbie JIRA for adding them).
* What happens when we overflow in, for instance, the product aggregator with an int?  This should
be tested as well.
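
To make the overflow question concrete, here is a minimal plain-Java sketch of what could happen to an int product. It is not code from the GIRAPH-192 patch: the class and method names are hypothetical, it does not use Giraph's aggregator or Writable types, and Math.multiplyExact assumes Java 8 or later. It compares silent wraparound, a checked multiply, and accumulating in a long.

```java
// Hypothetical illustration only; not taken from the GIRAPH-192 patch.
final class ProductOverflowSketch {

    /** Plain int multiplication: wraps around silently on overflow. */
    static int wrappingProduct(int[] values) {
        int product = 1;
        for (int v : values) {
            product *= v;
        }
        return product;
    }

    /** Checked multiplication: throws ArithmeticException on overflow (Java 8+). */
    static int checkedProduct(int[] values) {
        int product = 1;
        for (int v : values) {
            product = Math.multiplyExact(product, v);
        }
        return product;
    }

    /** The same inputs accumulated in a long, which has room for this result. */
    static long wideProduct(int[] values) {
        long product = 1L;
        for (int v : values) {
            product *= v;
        }
        return product;
    }

    public static void main(String[] args) {
        int[] values = {65536, 65536}; // 2^16 * 2^16 = 2^32, one past the int range

        System.out.println(wrappingProduct(values)); // prints 0
        System.out.println(wideProduct(values));     // prints 4294967296

        try {
            checkedProduct(values);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

A unit test for an int product aggregator could pin down whichever of these behaviours is intended, so that a wrapped result is a documented choice rather than a surprise.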
> Move aggregators to a separate sub-package
> ------------------------------------------
>                 Key: GIRAPH-192
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Jan van der Lugt
>            Assignee: Jan van der Lugt
>            Priority: Minor
>             Fix For: 0.2.0
>         Attachments: GIRAPH-192.patch
>   Original Estimate: 2h
>  Remaining Estimate: 2h
> Since aggregators will be re-used throughout many projects and algorithms, it makes sense
> to implement the most common ones in a separate sub-package. This will reduce the time required
> for users when they implement their projects based on Giraph, because the required aggregators
> are already in place. I implemented the following ones:
> for int/long/float/double: min, max, product, sum, overwrite
> for boolean: and, or, overwrite
> Most of them speak for themselves, except for the overwrite one. This aggregator simply
> overwrites the stored value when a new value is aggregated. This is useful when one node is
> in some way a master node (for example a source node in a routing algorithm), and this node
> wants to broadcast a value to all other nodes.
> Attached is a patch against trunk implementing the aggregators and patching some existing
> files so they use the .aggregators package instead of the .examples one.
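
The overwrite semantics described above, and the nondeterminism concern raised in the comment, can be sketched in plain Java. This is a simplified stand-in under stated assumptions, not the patch's implementation: the class and method names are hypothetical, and a list order stands in for the order in which values happen to arrive in a distributed run.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not taken from the GIRAPH-192 patch.
final class OverwriteSketch {

    /** Overwrite-style aggregation: each new value replaces the stored one. */
    static long overwriteAll(List<Long> values) {
        long stored = 0L;
        for (long v : values) {
            stored = v; // last writer wins
        }
        return stored;
    }

    /** Max aggregation for comparison: the result does not depend on order. */
    static long maxAll(List<Long> values) {
        long stored = Long.MIN_VALUE;
        for (long v : values) {
            stored = Math.max(stored, v);
        }
        return stored;
    }

    public static void main(String[] args) {
        List<Long> oneOrder = Arrays.asList(1L, 2L, 3L);
        List<Long> otherOrder = Arrays.asList(3L, 2L, 1L);

        // Same values, different arrival order, different overwrite result.
        System.out.println(overwriteAll(oneOrder));   // prints 3
        System.out.println(overwriteAll(otherOrder)); // prints 1

        // Max gives the same answer either way.
        System.out.println(maxAll(oneOrder));   // prints 3
        System.out.println(maxAll(otherOrder)); // prints 3

        // With a single writer (the "master node" case) there is only one value,
        // so the overwrite result is unambiguous.
        System.out.println(overwriteAll(Arrays.asList(42L))); // prints 42
    }
}
```

In this sketch the overwrite result is well defined only when a single node contributes a value, which matches the broadcast use case in the description; with several writers the outcome depends on arrival order.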
