accumulo-user mailing list archives

From Aaron Cordova <aa...@cordovas.org>
Subject Re: org.apache.accumulo.core.iterators.Combiner: key scope?
Date Mon, 19 Mar 2012 20:09:23 GMT
I suppose this would be a bad time to bring up the idea of returning more than one Pair...


The original semantics of reduce() from Lisp are to compact everything down into one object,
but the original MapReduce semantics allow reduce and map functions to emit() as many new
KV pairs as one desires. Bringing Accumulo's reduce() function closer to the usage of MapReduce's
reduce() might not impose a huge amount of cognitive load on users, especially if they
are coming from the MapReduce world.

However, I am strongly in favor of avoiding over-generalized and complicated APIs, and am
certainly willing to deal with the constraint of only returning one Pair if everyone feels
this will keep adoption and usage easy and simple.
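
To make the two reduce() semantics above concrete, here is a minimal sketch in plain Java. It does not use any Accumulo classes: the class and method names (EmitStyleReduceSketch, reduceToOne, reduceToMany) are hypothetical, and Map.Entry<String, Long> is only a stand-in for a Pair of Key and Value.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are part of Accumulo's API.
// Map.Entry<String, Long> stands in for Pair<Key, Value>.
class EmitStyleReduceSketch {

    // Lisp-style reduce: fold all values down into exactly one pair.
    static Map.Entry<String, Long> reduceToOne(String key, Iterator<Long> values) {
        long sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return new SimpleEntry<String, Long>(key, sum);
    }

    // MapReduce-style reduce: free to emit any number of pairs.
    // Here it emits two, a sum and a count, under derived keys.
    static List<Map.Entry<String, Long>> reduceToMany(String key, Iterator<Long> values) {
        long sum = 0;
        long count = 0;
        while (values.hasNext()) {
            sum += values.next();
            count++;
        }
        List<Map.Entry<String, Long>> out = new ArrayList<Map.Entry<String, Long>>();
        out.add(new SimpleEntry<String, Long>(key + ":sum", sum));
        out.add(new SimpleEntry<String, Long>(key + ":count", count));
        return out;
    }
}
```

The single-pair form keeps the caller's contract trivial (one input group, one output entry), while the multi-pair form pushes the question of which keys may be emitted, and in what order, onto whoever implements it.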


On Mar 19, 2012, at 4:02 PM, Keith Turner wrote:

> On Mon, Mar 19, 2012 at 3:50 PM, Billie J Rinaldi
> <billie.j.rinaldi@ugov.gov> wrote:
>> Another thing to consider is what to do with the differing column qualifiers.  Throw
>> them away, returning a blank column qualifier on the single Key returned?  What if we want
>> to combine column qualifiers and ignore Values instead?  Should we try to pass the qualifiers
>> into a reduce method with the Values?  That would be a more general approach, but I'm not
>> sure how to create an API that wouldn't be messy.
>> 
>> Billie
> 
> Billie
> 
> The following API might address the issues you raised
> 
> public abstract Pair<Key, Value> reduce(Iterator<Pair<Key,Value>> iter)
> 
> Keith
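
As a rough illustration of the signature proposed above, the sketch below implements a reducer with that shape over stand-in types. Everything here is an assumption made for the example: Key, Pair, and PairReduceSketch are hypothetical minimal classes rather than the Accumulo ones, Long stands in for Value, and blanking the qualifier is just one of the policies Billie lists.

```java
import java.util.Iterator;

// Hypothetical stand-ins; the real Accumulo Key, Value, and Pair classes are not used here.
class PairReduceSketch {

    static final class Key {
        final String row;
        final String family;
        final String qualifier;

        Key(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    // Same shape as the proposed method: the reducer sees every Key,
    // so it decides for itself what happens to differing qualifiers.
    static Pair<Key, Long> reduce(Iterator<Pair<Key, Long>> iter) {
        Key firstKey = null;
        long sum = 0;
        while (iter.hasNext()) {
            Pair<Key, Long> p = iter.next();
            if (firstKey == null) {
                firstKey = p.first;
            }
            sum += p.second;
        }
        if (firstKey == null) {
            return null; // empty input: nothing to combine
        }
        // Policy chosen for this sketch: keep row and family, blank the qualifier.
        Key combined = new Key(firstKey.row, firstKey.family, "");
        return new Pair<Key, Long>(combined, sum);
    }
}
```

A reducer that wanted to combine the qualifiers and ignore the values could be written against the same signature, since both halves of each pair reach the method.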

