datafu-dev mailing list archives

From Matthew Hayes <>
Subject Re: why is data.fu implementing HyperLogLog as an accumulator and not as algebraic?
Date Sat, 07 Mar 2015 21:09:57 GMT
I don't remember if there was a particular reason I didn't implement this as an AlgebraicEvalFunc.
It seems like it could be one. I believe the Java MapReduce version leverages the combiner. If
you want to try making this Algebraic, we would be happy to accept a patch :)
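
For what it's worth, the reason it should work as Algebraic is that HyperLogLog registers merge by element-wise max, which is associative and commutative, so partial sketches built in combiners can be folded together in any order. Here is a minimal standalone sketch of that idea. It is not DataFu's code and does not use the Pig API; the class name HllSketch, the precision P = 10, and the SplitMix64-style hash are all assumptions made for illustration.

```java
// Illustrative HyperLogLog-style sketch; NOT DataFu's implementation.
// HllSketch, P, and hash() are hypothetical names chosen for this example.
final class HllSketch {
    static final int P = 10;       // assumed precision: number of index bits
    static final int M = 1 << P;   // number of registers
    final byte[] registers = new byte[M];

    // SplitMix64 finalizer, used only as a stand-in 64-bit hash.
    static long hash(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    // "Initial" step: fold one value into the registers.
    void add(long value) {
        long h = hash(value);
        int idx = (int) (h >>> (64 - P));   // top P bits select a register
        long rest = h << P;                 // remaining bits, left-aligned
        int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - P) + 1;
        if (rank > registers[idx]) registers[idx] = (byte) rank;
    }

    // "Intermediate" step: element-wise max of two partial sketches.
    void merge(HllSketch other) {
        for (int i = 0; i < M; i++) {
            if (other.registers[i] > registers[i]) registers[i] = other.registers[i];
        }
    }

    // "Final" step: standard HyperLogLog estimate with small-range correction.
    double estimate() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1.0 + 1.079 / M);
        double raw = alpha * M * M / sum;
        if (raw <= 2.5 * M && zeros > 0) return M * Math.log((double) M / zeros);
        return raw;
    }

    // True when both sketches hold identical registers.
    boolean sameRegisters(HllSketch other) {
        for (int i = 0; i < M; i++) {
            if (registers[i] != other.registers[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        HllSketch full = new HllSketch();
        HllSketch partA = new HllSketch();   // plays the role of one combiner
        HllSketch partB = new HllSketch();   // plays the role of another combiner
        for (long v = 0; v < 10000; v++) {
            full.add(v);
            if (v % 2 == 0) partA.add(v); else partB.add(v);
        }
        partA.merge(partB);
        System.out.println("merged equals single pass: " + partA.sameRegisters(full));
        System.out.println("estimate: " + partA.estimate());
    }
}
```

The three methods line up roughly with the Initial, Intermed, and Final stages an Algebraic UDF would need: add() per input tuple, merge() in the combiner, and estimate() in the reducer. Serializing the registers between stages is left out here.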


> On Mar 7, 2015, at 12:11 PM, Ido Hadanny <> wrote:
> data.fu has a nice implementation of HyperLogLog for estimating cardinality
> here
> <>
> However, it's implemented as an Accumulator, which means it will run only in
> the reducer and not in the combiner (though, unlike a normal EvalFunc, it will
> never load the entire set into memory). Why couldn't data.fu implement it as
> Algebraic: fill the registers in every combiner, then merge and reduce
> the result? Am I missing something here?
> also available here:
> thanks!
> -- 
> Sent from my androido
