mahout-dev mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Quartiles computation with M/R or Pig (combine function states)
Date Fri, 20 Apr 2012 21:12:25 GMT
Thank you, sir. Let me consider this.

On Fri, Apr 20, 2012 at 11:50 AM, Hector Yee <hector.yee@gmail.com> wrote:
> How about this:
>
> http://en.wikipedia.org/wiki/Reservoir_sampling
>
> On Fri, Apr 20, 2012 at 10:44 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>
>> Hello,
>>
>> Is there a way to compute quartiles in a map/reduce fashion
>> (i.e. with an API similar to Pig's Algebraic custom function) without
>> keeping an enormous count hash?
>> There's this CountSketch thing that I implemented before on
>> map/reduce, but it is sort of like a Bloom filter: when it gives a wrong
>> result, the error is fairly large (in the case of a Bloom filter, 100%), and
>> to get good results it still requires quite a bit of memory.
>>
>
>
>
> --
> Yee Yang Li Hector <https://plus.google.com/106746796711269457249>
> Professional Profile <http://www.linkedin.com/in/yeehector>
> http://hectorgon.blogspot.com/ (tech + travel)
> http://hectorgon.com (book reviews)
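
For readers of the archive, the reservoir sampling suggestion above could be sketched roughly as follows. This is only an illustration, not code from the thread or from Mahout or Pig: the class name `Reservoir`, the reservoir size `k`, and the `merge` rule are assumptions. Each mapper keeps a fixed-size uniform sample plus a count of items seen, and the combine step merges two such states by drawing from each side in proportion to how many items it represents. The quartiles read off the sample are estimates, exact only when the whole input fits in the reservoir.

```python
import random


class Reservoir:
    """Fixed-size uniform sample plus the count of items seen so far.

    Hypothetical helper for illustration; not part of Mahout or Pig.
    """

    def __init__(self, k, rng=None):
        self.k = k                # maximum number of items kept
        self.n = 0                # number of items this state represents
        self.sample = []
        self.rng = rng or random.Random()

    def add(self, x):
        # Classic reservoir sampling (Algorithm R): keep the first k items,
        # then let item number n overwrite a random slot with probability k/n.
        self.n += 1
        if len(self.sample) < self.k:
            self.sample.append(x)
        else:
            j = self.rng.randrange(self.n)
            if j < self.k:
                self.sample[j] = x

    def merge(self, other):
        # Combine step (assumed rule): fill each output slot from one side
        # with probability proportional to the items that side still
        # represents, drawing without replacement from its sample.
        assert self.k == other.k, "sketch assumes equal reservoir sizes"
        a, b = list(self.sample), list(other.sample)
        self.rng.shuffle(a)
        self.rng.shuffle(b)
        na, nb = self.n, other.n
        merged = []
        while len(merged) < self.k and na + nb > 0:
            if self.rng.randrange(na + nb) < na:
                merged.append(a.pop())
                na -= 1
            else:
                merged.append(b.pop())
                nb -= 1
        self.sample = merged
        self.n += other.n
        return self

    def quartiles(self):
        # Crude rank-based estimate taken from the sorted sample.
        s = sorted(self.sample)
        if not s:
            raise ValueError("empty reservoir")
        return tuple(s[min(len(s) - 1, int(q * len(s)))]
                     for q in (0.25, 0.5, 0.75))


# Usage: two "mappers" each summarize their split, a "combiner" merges them.
left, right = Reservoir(1000), Reservoir(1000)
for v in range(0, 50000):
    left.add(v)
for v in range(50000, 100000):
    right.add(v)
q1, median, q3 = left.merge(right).quartiles()
```

Memory per state is bounded by `k` regardless of input size, so the state can be passed between combine stages. The accuracy of the estimated quartiles depends on `k` and is not analyzed here.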
