mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Is OnlineSummarizer mergeable?
Date Wed, 07 Aug 2013 21:48:20 GMT
Otis,

What statistics do you need?

What guarantees?



On Wed, Aug 7, 2013 at 1:26 PM, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:

> Hi Ted,
>
> I'm actually trying to find an alternative to QDigest (the stream-lib impl
> specifically) because even though it seems good, we have to deal with crazy
> volumes of data in SPM (performance monitoring service, see signature)...
> I'm hoping we can find something that has both a lower memory footprint
> than QDigest AND that is mergeable a la QDigest.  Utopia?
>
> Thanks,
> Otis
> ----
> Performance Monitoring for Solr / ElasticSearch / Hadoop / HBase -
> http://sematext.com/spm
>
>
>
>
> >________________________________
> > From: Ted Dunning <ted.dunning@gmail.com>
> >To: "user@mahout.apache.org" <user@mahout.apache.org>
> >Sent: Wednesday, August 7, 2013 4:51 PM
> >Subject: Re: Is OnlineSummarizer mergeable?
> >
> >
> >It isn't as mergeable as I would like.  If you have randomized record
> >selection, it should be possible, but perverse ordering can cause serious
> >errors.
> >
> >It would be better to use something like a Q-digest.
> >
> >http://www.cs.virginia.edu/~son/cs851/papers/ucsb.sensys04.pdf
> >
> >
> >
> >
> >On Wed, Aug 7, 2013 at 4:21 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Is OnlineSummarizer algo "mergeable"?
> >>
> >> Say that we compute a percentile for some metric for time 12:00-12:01
> >> and store that somewhere, then we compute it for 12:01-12:02 and store
> >> that separately, and so on.
> >>
> >> Can we then later merge these computed and previously stored
> >> percentile "instances" and get an accurate value?
> >>
> >> Thanks,
> >> Otis
> >> --
> >> Performance Monitoring -- http://sematext.com/spm
> >> Solr & ElasticSearch Support -- http://sematext.com/
> >>
> >
> >
> >
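
To make concrete what "mergeable" means in the question above, here is a minimal sketch of a summary that can be stored per minute and combined later. It is a plain fixed-width bucket histogram, not a Q-digest and not Mahout's OnlineSummarizer; the class name `BucketHistogram`, its methods, the `width` parameter, and the latency values are all hypothetical and chosen only for illustration. Its memory use grows with the number of distinct buckets, so it makes no claim to a smaller footprint than Q-digest.

```python
from collections import Counter


class BucketHistogram:
    """Hypothetical mergeable summary: counts of values per fixed-width bucket."""

    def __init__(self, width=1.0):
        self.width = width
        self.counts = Counter()  # bucket index -> number of values seen
        self.total = 0

    def add(self, value):
        # Floor division maps each value to the bucket that contains it.
        self.counts[int(value // self.width)] += 1
        self.total += 1

    def merge(self, other):
        # Merging is just adding counts, so the order of the input does not matter.
        if other.width != self.width:
            raise ValueError("bucket widths must match")
        merged = BucketHistogram(self.width)
        merged.counts = self.counts + other.counts
        merged.total = self.total + other.total
        return merged

    def quantile(self, q):
        # Returns the midpoint of the bucket holding the q-th quantile,
        # so the answer is only accurate to within about half a bucket width.
        if self.total == 0:
            raise ValueError("empty histogram")
        rank = q * self.total
        seen = 0
        for bucket in sorted(self.counts):
            seen += self.counts[bucket]
            if seen >= rank:
                return (bucket + 0.5) * self.width
        return (max(self.counts) + 0.5) * self.width


# One summary per minute, stored separately (made-up latencies in ms).
minute_1 = BucketHistogram(width=10.0)
for latency_ms in [12, 15, 18, 22, 250]:
    minute_1.add(latency_ms)

minute_2 = BucketHistogram(width=10.0)
for latency_ms in [11, 13, 14, 16, 19]:
    minute_2.add(latency_ms)

# Later: merge the stored summaries and query the combined percentile.
combined = minute_1.merge(minute_2)
median = combined.quantile(0.5)
p95 = combined.quantile(0.95)
```

Because the merged counts are identical to those of a histogram built from all the raw values at once, the merged percentile matches the single-pass one exactly. The price is the fixed bucket resolution and a memory cost tied to the value range, which is the trade-off sketches such as Q-digest try to manage more carefully.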
