accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: Using Accumulo To Calculate Seven Day Rolling Average
Date Sat, 19 May 2012 01:20:58 GMT
I'm replying a little late but Combiners replace the original values.
Therefore, I don't think they can be used to calculate the kind of
rolling averages I am calculating. There are other kinds of moving
averages that don't depend on historical data but frankly I don't
remember their names.
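
One average of the kind mentioned above is the exponential moving average:
each update needs only the previous average and the new observation, so no
window of past values has to be kept. What follows is a minimal sketch in
plain Java, not Accumulo code. The class name ExponentialAverage and the
smoothing factor alpha are illustrative assumptions and do not come from
this thread.

```java
// Sketch only: ExponentialAverage and alpha are illustrative names, not from this thread.
final class ExponentialAverage {
    private final double alpha; // smoothing factor in (0, 1]; larger values weight recent data more
    private double average;
    private boolean seeded;

    ExponentialAverage(double alpha) {
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Folds one observation into the average and returns the updated value. */
    double update(double value) {
        if (!seeded) {
            // The first observation seeds the average.
            average = value;
            seeded = true;
        } else {
            average = alpha * value + (1.0 - alpha) * average;
        }
        return average;
    }

    public static void main(String[] args) {
        // alpha = 2 / (N + 1) is a common convention for an N-period average; N = 7 here.
        ExponentialAverage ema = new ExponentialAverage(2.0 / (7 + 1));
        for (long v = 1; v <= 8; v++) {
            System.out.println(v + " -> " + ema.update(v));
        }
    }
}
```

Unlike a true seven-day window, this average never fully forgets old
observations; their weight only decays, so the two are not interchangeable.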

On Thu, Apr 12, 2012 at 10:25 PM, Billie J Rinaldi
<billie.j.rinaldi@ugov.gov> wrote:
> You could alternatively use a Combiner like the following to calculate the
> average (though I haven't tested this bit of code).  You would configure this
> as a scan-time iterator (either a persistent scan iterator for the table, or
> attached to a particular Scanner) and would use the STRING encoding type of
> the LongCombiner.  Not that it would be necessarily better to use a Combiner
> to average together 7 things, but I thought it would make a good example.
>
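> import java.util.Iterator;
>
> import org.apache.accumulo.core.data.Key;
> import org.apache.accumulo.core.iterators.LongCombiner;
>
> public class AveragingCombiner extends LongCombiner {
>  @Override
>  public Long typedReduce(Key key, Iterator<Long> iter) {
>    long sum = 0;
>    long count = 0;
>    // Sum every value seen for this key; safeAdd guards against long overflow.
>    while (iter.hasNext()) {
>      sum = safeAdd(sum, iter.next());
>      count++;
>    }
>    // Integer division: the average is truncated toward zero.
>    return sum / count;
>  }
> }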
>
> Billie
>
>
> ----- Original Message -----
>> From: "David Medinets" <david.medinets@gmail.com>
>> To: user@accumulo.apache.org
>> Sent: Wednesday, April 11, 2012 10:59:46 PM
>> Subject: Using Accumulo To Calculate Seven Day Rolling Average
>> Thanks. Using this technique seems to work. I wrote a blog entry to
>> document it:
>>
>> Using Accumulo To Calculate Seven Day Rolling Average
>> http://affy.blogspot.com/2012/04/using-accumulo-to-calculate-seven-day.html
>>
>> On Wed, Apr 11, 2012 at 2:20 PM, Adam Fuchs <adam.p.fuchs@ugov.gov>
>> wrote:
>> > David,
>> >
>> > In case of continuing confusion, I think it's best if you ignore
>> > Bill's suggestion for now and heed Josh's advice. Bill's suggestion
>> > might be an optimization to look at later on, but your initial
>> > approach seems sound.
>> >
>> > Adam
>> >
>> >
>> >
>> > On Tue, Apr 10, 2012 at 10:52 PM, David Medinets
>> > <david.medinets@gmail.com>
>> > wrote:
>> >>
>> >> I thought there were issues associated with doing mutations inside
>> >> iterators?
>> >>
>> >> On Tue, Apr 10, 2012 at 10:35 PM, William Slacum
>> >> <wslacum@gmail.com>
>> >> wrote:
>> >> > I don't think you'd necessarily need an aggregator for that,
>> >> > although it doesn't seem like that's what you're doing here in the
>> >> > first place. Wouldn't it be easier to set a summation iterator that
>> >> > also keeps a count of observations to do some server side math and
>> >> > then combine it all on the client? That way you can have a time
>> >> > series and to get weekly averages you just change your scan range.
>> >> > On Apr 10, 2012, at 10:16 PM, David Medinets wrote:
>> >> >
>> >> >> I'm still thinking about how to use accumulo to calculate weekly
>> >> >> moving averages. I thought that using the maxVersions settings
>> >> >> might work to maintain the last 7 values. Then a program could
>> >> >> simply sum the values of a given row. So this is what I did:
>> >> >>
>> >> >> bin/accumulo shell -u root -p password
>> >> >> > createtable rolling
>> >> >> rolling> config -t rolling -s table.iterator.scan.vers.opt.maxVersions=7
>> >> >> rolling> insert row cf cq 1
>> >> >> rolling> insert row cf cq 2
>> >> >> rolling> insert row cf cq 3
>> >> >> rolling> insert row cf cq 4
>> >> >> rolling> insert row cf cq 5
>> >> >> rolling> insert row cf cq 6
>> >> >> rolling> insert row cf cq 7
>> >> >> rolling> insert row cf cq 8
>> >> >> rolling> scan
>> >> >> row cf:cq [] 8
>> >> >> row cf:cq [] 7
>> >> >> row cf:cq [] 6
>> >> >> row cf:cq [] 5
>> >> >> row cf:cq [] 4
>> >> >> row cf:cq [] 3
>> >> >> row cf:cq [] 2
>> >> >>
>> >> >> This is exactly what I wanted to see. So I wrote a simple scanner
>> >> >> program to read the table. Then I did another scan:
>> >> >>
>> >> >> rolling> scan
>> >> >> row cf:cq [] 8
>> >> >>
>> >> >> Where did the rest of the records go?
>> >> >
>> >
>> >
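
The suggestion quoted in this thread, to keep a sum together with a count of
observations and combine everything on the client, can be sketched in plain
Java as follows. This is an illustration under assumptions, not an Accumulo
API: the class Partial and its methods are hypothetical names, and how the
per-day sums and counts are read from a scan is left out.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: Partial and its methods are hypothetical names, not an Accumulo API.
final class Partial {
    final long sum;
    final long count;

    Partial(long sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    /** Merges two partial results; Math.addExact throws on overflow instead of wrapping. */
    Partial merge(Partial other) {
        return new Partial(Math.addExact(sum, other.sum), Math.addExact(count, other.count));
    }

    /** Mean of everything merged so far; 0.0 for an empty partial (a choice made for this sketch). */
    double average() {
        return count == 0 ? 0.0 : (double) sum / count;
    }

    /** Combines partials, for example the ones returned by a single seven-day scan range. */
    static Partial combine(List<Partial> partials) {
        Partial total = new Partial(0, 0);
        for (Partial p : partials) {
            total = total.merge(p);
        }
        return total;
    }

    public static void main(String[] args) {
        // The seven values shown by the first scan in this thread: 2 through 8.
        List<Partial> week = Arrays.asList(
            new Partial(2, 1), new Partial(3, 1), new Partial(4, 1), new Partial(5, 1),
            new Partial(6, 1), new Partial(7, 1), new Partial(8, 1));
        System.out.println(combine(week).average()); // prints 5.0
    }
}
```

Because sums and counts merge in any order, a different averaging window only
requires passing a different set of partials to combine.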
