From: Billie J Rinaldi
To: user@accumulo.apache.org
Date: Fri, 13 Apr 2012 02:25:12 +0000 (GMT+00:00)
Subject: Re: Using Accumulo To Calculate Seven Day Rolling Average
Message-ID: <1773492170.388232.1334283912372.JavaMail.root@linzimmb04o.imo.intelink.gov>

You could alternatively use a Combiner like the following to calculate the average (though I haven't tested this bit of code). You would configure this as a scan-time iterator (either a persistent scan iterator for the table, or attached to a particular Scanner) and would use the STRING encoding type of the LongCombiner. Not that it would necessarily be better to use a Combiner to average together 7 things, but I thought it would make a good example.

    import java.util.Iterator;

    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.iterators.LongCombiner;

    public class AveragingCombiner extends LongCombiner {
      @Override
      public Long typedReduce(Key key, Iterator<Long> iter) {
        long sum = 0;
        long count = 0;
        // Add up every version of this key; safeAdd guards against long overflow.
        while (iter.hasNext()) {
          sum = safeAdd(sum, iter.next());
          count++;
        }
        // Integer division: the average is truncated to a whole number.
        return sum / count;
      }
    }

Billie

----- Original Message -----
> From: "David Medinets"
> To: user@accumulo.apache.org
> Sent: Wednesday, April 11, 2012 10:59:46 PM
> Subject: Using Accumulo To Calculate Seven Day Rolling Average
> Thanks. Using this technique seems to work. I wrote a blog entry to
> document it:
>
> Using Accumulo To Calculate Seven Day Rolling Average
> http://affy.blogspot.com/2012/04/using-accumulo-to-calculate-seven-day.html
>
> On Wed, Apr 11, 2012 at 2:20 PM, Adam Fuchs wrote:
> > David,
> >
> > In case of continuing confusion, I think it's best if you ignore
> > Bill's suggestion for now and heed Josh's advice. Bill's suggestion
> > might be an optimization to look at later on, but your initial
> > approach seems sound.
> >
> > Adam
> >
> > On Tue, Apr 10, 2012 at 10:52 PM, David Medinets wrote:
> >>
> >> I thought there were issues associated with doing mutations inside
> >> iterators?
> >>
> >> On Tue, Apr 10, 2012 at 10:35 PM, William Slacum wrote:
> >> > I don't think you'd necessarily need an aggregator for that,
> >> > although it doesn't seem like that's what you're doing here in
> >> > the first place.
> >> > Wouldn't it be easier to set a summation iterator that also keeps
> >> > a count of observations to do some server side math and then
> >> > combine it all on the client? That way you can have a time series
> >> > and to get weekly averages you just change your scan range.
> >> >
> >> > On Apr 10, 2012, at 10:16 PM, David Medinets wrote:
> >> >
> >> >> I'm still thinking about how to use Accumulo to calculate weekly
> >> >> moving averages. I thought that using the maxVersions setting
> >> >> might work to maintain the last 7 values. Then a program could
> >> >> simply sum the values of a given row. So this is what I did:
> >> >>
> >> >> bin/accumulo shell -u root -p password
> >> >>> createtable rolling
> >> >> rolling> config -t rolling -s table.iterator.scan.vers.opt.maxVersions=7
> >> >> rolling> insert row cf cq 1
> >> >> rolling> insert row cf cq 2
> >> >> rolling> insert row cf cq 3
> >> >> rolling> insert row cf cq 4
> >> >> rolling> insert row cf cq 5
> >> >> rolling> insert row cf cq 6
> >> >> rolling> insert row cf cq 7
> >> >> rolling> insert row cf cq 8
> >> >> rolling> scan
> >> >> row cf:cq [] 8
> >> >> row cf:cq [] 7
> >> >> row cf:cq [] 6
> >> >> row cf:cq [] 5
> >> >> row cf:cq [] 4
> >> >> row cf:cq [] 3
> >> >> row cf:cq [] 2
> >> >>
> >> >> This is exactly what I wanted to see. So I wrote a simple
> >> >> scanner program to read the table. Then I did another scan:
> >> >>
> >> >> rolling> scan
> >> >> row cf:cq [] 8
> >> >>
> >> >> Where did the rest of the records go?
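
Footnote to the quoted thread: the idea of keeping a sum together with a count of observations, and combining the partial results on the client, might look roughly like the plain-Java sketch below. It deliberately uses no Accumulo classes. The SumCount class and its method names are hypothetical (they are not part of Accumulo), and Math.addExact is an assumed stand-in for overflow checking. How the partials would actually be encoded in a Value and returned by a server-side iterator is left open here.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical helper, not part of Accumulo: a partial (sum, count) pair. */
final class SumCount {
    final long sum;
    final long count;

    SumCount(long sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    /** Folds raw observations into one partial, as a server-side step might. */
    static SumCount of(Iterator<Long> values) {
        long sum = 0;
        long count = 0;
        while (values.hasNext()) {
            // addExact throws ArithmeticException instead of silently wrapping.
            sum = Math.addExact(sum, values.next());
            count++;
        }
        return new SumCount(sum, count);
    }

    /** Merges two partials; this is the step the client would perform. */
    SumCount merge(SumCount other) {
        return new SumCount(Math.addExact(sum, other.sum),
                            Math.addExact(count, other.count));
    }

    /** Truncating integer average; refuses to divide when nothing was observed. */
    long average() {
        if (count == 0) {
            throw new IllegalStateException("no observations to average");
        }
        return sum / count;
    }

    public static void main(String[] args) {
        // 8..2 are the seven versions left in the quoted shell session,
        // split into two partials to imitate two separate server results.
        SumCount first = SumCount.of(Arrays.asList(8L, 7L, 6L, 5L).iterator());
        SumCount second = SumCount.of(Arrays.asList(4L, 3L, 2L).iterator());
        System.out.println(first.merge(second).average()); // prints 5
    }
}
```

Because a (sum, count) pair can be merged in any order, the client can average over whatever scan range it likes, which is the property the quoted suggestion relies on. An already-divided average cannot be merged this way.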