Return-Path: X-Original-To: apmail-accumulo-user-archive@www.apache.org Delivered-To: apmail-accumulo-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 927549076 for ; Tue, 10 Apr 2012 22:22:21 +0000 (UTC) Received: (qmail 6081 invoked by uid 500); 10 Apr 2012 22:22:21 -0000 Delivered-To: apmail-accumulo-user-archive@accumulo.apache.org Received: (qmail 5983 invoked by uid 500); 10 Apr 2012 22:22:21 -0000 Mailing-List: contact user-help@accumulo.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@accumulo.apache.org Delivered-To: mailing list user@accumulo.apache.org Received: (qmail 5975 invoked by uid 99); 10 Apr 2012 22:22:21 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 10 Apr 2012 22:22:21 +0000 X-ASF-Spam-Status: No, hits=-0.0 required=5.0 tests=RCVD_IN_DNSWL_LOW,SPF_NEUTRAL X-Spam-Check-By: apache.org Received-SPF: neutral (nike.apache.org: local policy) Received: from [209.85.216.169] (HELO mail-qc0-f169.google.com) (209.85.216.169) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 10 Apr 2012 22:22:14 +0000 Received: by qcsd16 with SMTP id d16so226964qcs.0 for ; Tue, 10 Apr 2012 15:21:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding:x-gm-message-state; bh=FV+GJYNt9S97Qd4DegaG6MPNtnSRmKv4oGsH1O/1Hlo=; b=bJDCorAnsmwtthRGMfrNU+SW2L6Iuxhq2zjh/G8MD5kVwxAAoIwZl+TpnkisbRQgBF n1wDX2ZPmSr/NxMWLgUrCKnBlGPU6ro0evxAClmxjygszSTMMIbm4V+oT08YuasQ0H9b xrBC7Pbzx1ahv5+Xykk2hhJJ/HcnsYQ8T01FVn7QGhu2Fjxv3K8B/ZPRsrG7jSwRJfjV gP8mlQwasTOkbU0K6q/PSxuTG5SaOPnjrklji33jnf+Il20U+FsJaniapVlFDwbVwIC3 t69gMaF6SS+1nPY8NlwGoUN9+xDU8YFg4FyM9UGQ3gxfj4ENcC2Rz7GfqCv1LyDh6TWc ZZWw== MIME-Version: 1.0 Received: by 10.224.174.1 with SMTP id r1mr16998218qaz.20.1334096513635; Tue, 10 Apr 2012 15:21:53 -0700 (PDT) Received: by 10.229.161.208 with HTTP; Tue, 10 Apr 2012 15:21:53 -0700 (PDT) In-Reply-To: References: Date: Tue, 10 Apr 2012 18:21:53 -0400 Message-ID: Subject: Re: recursive rollup From: Keith Turner To: user@accumulo.apache.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Gm-Message-State: ALoCoQmzbFO6S5sUPBu2SCs5CLT5lMg1Qu3HEJmKq8g3wtjmL/3cd6cvTWhDSEAQHtrQDmU9oI1O Take a look at the org.apache.accumulo.examples.simple.dirlist.FileCount example, I think it does what you are asking. The way it works it that assume your tree would be stored in accumulo as follows. 000/A 001/A/B 001/A/C 002/A/B/D 002/A/B/E 002/A/C/F 002/A/C/G The number before the path is the depth, therefore all data for a particular depth of the tree is stored contiguously. The example program scans each depth, starting with the highest depth, and push counts up. This very efficient because there is not a lot of random access, it sequentially reads data for each depth and stream updates to the lower depth. For example when reading level 002, it would do the following two inserts : 001/A/B count=3D2 001/A/C count=3D2 Then it would read level 001 and push the following insert : 000/A count =3D 5 Keith On Tue, Apr 10, 2012 at 6:02 PM, Perko, Ralph J wrot= e: > Hi, I wish to do a recursive rollup-count and am wondering the best way t= o > do this. > > What I mean is this =AD in accumulo is a table with data that represents = the > nodes and leafs in a tree. =A0Each node/leaf in accumulo knows it's paren= t > and children and wether it is a node or leaf. =A0I wish to have a > rollup-count for any given node to know the combined total of all > descendants. > > For example, given the tree: > > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0A > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0| > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 -------- > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 | =A0 =A0 =A0| > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 B =A0 =A0 =A0C > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 | =A0 =A0 =A0| > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ----- =A0----- > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 | =A0 | =A0| =A0 =A0| > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 D =A0 E =A0F =A0 =A0G > > "A" would have 6 descendants. > > I can use the SummingCombiner iterator to get a child count, e.g "A" has = 2 > children, but I am not sure the best way to recurse down. =A0The data is > static so I do not necessarily need a dynamic, on-the-fly solution. > > Thanks for your help, > Ralph >