accumulo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Keith Turner <ke...@deenlo.com>
Subject Re: recursive rollup
Date Tue, 10 Apr 2012 22:23:21 GMT
oooppsss I was not paying close attention,  when it scans level 001 it
will insert the following

  000/A count = 6 (not 5)

On Tue, Apr 10, 2012 at 6:21 PM, Keith Turner <keith@deenlo.com> wrote:
> Take a look at the
> org.apache.accumulo.examples.simple.dirlist.FileCount example,  I
> think it does what you are asking.
>
> The way it works it that assume your tree would be stored in accumulo
> as follows.
>
> 000/A
> 001/A/B
> 001/A/C
> 002/A/B/D
> 002/A/B/E
> 002/A/C/F
> 002/A/C/G
>
> The number before the path is the depth, therefore all data for a
> particular depth of the tree is stored contiguously.  The example
> program scans each depth, starting with the highest depth, and push
> counts up.  This very efficient because there is not a lot of random
> access, it sequentially reads data for each depth and stream updates
> to the lower depth.
>
> For example when reading level 002, it would do the following two inserts :
>
>  001/A/B count=2
>  001/A/C count=2
>
> Then it would read level 001 and push the following insert :
>
>  000/A count = 5
>
> Keith
>
> On Tue, Apr 10, 2012 at 6:02 PM, Perko, Ralph J <Ralph.Perko@pnnl.gov> wrote:
>> Hi, I wish to do a recursive rollup-count and am wondering the best way to
>> do this.
>>
>> What I mean is this ­ in accumulo is a table with data that represents the
>> nodes and leafs in a tree.  Each node/leaf in accumulo knows it's parent
>> and children and wether it is a node or leaf.  I wish to have a
>> rollup-count for any given node to know the combined total of all
>> descendants.
>>
>> For example, given the tree:
>>
>>                          A
>>                          |
>>                       --------
>>                       |      |
>>                       B      C
>>                       |      |
>>                     -----  -----
>>                     |   |  |    |
>>                     D   E  F    G
>>
>> "A" would have 6 descendants.
>>
>> I can use the SummingCombiner iterator to get a child count, e.g "A" has 2
>> children, but I am not sure the best way to recurse down.  The data is
>> static so I do not necessarily need a dynamic, on-the-fly solution.
>>
>> Thanks for your help,
>> Ralph
>>

Mime
View raw message