directory-dev mailing list archives

From Emmanuel Lécharny <>
Subject Re: HBase partition integration in trunks ?
Date Tue, 16 Aug 2011 07:53:48 GMT
On 8/15/11 5:59 PM, Stefan Seelmann wrote:
> Now I have to update the parts that are a bit special, let me explain:
> In HBase partition I didn't use one-level and sub-level indices, but
> use the RDN index table instead. I also extended the search engine in
> that way that one-level and sub-level cursors get the search filter in
> order to perform filtering within the store instead of returning all
> candidates and evaluate them.
Some thoughts about this one-level/sub-level index.

Using the Rdn index makes perfect sense : we have the Rdn -> parent 
relation plus the parent -> children relation in this index, so there is 
no need to have a one level index (all the children are already listed 
in the RDN index for a specific entry). I'm a bit more concerned about 
the sub-level processing : we have to recurse on all the children to get 
all the candidates. That's fine, we can easily implement that (and you 
already did), but what concerns me is that we don't have the count of 
all those entries: we will have to compute it. This count is necessary 
in the search engine to select the index we will use to walk the entries.
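
To make the concern concrete, here is a minimal sketch of the sub-level 
walk over a parent -> children relation. The class and method names 
(RdnIndexSketch, oneLevel, descendants, countDescendants) are 
hypothetical, not the actual ApacheDS or HBase partition API, and a 
plain in-memory map stands in for the RDN index table. The point is only 
that the one-level set is a single lookup, whereas the sub-level count 
needs a full walk of the subtree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the RDN index: parent id -> ids of its direct children. */
public class RdnIndexSketch {
    private final Map<Long, List<Long>> children = new HashMap<>();

    /** Registers entryId as a direct child of parentId. */
    public void add(long parentId, long entryId) {
        children.computeIfAbsent(parentId, k -> new ArrayList<>()).add(entryId);
    }

    /** One-level scope: a single lookup, no dedicated one-level index needed. */
    public List<Long> oneLevel(long baseId) {
        return children.getOrDefault(baseId, Collections.<Long>emptyList());
    }

    /** Sub-level scope: walks the subtree below baseId (iteratively, to avoid deep recursion). */
    public List<Long> descendants(long baseId) {
        List<Long> result = new ArrayList<>();
        Deque<Long> pending = new ArrayDeque<>();
        pending.push(baseId);
        while (!pending.isEmpty()) {
            long current = pending.pop();
            for (long child : oneLevel(current)) {
                result.add(child);
                pending.push(child);
            }
        }
        return result;
    }

    /** Without a stored counter, the count costs a walk of the whole subtree. */
    public int countDescendants(long baseId) {
        return descendants(baseId).size();
    }
}
```

With this layout, estimating the size of a sub-level scope is as 
expensive as enumerating it, which is what makes it awkward as an input 
for index selection.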

One solution would be to store two more elements in the ParentIdAndRdn 
data structure : the number of children directly below the RDN, and the 
total number of descendants (children included). That would probably 
solve the issue I'm mentioning. Of course, that also means we will have 
to update all the RDN hierarchy from top to bottom (but affecting only 
the RDN part of the entry DN) each time we add/move/delete an entry. 
Note that we already do that for the oneLevel and Sublevel index.
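
A rough sketch of what the two counters could look like, under stated 
assumptions: CountedRdnIndex, Node, nbChildren and nbDescendants are 
hypothetical names, not the real ParentIdAndRdn fields, and an in-memory 
map replaces the index table. It only covers adding an entry and 
deleting a leaf; on each change it walks the ancestor chain from the 
parent up to the root and adjusts the descendant counter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical RDN index whose elements carry child and descendant counters. */
public class CountedRdnIndex {
    /** Marker used as the parent id of the root entry (assumption of this sketch). */
    public static final long NO_PARENT = -1L;

    private static final class Node {
        final long parentId;
        final String rdn;
        int nbChildren;     // entries directly below this one
        int nbDescendants;  // children plus all deeper descendants

        Node(long parentId, String rdn) {
            this.parentId = parentId;
            this.rdn = rdn;
        }
    }

    private final Map<Long, Node> nodes = new HashMap<>();

    /** Adds an entry and increments the counters of every ancestor. */
    public void add(long id, long parentId, String rdn) {
        Node parent = nodes.get(parentId);
        if (parent == null && parentId != NO_PARENT) {
            throw new IllegalArgumentException("Unknown parent " + parentId);
        }
        nodes.put(id, new Node(parentId, rdn));
        if (parent != null) {
            parent.nbChildren++;
        }
        for (Node n = parent; n != null; n = nodes.get(n.parentId)) {
            n.nbDescendants++;
        }
    }

    /** Deletes a leaf entry and decrements the counters of every ancestor. */
    public void delete(long id) {
        Node node = nodes.get(id);
        if (node == null || node.nbChildren != 0) {
            throw new IllegalArgumentException("Entry " + id + " is missing or not a leaf");
        }
        nodes.remove(id);
        Node parent = nodes.get(node.parentId);
        if (parent != null) {
            parent.nbChildren--;
        }
        for (Node n = parent; n != null; n = nodes.get(n.parentId)) {
            n.nbDescendants--;
        }
    }

    /** One-level count: read directly from the element. */
    public int children(long id) {
        return nodes.get(id).nbChildren;
    }

    /** Sub-level count: read directly from the element, no subtree walk. */
    public int descendants(long id) {
        return nodes.get(id).nbDescendants;
    }
}
```

Both counts then become a single read, at the price of one update per 
ancestor on each modification. A move of a whole subtree is not shown; 
it would presumably subtract the subtree size from the old ancestors and 
add it to the new ones.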

All in all, I do think this is feasible, and you have probably already 
implemented such logic in the HBase partition.

Can you tell me if what I wrote above makes sense for HBase but also for 
the whole system ?

If we could get rid of the one-level/sub-level indexes, we would speed up 
the add/move/delete operations greatly (as we would spare two index 
updates), saving probably 25% of the time needed to update the backend 
(we would have just 5 indexes to update instead of 7). It might also 
speed up the search marginally, as we wouldn't have to do a lookup in 
the one-level or sub-level index to build the scope filter.

Emmanuel Lécharny
