directory-dev mailing list archives

From Emmanuel Lécharny <elecha...@apache.org>
Subject Re: Modification of Index Entries through search engine
Date Wed, 30 Nov 2011 13:14:08 GMT
On 11/30/11 7:00 AM, Selcuk AYA wrote:
> On Wed, Nov 30, 2011 at 1:56 AM, Emmanuel Lecharny<elecharny@gmail.com>  wrote:
>> On 11/29/11 8:53 PM, Selcuk AYA wrote:
>>> Hi,
>>> I am doing some tests on txns and the following causes problems for the
>>> test:
>>>      most of the default search engine evaluators cast Index<UUID> to
>>> Index<Object> or Index<String> and then call the setValue method of
>>> the index entry to modify it when there is no index on the attribute.
>> Ok.
>>
>> Index on attributes can be set on String value or Binary values. But as we
>> also use some index for entries (ie, Long, before you switched to UUID), we
>> had no way to simply decide what was the exact type of the index.
>>
>> This is bad. There is an old JIRA opened describing a similar issue :
>> https://issues.apache.org/jira/browse/DIRSERVER-1458
>>
>> It has to be fixed.
>>
>>> Index<UUID>    entry comes from the uuid cursor.  I would like to get rid
>>> of this as it is causing problems for the txns layer. Txn layer is
>>> maintaining some set of index entries in memory and when they are
>>> modified by the search engine it gets confused. In general any layer
>>> or cursor which maintained a cache of index entries would get confused
>>> in such a scenario.
>>>
>>> Please let me know the reason we change the index entry in such a way and
>>> we can hopefully get rid of it. I commented out such code for the
>>> cases it caused me trouble and the tests passed.
>> I just think we change them to be able to handle all the different cases.
>> Not in the best possible way, I agree.
> So I still don't get why modifying the index entry through setValue at
> evaluators is necessary (the hacky way it is done is another point).

The problem is that when we grab the value of an attribute, before it's 
stored into the backend, we have to normalize it so that we can easily 
compare it later.

Let's say we inject an AT with a value like "ABC" (the AT being case 
insensitive).
Later, we try to look for any entry whose AT is equal to "Abc".
Now, we have to compare "Abc" with "ABC", which is not convenient if the 
two values aren't normalized.
So we normalize the value when the entry is stored in the master table 
("ABC" -> "abc") and we also normalize the value of the filter ("Abc" -> 
"abc") before doing any comparison.
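
To make the "ABC" / "Abc" example concrete, here is a minimal sketch in Java. It is only an illustration: NormalizationSketch, normalize() and matches() are hypothetical names, and trimming plus lower-casing stands in for a case-insensitive matching rule. It is not the server's normalizer API.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not the ApacheDS normalizer API.
final class NormalizationSketch {

    // Assumed stand-in for a case-insensitive matching rule:
    // trim surrounding whitespace and lower-case the value.
    static String normalize(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }

    // Compare a stored value and a filter value after normalizing both sides.
    static boolean matches(String storedValue, String filterValue) {
        return normalize(storedValue).equals(normalize(filterValue));
    }

    public static void main(String[] args) {
        // Normalized once, when the entry is stored ("ABC" -> "abc").
        String stored = normalize("ABC");
        // Normalized once, when the filter is processed ("Abc" -> "abc").
        String filter = normalize("Abc");
        // Both sides are normalized, so a plain equals() is enough.
        System.out.println(stored.equals(filter));
    }
}
```

The point of the sketch is that both values are normalized before they meet, so the comparison itself never has to touch or rewrite either of them.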

What struck me here is that such normalization should have happened way 
before we enter this portion of the code. Now, I can see why we have 
such code: the original version was doing the normalization just 
there, not before (see this 7-year-old original code: 
http://svn.apache.org/viewvc/directory/apacheds/trunk/xdbm-search/src/main/java/org/apache/directory/server/xdbm/search/impl/SubstringEvaluator.java?revision=47553&view=markup&pathrev=951388).

So we are living with code built on top of a lot of ancient code, and 
there is some remaining dust we must clean up.
> We have the id of the entry at this point and it is read from the
> master table to get the attribute value to evaluate. When you don't
> have an index, you have to have the entry fed to the evaluator either
> through master table or from some other node in the search tree anyway
> and then we have all the attribute values we need. So it seems to me
> redundant to set this value on the index value. I will comment out
> this code and try the tests. Please let me know if you are aware of a
> case where setting value on the index entry at evaluators is useful.
I don't think it's useful. Go ahead with the tests.
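
As a rough sketch of what an evaluator could look like without the setValue call (all names here are hypothetical, and the master table is modelled as a plain Map rather than the real xdbm types): the index entry is immutable, and the attribute values are read from the entry fetched by id.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical types for illustration only; not the xdbm interfaces.
final class EvaluatorSketch {

    // An index entry without a setter: a cache or txn layer holding a
    // reference to it can rely on the key never changing underneath it.
    static final class IndexEntry<K> {
        private final K key;
        private final UUID id;

        IndexEntry(K key, UUID id) {
            this.key = key;
            this.id = id;
        }

        K getKey() { return key; }
        UUID getId() { return id; }
    }

    // Evaluate "attribute = normalizedValue" for a candidate when there is no
    // index on the attribute: look the entry up by id in the (assumed) master
    // table and test its values, leaving the candidate untouched.
    static boolean evaluate(Map<UUID, Map<String, List<String>>> masterTable,
                            IndexEntry<?> candidate,
                            String attribute,
                            String normalizedValue) {
        Map<String, List<String>> entry = masterTable.get(candidate.getId());
        if (entry == null) {
            return false;
        }
        List<String> values = entry.get(attribute);
        return values != null && values.contains(normalizedValue);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        Map<UUID, Map<String, List<String>>> masterTable = new HashMap<>();
        masterTable.put(id, Collections.singletonMap("cn", Arrays.asList("abc")));

        IndexEntry<UUID> candidate = new IndexEntry<>(id, id);
        System.out.println(evaluate(masterTable, candidate, "cn", "abc"));
        // The candidate still carries its original key after evaluation.
        System.out.println(candidate.getKey().equals(id));
    }
}
```

Whether the real evaluators can be reduced to this shape depends on the cases the tests uncover, so treat it as a direction rather than a patch.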

IMO, this portion of the code needs to be reviewed...


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

