I agree with Selcuk's analysis. I had not realized just how nasty the inconsistency handling would get.

On Thu, May 10, 2012 at 8:18 PM, Selcuk AYA <ayaselcuk@gmail.com> wrote:
On Thu, May 10, 2012 at 5:51 AM, Emmanuel Lécharny <elecharny@gmail.com> wrote:
> Le 5/10/12 9:58 AM, Emmanuel Lécharny a écrit :
>
>> Le 5/10/12 7:57 AM, Selcuk AYA a écrit :
>>>
>>> The problem seems to be caused by the test
>>> testPagedSearchWrongCookie(). This tests failure in paged search by
>>> sending a bad cookie. After the failure, it relies on ctx.close() to
>>> clean up the session. Cleanup of the session closes all the cursors
>>> related to paged searches through the session.
>>>
>>> It seems that somehow ctx.close() does not result in an unbind message
>>> on the server side from time to time. I do not know what causes this,
>>> but it leaves a cursor open (specifically a NoDups cursor on the rdn
>>> index). Eventually, as changes happen to the rdn index, we run out of
>>> freeable cache headers. After ignoring this test, pagedsearchit and
>>> searchit pass fine together. It would be good to understand why the
>>> arrival of the unbind message is hit-and-miss in this test.
>>
>>
>> It's absolutely strange... Neither an UnbindRequest nor an AbandonRequest
>> is sent by JNDI when closing the context, which is a huge bug.
>>
>> I have checked the other tests, and an UnbindRequest is always sent when
>> we close the context, except when we get an UnwillingToPerform exception.
>> It seems like the context is in a state where it considers that no unbind
>> should be sent after an exception. Although I can do a lookup (and get back
>> the correct response from the server after this exception), the connection
>> is still borked :/
>>
>> I'll try to rewrite the test using our API to see if it works better, and
>> investigate with some Sun guys to see if there is an issue in JNDI.
>>
>>
>>
> Ok, we have had a long discussion with Alex about this problem...
>
> The thing is that even for a standard PagedSearch, where everything goes
> fine (ie, when the client is done, it has correctly closed the connection,
> which sends an UnbindRequest, which closes the cursor, etc.), we may have
> dozens of open cursors for some extended period of time.
>
> At some point, we may have an exhausted cache, with no way to evict any
> elements from it, leading to a server freeze.
>
> Not something we can accept from an LDAP server...
>
> A suggestion would be to add some parameter to the OperationContext telling
> the underlying layer that a search is done outside of any transaction. When
> we fetch an ID from an index and we try to get the associated Entry from
> the master table, if we get an error because the ID does not exist anymore,
> then we should just ignore the error and continue the search.
>
> But we still want to be sure that, in some cases, inside the server, we can
> still have transactions over some searches.
>
> Thoughts ?
>

I don't think having a non-transactional search is a good idea. I agree
there is a problem with unclosed cursors, but I don't think this is the
right way to solve it. We currently do not have transactions for the
search, but a cursor over the JDBM B-tree gets a snapshot view. This
snapshot covers not only the data but also the structure of the tree
itself. If you do not have this (and, on top of that, you don't have
txns):

 - you will have to deal with inconsistencies in the B-tree data structure
 - you might get NULL data back from the B-tree and have to deal with it,
or with cases where you counted 10 children but actually end up with 9
while doing a DFS over the data structure. This might look easy, but I
think it is not.
 - you might get not only stale data but complete garbage. This garbage
might confuse the code completely (for example, if what you read was
supposed to be a B-tree redirect).

Code from the LDAP protocol handlers down to the search engine is written
assuming cursors return consistent data. I don't think it is impossible to
write code that expects all kinds of inconsistencies, but it is very
difficult and the code will be brittle.
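
Just to illustrate the kind of assumption I mean, here is a minimal sketch.
The interfaces and names are made up for illustration; they are not the
actual partition API:

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative only: a simplified shape of search code walking an index
    // cursor. The real ApacheDS classes differ; this just shows the
    // consistency assumption.
    class ConsistencySketch {

        interface IdCursor {
            boolean next() throws Exception;
            Long get() throws Exception;
            void close() throws Exception;
        }

        interface MasterTable {
            Entry get(Long id) throws Exception;   // lookup in the master table
        }

        static class Entry { /* a candidate entry */ }

        // Today this loop assumes every id produced by the index cursor still
        // resolves to an entry in the master table. Without a snapshot (or a
        // txn), get() could return null or even garbage, and every caller up
        // the stack would have to cope with that.
        static List<Entry> collect(IdCursor cursor, MasterTable master) throws Exception {
            List<Entry> results = new ArrayList<>();
            try {
                while (cursor.next()) {
                    Entry entry = master.get(cursor.get());
                    results.add(entry);   // breaks later if entry can silently be null
                }
            } finally {
                cursor.close();
            }
            return results;
        }
    }

Everything above this layer is written as if collect() returns real entries;
pushing null/garbage handling into all of that code is the difficulty I am
talking about.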


As for the paged search, one way to deal with it would be to read all
the data from the cursor at the beginning of the paged search and then
close the cursor. This would be similar to a normal search. If we worry
about the memory consumption of this, the entries to be returned could be
spilled over to temp files. You might say this could lead to temp files
that are never reclaimed, but if there are not many of them it is no big
deal. Users are supposed to clean up their contexts; not doing so is
similar to opening file handles or socket connections and never closing
them. Such things are bound to create problems.
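
A rough sketch of what I have in mind (the names are purely illustrative,
not the real server classes):

    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Illustrative sketch: drain the cursor once at the start of the paged
    // search, close it immediately, and serve pages from the materialized
    // list (or spill the serialized entries to a temp file if memory is a
    // concern).
    class PagedResultSketch {
        private final List<byte[]> entries = new ArrayList<>();
        private int position = 0;

        PagedResultSketch(Iterator<byte[]> cursor) {
            while (cursor.hasNext()) {       // read everything up front...
                entries.add(cursor.next());
            }
            // ...so the underlying cursor can be closed right here, instead
            // of staying open across paged-search round trips.
        }

        // Return the next page; the caller never touches the B-tree again.
        List<byte[]> nextPage(int pageSize) {
            int end = Math.min(position + pageSize, entries.size());
            List<byte[]> page = new ArrayList<>(entries.subList(position, end));
            position = end;
            return page;
        }

        // Optional spill-over: write the collected entries to a temp file so
        // a large result set does not have to stay in memory.
        Path spillToTempFile() throws IOException {
            Path tmp = Files.createTempFile("paged-search-", ".bin");
            try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(tmp))) {
                for (byte[] e : entries) {
                    out.writeInt(e.length);
                    out.write(e);
                }
            }
            return tmp;
        }
    }

The point is that the B-tree cursor only lives for the duration of the
initial read, so an abandoned paged search leaves behind at most an
in-memory list or a temp file, never an open cursor pinning cache headers.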


>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>

thanks
Selcuk



--
Best Regards,
-- Alex