From "Alex Karasulu" <akaras...@apache.org>
Subject Re: Implementing the PagedSearchControl
Date Fri, 05 Dec 2008 15:48:23 GMT
Hi Emmanuel,

On Wed, Dec 3, 2008 at 6:25 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:

> The problem I have is the following: we have to remember the pointer to
> the last entry we sent back to the client.
>
> How should we do this? My first approach was pretty naive: we are using a
> cursor, so it's easy; we simply store the cursor in the session, and the
> next request just has to get this cursor back from the session and read
> the next N elements from it.
>
> This has the advantage of being simple, but there are some very important
> cons:
> - it's memory consuming, as we may keep those cursors in the session for a
> very long time
> - we will have to close all the cursors when the session is closed (for
> whatever reason)
> - if some data has been modified since the cursor was created, it may
> contain invalid data
> - if the user doesn't send an abandon request for the search, those
> cursors will remain in the session until it's closed (and this is very
> likely to happen)
>
> So I'm considering an alternative - though more expensive and less
> performant - approach:
> - we build a new cursor for each request,
> - we move forward to the Nth entry in the newly created cursor, and
> return the M requested elements,
> - and when done, we discard the cursor.
>

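To make the tradeoff concrete, here is a minimal sketch of the two
approaches (hypothetical names and a simplified cursor interface, not the
real ApacheDS API):

    // Hypothetical sketch only: Cursor here is a simplified stand-in,
    // not the real ApacheDS cursor interface.
    import java.util.ArrayList;
    import java.util.List;

    interface Cursor<E> {
        boolean next() throws Exception; // advance; false when exhausted
        E get() throws Exception;        // entry at the current position
        void close() throws Exception;
    }

    class PagedSearchSketch {

        // Approach 1: keep the cursor open in the session between requests.
        static <E> List<E> nextPageFromSession(Cursor<E> sessionCursor,
                int pageSize) throws Exception {
            List<E> page = new ArrayList<E>();
            while (page.size() < pageSize && sessionCursor.next()) {
                page.add(sessionCursor.get());
            }
            // the cursor stays open in the session until the search
            // completes or is abandoned
            return page;
        }

        // Approach 2: rebuild a cursor on every request and skip over
        // everything already sent.
        static <E> List<E> nextPageRebuilding(Cursor<E> freshCursor,
                int alreadySent, int pageSize) throws Exception {
            List<E> page = new ArrayList<E>();
            try {
                // re-scan (and re-filter) all previously returned entries
                for (int i = 0; i < alreadySent && freshCursor.next(); i++) {
                    // skipped entries are still evaluated against the filter
                }
                while (page.size() < pageSize && freshCursor.next()) {
                    page.add(freshCursor.get());
                }
            } finally {
                freshCursor.close(); // discarded after every page
            }
            return page;
        }
    }
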
I would avoid the second approach.  The problem is that it requires a
quadratic amount of computation, because each request scans back to the
point you were at before advancing the cursor.  Say you have 100 entries
and you advance through the first 10.  Then you create a new cursor and ask
for elements 11-20.  This means you'll re-scan the first 10 elements,
checking whether each one matches the filter, and as you know this advances
a nested structure of cursors built to reflect the logic of the filter.  So
you're effectively doing a search over 10, then 20, 30, 40, 50, 60 and so
on elements.
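To put rough numbers on that (just counting entries touched, assuming every
skipped entry is re-evaluated against the filter): with 100 matching
entries and a page size of 10, the ten requests cost

    page  1: skip  0, return 10  ->  10 entries scanned
    page  2: skip 10, return 10  ->  20 entries scanned
    ...
    page 10: skip 90, return 10  -> 100 entries scanned
    ----------------------------------------------------
    total: 550 entries scanned, vs. 100 with a retained cursor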


>
> The pros are
> - we don't have to keep all those cursors in memory forever.


The whole point of this feature is to maintain state so the search continues
where it left off.  But this should be cheap both for the server and for the
client.  This approach is brute force, and it's going to mix up a lot of
code in complicated places.

It's OK to hold off on this until we see a better approach.  I'd rather wait
until that eureka light bulb goes off.


>
> - from the client POV, it respects the PagedSearch contract
> - it's easier to implement, as we have less information to keep in the
> session and restore
>
> The cons are :
> - it's time consuming: if we have N entries to return, with a page size
> of P, we will construct N/P cursors.
>

Yes, and there will also be the cost of the advances.  Both are going to
make this approach limiting.
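As a back-of-the-envelope count (again assuming every skipped entry is
re-evaluated against the filter): page k has to scan k * P entries, so over
the whole result set

    total scanned = P * (1 + 2 + ... + N/P)
                  = N * (N + P) / (2 * P)
                  ~= N^2 / (2 * P)   when N is much larger than P

compared to N entries scanned once with a retained cursor.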

Alex
