directory-dev mailing list archives

From Howard Chu <...@symas.com>
Subject Re: [LDAP] [Client] Client side Cursors can help w/ LDAP paging and notification
Date Sun, 23 Mar 2008 10:15:01 GMT
Using cursors for walking entry lists (and saving the cursor state) is 
certainly useful inside the server, but there's nothing you can safely gain 
from the client side.

It kind of sounds like you're talking about Virtual List Views, not paged 
results. Remember that search responses in LDAP/X.500 are unordered by 
definition. Therefore it makes no sense for a standards-compliant client to 
send an initial request with a Paging control saying "start at responses 
200-300" because the order in which entries will be returned is not defined. 
You need something like VLV, which itself requires Server Side Sorting (SSS), 
to even begin thinking about this; it's not a job for Paged Results.

Even if you have a stable ordering (which SSS is actually unable to guarantee) 
you still can't reliably identify response #200 since underlying entries may 
be added or deleted while the search is in progress.
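
To make that last point concrete, here is a minimal Python sketch. It is a toy 
in-memory model, not LDAP client code: the entry names and the `window` helper 
are made up for illustration, and the sort merely stands in for what SSS plus 
VLV would do on a server. It shows how a numeric position names a different 
entry once an earlier entry is deleted between two requests.

```python
# Toy model: a sorted result set addressed by numeric position.
# All names here are hypothetical; no LDAP library is involved.

def window(entries, start, count):
    """Return `count` entries beginning at zero-based position `start`,
    after sorting (a stand-in for server-side sorting)."""
    return sorted(entries)[start:start + count]

directory = {"cn=alice", "cn=bob", "cn=carol", "cn=dave", "cn=erin"}

# First request: the client asks for positions 0-1.
first_page = window(directory, 0, 2)

# Meanwhile another client deletes an entry that sorts before position 2.
directory.discard("cn=alice")

# Second request: the client asks for positions 2-3, expecting to
# continue exactly where it left off.
second_page = window(directory, 2, 2)

print(first_page)   # ['cn=alice', 'cn=bob']
print(second_page)  # ['cn=dave', 'cn=erin']
```

In this run "cn=carol" is never returned to the client, even though it 
existed for the whole exchange. An insertion ahead of the window would cause 
the opposite problem, a duplicate.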

Emmanuel Lecharny wrote:
> Using cursors in ADS will also allow us to implement the Paging
> control (RFC 2696) very easily! We could even define a new control
> (and a new RFC), since we will be able to go back and forth, which is
> not possible with the Paging control.

>> The Partition interface in ApacheDS will soon expose search results by
>> returning a Cursor<ServerEntry>  instead of a NamingEnumeration.
>
> It will be Cursor<Entry>  (as this is the top level interface)
>
>> Depending on the filter used, this is a composite Cursor which leverages
>> partition indices to position itself without having to buffer results.  This
>> allows the server to pull entries satisfying the search scope and filter one
>> at a time, and return them to the client without massive latency or memory
>> consumption.  It also means we can process many more concurrent requests as
>> well as process single requests faster.  In addition a resultant search
>> Cursor can be advanced or positioned using the methods described above just
>> by having nested Cursors based on indices advance or jump to the appropriate
>> Index positions.  We already have some of these footprint benefits with
>> NamingEnumerations, however the positioning and advancing capabilities are
>> not present with NamingEnumerations.
>
> Further experiments and research will help a lot here. We may have
> problems too, as this part will be concurrent: some of the data may
> be modified while the cursor is being read.
>
>> During the course of this work, I questioned whether or not client side
>> Cursors would be possible.  Perhaps not under the current protocol without
>> some controls or extended operations.  Technical barriers in the protocol
>> aside, I started to dream about how this neat capability could impact
>> clients.  With it, clients can advance or position themselves over the
>> results of a search as they like.  Clients may even be able to freeze a
>> search in progress, by having the server tuck away the server side Cursor's
>> state, to be revisited later.
>
> The major improvement with Client cursors is that the client will no
> longer have to manage a cache of data. Thinking about the Studio, if
> you browse a big tree with thousands of entries, when you want to get
> the entries from [200-300] - assuming you show entries in blocks of
> 100 - you have to send another search request _or_ you have to cache
> all the search results in memory. What a waste of time or a waste of
> memory! If we provide such a mechanism, the client won't have to
> bother with such complexity. Data will be brought to the client piece
> by piece: if the client wants numbers 400 to 500, no need to get the
> 499 first entries. If the client already pumped out the first 100
> entries, it's just a simple request on the same cursor, no need to
> compute it again.
>
> So, yes, client cursors make sense too.
>
>> For lack of terms I've likened this to a form
>> of asynchronous bidirectional LDAP search. This would eliminate the need to
>> bother with paging controls.  It could even be used to eliminate the thread
>> per search problem associated with persistent search.  OK, let me stop
>> dreaming and start looking at reality so we can determine if this is even a
>> possibility.
>
> Reality is just a dream come true :) (sometimes, it's a nightmare :)
>
>> So these characteristics of a Cursor have a profound impact on the semantics
>> of a search operation - not talking about the protocol yet.  I'm referring
>> to search as seen from the perspective of client callers using the Cursor:
>> the front end.  As stated search operations can be initiated and shelved to
>> persist the state of the search by tucking away the Cursor in the connection
>> session.  A Cursor for a search will automatically track its position.
>>
>> However the protocol imposes some limitations on being able to leverage
>> these capabilities across the network on an LDAP client.  A search request
>> begins the search, and entry responses are received from the server, until
>> the server returns a search response done operation which  signals the end
>> of the search operation.  During this sequence, without creative extended
>> operations, or controls, there's little the client can do to influence the
>> entries returned by the server or throttle the rate of return.  Of course
>> size and time limits can be set on the search request but after issuing the
>> search, these cannot be altered.  Because the LdapMessage envelope contains a
>> messageId, and all responses contain the messageId of the request they
>> correspond to, the protocol allows for multiple requests to be issued in
>> parallel.  Even if client APIs do not allow for it, this is certainly
>> possible.
>
> The main point is that each client is associated with a session. It's
> then easy to handle a context and use it to store metadata (like a
> previously created cursor on some search request, a cursor which can
> be reused if the underlying data have not been modified).
>
> That brings another matter to the table: if we want to reuse cursors,
> we _must_ implement a decent entry cache.
>
>> Although I've long forgotten how the paging control works exactly, I still
>> have a rough idea: forgive me for my laziness and if I'm missing something.
>> A control specifies some number of results to return per page, and the
>> server complies by limiting the search to that number then capping off the
>> search operation with a search result done.  Cookies in the request and
>> response controls are used to track the progress, so another search request
>> for the next page returns the next page rather than initiating the search
>> from the start.  This breaks a big search up into many smaller search
>> requests.
>
> This is true from the client perspective. On the server, there should
> be only one search, and the returned results are just waiting for
> another search with the same cookie.
>
>> This way the client has the ability to intervene in what
>> otherwise would be a long flood of results in a single large search
>> operation.  If this page control could also convey information about
>> positioning, and directionality, along with a page size set to 1, we could
>> implement client side Cursors with the same capabilities they possess on the
>> server.
>
> Exactly! For instance, using a negative size would result in going
> backward. This is a very minor extension to the paged search RFC, and
> it can even be implemented using the very same control, simply adding
> some semantics to it.
>
> Another extension would be to add a 'position' to start with.
>
>> Paging search results effectively has the server tucking away the
>> search Cursor state into the client session and pulling it out again to
>> continue.  This is how we would implement this control today (that is if
>> anyone gets the time to do so :) ).
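
The cookie exchange described above can be sketched as follows. This is a toy 
Python model under stated assumptions, not an implementation of any real 
server or client API: `ToyServer` and `paged_search` are hypothetical names, 
and the "cursor" is just a Python iterator kept in a per-connection dictionary 
keyed by the cookie. The only part taken from RFC 2696 is the shape of the 
exchange: the client sends a page size plus the cookie from the previous 
response, and an empty cookie in the response means the result set is 
exhausted.

```python
# Toy model of the RFC 2696 exchange: the server keeps the iterator
# (its "cursor") in per-connection session state, keyed by the cookie.
# ToyServer and paged_search are hypothetical names, not a real API.
import itertools

class ToyServer:
    def __init__(self, entries):
        self._entries = list(entries)
        self._session = {}              # cookie -> saved iterator
        self._ids = itertools.count(1)

    def paged_search(self, page_size, cookie=b""):
        """Return (page, next_cookie). An empty next_cookie means done."""
        if cookie == b"":
            it = iter(self._entries)        # new search: create the cursor
        else:
            it = self._session.pop(cookie)  # resume the saved cursor
        page = list(itertools.islice(it, page_size))
        # Peek one entry ahead to learn whether anything is left.
        rest = list(itertools.islice(it, 1))
        if not rest:
            return page, b""
        next_cookie = b"c%d" % next(self._ids)
        # Put the peeked entry back in front of the remaining ones.
        self._session[next_cookie] = itertools.chain(rest, it)
        return page, next_cookie

server = ToyServer("cn=user%d" % i for i in range(7))
pages, cookie = [], b""
while True:
    page, cookie = server.paged_search(3, cookie)
    pages.append(page)
    if not cookie:
        break

print([len(p) for p in pages])  # [3, 3, 1]
```

Note that the cookie is opaque to the client and the only thing it can ask 
for is the next page. Nothing in this exchange carries a position or a 
direction, which is the gap any positioning extension would have to fill.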

>> BTW change notifications are probably best implemented as a combination of
>> search and extended operations through unsolicited notifications.   The
>> client issues a search request with a control similar to the persistent
>> search request control.  Instead of 'persisting' the search, the search
>> returns immediately with a search result done response using a result code
>> to indicate whether or not the server will honor the request to be notified
>> of changes.
>
> This is a big semantic shift... Not sure that it will fit with the
> current LDAP protocol. However, LDAP V4 does not exist yet ;)

There's no reason this approach can't be used in LDAPv3. It's just that no 
existing LDAPv3 clients or servers support such a control at the moment.
-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/
