On Sun, Mar 23, 2008 at 8:17 PM, Howard Chu <hyc@symas.com> wrote:
Alex Karasulu wrote:
> On Sun, Mar 23, 2008 at 6:15 AM, Howard Chu <hyc@symas.com
> <mailto:hyc@symas.com>> wrote:
>
>     Using cursors for walking entry lists (and saving the cursor state) is
>     certainly useful inside the server, but there's nothing you can
>     safely gain
>     from the client side.
>
>     It kind of sounds like you're talking about Virtual List Views, not
>     paged
>     results. Remember that search responses in LDAP/X.500 are unordered by
>     definition. Therefore it makes no sense for a standards-compliant
>     client to
>     send an initial request with a Paging control saying "start at responses
>     200-300" because the order in which entries will be returned is not
>     defined.
>     You need something like VLV which requires SSS to even begin
>     thinking about
>     this; it's not a job for Paged Results.
>
>
> Yes you're totally right, I've confused the two controls. Thanks for the
> correction.
>
> Either way we still need to implement both. Does OpenLDAP implement
> both of these controls? Any opinions or advice regarding implementation
> and/or the actual utility of these controls?

OpenLDAP currently has a no-op implementation of SSS and no VLV. There's been
some discussion about implementing them, but no one has been interested enough
to do it so far. The main problem being that SSS gets to be rather annoying in
the context of a truly distributed DIT, plus the CPU time involved makes the
whole proposition unscalable. This is one of those cases where, given 100s of
clients talking to a single server, it makes more sense to distribute the work
to the clients than to run hundreds of sort variations on the single server.

Netscape/iPlanet obviously took the opposite view, and decided there would
probably be small enough variation in the types of searches being used that
you could index them explicitly to avoid the overhead. We may follow on that
road as well, and just return UnwillingToPerform for searches that were not
indexed.

This is a sane strategy.  To implement SSS you need an index on the attribute the results are sorted on, and you're just not going to have that all the time.  A single bad search can hurt other concurrent search requests by hogging CPU or needlessly turning over the cache with full table scans.
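
A rough sketch of that check, in Python for brevity.  The names here (SortRequest, check_sort_request, indexed_attributes) are made up for illustration and are not taken from OpenLDAP or ApacheDS; only the result code values come from RFC 4511.  It assumes the server keeps a set of lowercased attribute names that have a usable index.

```python
from dataclasses import dataclass

# LDAP result codes as defined in RFC 4511.
SUCCESS = 0
UNWILLING_TO_PERFORM = 53


@dataclass(frozen=True)
class SortRequest:
    """Hypothetical, simplified view of a search carrying a sort control."""
    base_dn: str
    search_filter: str
    sort_attribute: str


def check_sort_request(request, indexed_attributes):
    """Return SUCCESS if the sort attribute is indexed, else UNWILLING_TO_PERFORM.

    indexed_attributes is assumed to be a set of lowercased attribute names.
    Attribute names are compared case-insensitively.
    """
    if request.sort_attribute.lower() in indexed_attributes:
        return SUCCESS
    # No index: refuse up front rather than sort via a full table scan.
    return UNWILLING_TO_PERFORM
```

The point of refusing before doing any work is that the expensive search never touches the cache or the CPU in the first place.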

A while back I had an idea on this that might have some value.  The strategy above could be combined with an optional referral to a replica that has the index needed to satisfy the VLV-SSS request.  I've often found that having one heavily indexed replica (the fat bastard server) helps with the ad hoc queries that make up 2-5% of search requests.  The other 95-98% of search requests can then be handled by the standard set of slim replicas.  So if a slim replica finds it cannot efficiently handle a request due to a lack of indices, it sends back a referral to the fat bastard.

The slim replicas should be able to service the majority of queries rapidly, sending only those few requests they cannot handle to the fat bastard.  The fat bastard would do more work to keep up with DIT changes, but it should handle the ad hoc searches without breathing hard.  The additional index maintenance overhead on the fat bastard is worth it if it keeps these nasty queries off the slim replicas, saving them from performance degradation and cache turnover.
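
To make the idea concrete, here is a minimal sketch of the decision a slim replica would make, again in Python.  Everything in it is hypothetical (Decision, route_sorted_search, the example URL); it is not how any existing server does it, just one way the routing could look.  The result code values are the ones from RFC 4511.

```python
from dataclasses import dataclass
from typing import Optional

# LDAP result codes as defined in RFC 4511.
SUCCESS = 0
REFERRAL = 10
UNWILLING_TO_PERFORM = 53


@dataclass(frozen=True)
class Decision:
    """Hypothetical outcome of the routing check on a slim replica."""
    result_code: int
    referral_url: Optional[str] = None


def route_sorted_search(sort_attribute, local_indices, fat_replica_url=None):
    """Decide how a slim replica answers a sorted (VLV-SSS) search.

    local_indices is assumed to be a set of lowercased attribute names
    indexed on this replica.  fat_replica_url is the optional LDAP URL of
    the heavily indexed replica.
    """
    if sort_attribute.lower() in local_indices:
        # The slim replica can serve the request from its own index.
        return Decision(SUCCESS)
    if fat_replica_url is not None:
        # Hand the expensive ad hoc search to the heavily indexed replica.
        return Decision(REFERRAL, fat_replica_url)
    # No index and nowhere to refer the client: refuse the search.
    return Decision(UNWILLING_TO_PERFORM)
```

For example, route_sorted_search("description", {"cn", "uid"}, "ldap://fat.example.com:389") yields a referral carrying that URL, while the same call without the URL falls back to UnwillingToPerform.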

Alex