--On Tuesday, June 06, 2006 9:33 AM -0400 Alex Karasulu wrote:

> Quanah Gibson-Mount wrote:
>
>>
>> --On Tuesday, June 06, 2006 12:50 AM -0400 "Noel J. Bergman"
>> wrote:
>>
>>> Quanah Gibson-Mount wrote:
>>>
>>>> I think the concept of applying all indexing to attributes is in itself
>>>> broken.
>>>
>>> So is your suggestion that the option be made available, but that by
>>> indexing selectively, Alex's concerns can be effectively addressed? Do
>>> you have any suggestions as to how that might be provided without losing
>>> ease-of-use for the most common cases?
>>
>> Well, in OpenLDAP, the way ease of use is met is by users being able
>> to define a default index type or types. That way, they can specify
>> the default set, and then just use index , similar to what
>> is being done in Apache DS.
>>
>> I think it is important to allow specification of what indices to use
>> for a given attribute for a few reasons. One, that you can use it to
>> actually make some searches slow enough to hinder efforts (like we
>> have a spam troller routinely trying to get data from our sources that
>> is fairly obnoxious), another is that the more indices you have on an
>> attribute, the larger the total database is, and the longer it takes
>> to load. This of course depends on part in the OS/Cpu used as well.
>> For example, I currently index 90 attributes in my database to varying
>> degrees (most are eq, which is a fairly minimal index).
>
> Note that ApacheDS indices are very similar to OpenLDAP equality indices
> with some minor differences for handling substring matching. The cost
> is about the same as an eq index. So you get sub, eq, and existence for
> the price of eq.

Ah, okay. That is handy.

>> On my Solaris sparc systems, it takes 2.5ish hours to load the database.
>>
>
> What kind of sparc is that? I have a blade 2000 here if you want to try
> a current machine benchmark. I could create an account for you to
> test. Just let me know.

My current systems are SunFire V120's, with a 650 MHz CPU and 4GB of RAM.
I've played with more modern systems (8 CPU T2000's with 4-cores), and they
are definitely faster, but the same underlying issue of needing to use a
memory cache during bulk loads still applies. That may be an
OpenLDAP/BDB-specific thing that ApacheDS wouldn't encounter, though.

>> On my new AMD systems that'll be replacing the Sun Sparc boxes, it
>> takes all of 14.5 minutes. However, if all 90 of those attributes
>> were getting indexed pres,eq,sub, the amount of time to load would
>> increase significantly.
>>
>> Currently, my indices take up 1.1GB of disk space in OpenLDAP (I'm not
>> sure how that exactly map out in Apache DS). My database entry file
>> takes 2.7GB. So my indices are approximately 1/3 of my database size.
>
> Yeah the cost of disk space is just about the same but that's the least
> of our worries. Disk is cheap as Emmanuel stated.

Nods, memory was more my concern, but it may not apply in the ApacheDS
case, given the use of a different database backend and the difference in
how indices are done.

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html