directory-dev mailing list archives

From Quanah Gibson-Mount <qua...@stanford.edu>
Subject Re: Various questions
Date Tue, 06 Jun 2006 19:59:48 GMT


--On Tuesday, June 06, 2006 9:33 AM -0400 Alex Karasulu 
<aok123@bellsouth.net> wrote:

> Quanah Gibson-Mount wrote:
>
>>
>>
>> --On Tuesday, June 06, 2006 12:50 AM -0400 "Noel J. Bergman"
>> <noel@devtech.com> wrote:
>>
>>> Quanah Gibson-Mount wrote:
>>>
>>>> I think the concept of applying all indexing to attributes is in itself
>>>> broken.
>>>
>>>
>>> So is your suggestion that the option be made available, but that by
>>> indexing selectively, Alex's concerns can be effectively addressed?  Do
>>> you have any suggestions as to how that might be provided without losing
>>> ease-of-use for the most common cases?
>>
>>
>> Well, in OpenLDAP, the way ease of use is met is by users being able
>> to define a default index type or types.  That way, they can specify
>> the default set, and then just use index <attribute>, similar to what
>> is being done in Apache DS.
>>
>> I think it is important to allow specification of what indices to use
>> for a given attribute for a few reasons.  One is that you can use it to
>> deliberately make some searches slow enough to hinder abuse (we have a
>> spam troller routinely trying to get data from our sources, which is
>> fairly obnoxious).  Another is that the more indices you have on an
>> attribute, the larger the total database is, and the longer it takes
>> to load.  This of course depends in part on the OS/CPU used as well.
>> For example, I currently index 90 attributes in my database to varying
>> degrees (most are eq, which is a fairly minimal index).
>
> Note that ApacheDS indices are very similar to OpenLDAP equality indices
> with some minor differences for handling substring matching.  The cost
> is about the same as an eq index.  So you get sub, eq, and existence for
> the price of eq.


Ah, okay.  That is handy.
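
For comparison, here is a minimal sketch of the OpenLDAP default-index 
setup described above, assuming a slapd.conf-style configuration for a 
BDB-backed database.  The attribute names are only illustrative and are 
not taken from my actual configuration:

```
# slapd.conf fragment, inside a back-bdb database section.
# Index types used when an index line names no types of its own.
index default eq

# These attributes pick up the default (eq only).
index uid
index mail

# An attribute that needs more gets its types spelled out explicitly.
index cn eq,sub
```

Each extra index type on an attribute adds to the index size and the 
load time, which is why only the attributes that need sub or pres get 
them.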

>> On my Solaris sparc systems, it takes 2.5ish hours to load the database.
>>
>
> What kind of sparc is that?  I have a blade 2000 here if you want to try
> a current machine benchmark.  I could create an account for you to
> test.  Just let me know.


My current systems are SunFire V120s, each with a 650 MHz CPU and 4GB of RAM. 
I've played with more modern systems (8 CPU T2000's with 4-cores), and they 
are definitely faster.  The same underlying issue of needing to use a 
memory cache during bulk loads still applies, though that may be an 
OpenLDAP/BDB-specific thing that ApacheDS wouldn't encounter.


>> On my new AMD systems that'll be replacing the Sun Sparc boxes, it
>> takes all of 14.5 minutes.  However, if all 90 of those attributes
>> were getting indexed pres,eq,sub, the amount of time to load would
>> increase significantly.
>>
>> Currently, my indices take up 1.1GB of disk space in OpenLDAP (I'm not
>> sure exactly how that maps out in Apache DS).  My database entry file
>> takes 2.7GB.  So my indices are approximately 1/3 of my database size.
>
> Yeah the cost of disk space is just about the same but that's the least
> of our worries.  Disk is cheap as Emmanuel stated.

Nods, memory was more my concern, but it may not apply in the ApacheDS 
case, given the use of a different database backend and the difference in 
how indices are done.

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html
