directory-dev mailing list archives

From Emmanuel Lecharny <elecha...@gmail.com>
Subject Re: Various questions
Date Tue, 06 Jun 2006 07:37:41 GMT
Hi guys!

Quanah Gibson-Mount wrote:

> I think it is important to allow specification of what indices to use 
> for a given attribute for a few reasons.  One, that you can use it to 
> actually make some searches slow enough to hinder efforts (like we 
> have a spam troller routinely trying to get data from our sources that 
> is fairly obnoxious),

In my mind, it's pretty much a security issue. You can add
authentication to prevent such behavior or, if your data are public,
you have no reason to slow down the searches. Limiting the number of
results may be more efficient. Btw, this is a real problem for a server,
and something we should consider: how to avoid DoS attacks on an LDAP
server (by flooding, with malformed requests, or with huge data). We
still have to address those attacks. At this point, I have a
question: is it common for an LDAP server to be exposed outside a
company? Generally speaking, I have never seen that. User data are
really supposed to be private and not accessible to unidentified users.
I may be totally wrong, but if I saw an LDAP server exposed to the
world (I have not seen one in years), the first thing I would ask the
admins is to close the door of their system. Just my opinion.

> another is that the more indices you have on an attribute, the larger 
> the total database is, and the longer it takes to load.  This of 
> course depends on part in the OS/Cpu used as well.  For example, I 
> currently index 90 attributes in my database to varying degrees (most 
> are eq, which is a fairly minimal index).  On my Solaris sparc 
> systems, it takes 2.5ish hours to load the database.  On my new AMD 
> systems that'll be replacing the Sun Sparc boxes, it takes all of 14.5 
> minutes.  However, if all 90 of those attributes were getting indexed 
> pres,eq,sub, the amount of time to load would increase significantly.

Well, in production, loading a server is not something you do very
often. You may need to restore a crashed database, or reload a database
whose structure has changed, but this is definitely not a real concern.
Load once, use many.

>
> Currently, my indices take up 1.1GB of disk space in OpenLDAP (I'm not 
> sure how that exactly map out in Apache DS).  My database entry file 
> takes 2.7GB.  So my indices are approximately 1/3 of my database size.

3 GB is really nothing. A 15K RPM SCSI disk is now 36 GB minimum and
costs around $200. Not a big deal. Better to spend money on memory
sticks rather than on high-performance disks :)

I don't want to say that making it possible to select indices is *bad*,
but, IMHO, this may be a cool feature that is a little bit overkill
when you weigh it against real usage. For a real RDBMS, indices that
take twice the size of the data on disk are considered perfectly normal.
I don't think we should go that far, but once you choose to index an
attribute, it may not be very important to offer a choice of which
kinds of indices you want.

At least, this is something we may consider for ADS in a future
version, but right now it is not at the top of our TODO list,
given the other huge issues we have :)

Thanks a lot for your feedback!

Emmanuel
