directory-dev mailing list archives

From Ole Ersoy <ole_er...@yahoo.com>
Subject Re: Streaming / Serializing Big Objects
Date Sat, 09 Sep 2006 17:58:49 GMT

Emmanuel:

Every experiment is warmly welcomed, if it can
bring some solution to a known problem!

Ole:

Word - I'll consolidate these discussions into some
type of design document, so that we have standardized
terminology / architecture use cases / and code
examples to work from.

I volunteered for the "How to help documentation,
etc." JIRA entry anyways, so that will be a good
start.



--- Emmanuel Lecharny <elecharny@gmail.com> wrote:

> Ole Ersoy wrote:
> 
> ><snip/>
> >
> >Just a quick terminology clarification - when I say
> >cache I mean in memory representations and when I say
> >persisted I mean written to disk.
> >
> >By directory tree I mean all the information that ADS
> >is intended to provide, regardless of precisely how it
> >is persisted or managed.  So I think we are on the
> >same page here.
> >  
> >
> Sure !
> 
> >So if all the information were in a DOM-like tree,
> >then something like EMF OCL could be used to query it.
> >  
> >
> Yep, but this is not exactly the way the information is stored. As we
> need to do transversal retrievals (like a search for every entry whose
> name starts with 'ACME*'), using a DOM tree to represent the data would
> be particularly inefficient. It is a different story if we were to dump
> the content of the database but, even then, the best solution is to use
> a standard representation like LDIF or DSML (eerrrkkk).
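
The point about prefix searches can be illustrated with a minimal Java sketch. It is not ADS code: the `Node` class, the method names, and the sample entries are all hypothetical, and a `TreeMap` merely stands in for whatever sorted index a real backend would maintain. The tree walk has to visit every node, while the sorted index seeks to the prefix and reads only the matching range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixSearchSketch {

    /** Hypothetical DOM-like node; not an ADS class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Tree walk: every node is visited, whatever the prefix is. */
    static void walk(Node node, String prefix, List<String> out) {
        if (node.name.startsWith(prefix)) {
            out.add(node.name);
        }
        for (Node child : node.children) {
            walk(child, prefix, out);
        }
    }

    /**
     * Sorted index (name -> DN): seek to the prefix and read only the
     * matching key range. Assumes names never contain U+FFFF.
     */
    static List<String> indexed(NavigableMap<String, String> index, String prefix) {
        String upper = prefix + Character.MAX_VALUE; // exclusive upper bound
        return new ArrayList<>(index.subMap(prefix, true, upper, false).keySet());
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        Node root = new Node("ou=clients")
            .add(new Node("ACME Corp"))
            .add(new Node("Beta Inc").add(new Node("ACME Labs")));
        List<String> fromTree = new ArrayList<>();
        walk(root, "ACME", fromTree);

        NavigableMap<String, String> index = new TreeMap<>();
        index.put("ACME Corp", "cn=ACME Corp,ou=clients");
        index.put("ACME Labs", "cn=ACME Labs,cn=Beta Inc,ou=clients");
        index.put("Beta Inc", "cn=Beta Inc,ou=clients");

        System.out.println(fromTree);
        System.out.println(indexed(index, "ACME"));
    }
}
```

Both calls return the same names; the difference is only in how much data each one has to touch to find them.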
> 
> >This may take up more of a memory footprint, or the
> >queries could be slower, but what if it's just as fast
> >or faster?  Then ADS would all of a sudden have a lot
> >more developers working on one of its building blocks.
> >  
> >
> Well, hmmm, what I can say from experience is that manipulating an XML
> document is really slower than any other textual representation, by an
> order of magnitude. Beware, I'm not saying that XML is bad in essence,
> just that you should use the correct technology to address each
> problem. <OT> : to send data to another human, I'm pretty confident that
> ASN.1 PER encoding is quite a way to gain interest from the NSA, who may
> think that I'm sending encrypted data ;). XML is then much better. And
> I'm not really convinced that the technology used to build an LDAP server
> can increase the number of developers. However, I can be totally wrong
> :). Using Ajax, AOP, REST and Hibernate may have some advantage because
> of the buzz around those technologies, but I don't really see how
> they can help implement correctly and efficiently the basic operations
> we need to have. I prefer to dedicate a lot of time to correct algorithms
> and design, because this is essential. IMHO, of course ! </OT>
> 
> > <snip/>
> >
> >Yeah!  Let's go with the GMail one!!!! :-)
> >  
> >
> eh eh... We had fun last night with Alex discussing
> how we can use this 
> distributed resource for free :)
> 
> >So I think we are thinking pretty much the same thing
> >here, and that's what the StateManager would do.
> >  
> >
> Well, hmm, StateManager doesn't mean a lot to me :(
> Sorry, man...
> 
> >It could even be pluggable, so for instance different
> >state managers for different persistence mechanisms.
> >
> >In the end we are just reading and writing data, and
> >that's the job of the StateManager.
> >
> >Whether it reads it all at once, a little here or a
> >little there, is up to it.
> >
> >If a telecommunications company using ADS wants
> >lightning-fast queries, then they probably would love
> >to see ADS restored and run from a single file that is
> >in memory for all queries.
> >  
> >
> Ahhh... Maybe you are calling 'StateManager' what we call 'Partition'.
> A Partition, for us, is a 'location' where some data are stored, with a
> common root. We may have different backends, and different strategies to
> store data. So, here, StateManager = Partition. Am I right ?
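
The "pluggable" idea discussed above can be sketched in Java as one small contract with interchangeable backends. This is only an illustration under stated assumptions: `EntryStore`, `InMemoryStore`, and `FileStore` are hypothetical names, not the real ADS Partition API, and a `java.util.Properties` file stands in for a proper on-disk format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PluggableStoreSketch {

    /** Hypothetical contract; the real ADS Partition interface is richer. */
    interface EntryStore {
        void put(String dn, String value);
        String get(String dn);
    }

    /** Everything lives in memory: fast, but bounded by the heap. */
    static final class InMemoryStore implements EntryStore {
        private final Map<String, String> entries = new HashMap<>();
        public void put(String dn, String value) { entries.put(dn, value); }
        public String get(String dn) { return entries.get(dn); }
    }

    /** Nothing is cached: every call goes to the file on disk. */
    static final class FileStore implements EntryStore {
        private final Path file;

        FileStore(Path file) { this.file = file; }

        private Properties load() {
            Properties props = new Properties();
            if (Files.exists(file)) {
                try (InputStream in = Files.newInputStream(file)) {
                    props.load(in);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            return props;
        }

        public void put(String dn, String value) {
            Properties props = load();
            props.setProperty(dn, value);
            try (OutputStream out = Files.newOutputStream(file)) {
                props.store(out, null);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public String get(String dn) { return load().getProperty(dn); }
    }

    /** Caller code is identical whichever backend is plugged in. */
    static String roundTrip(EntryStore store) {
        store.put("uid=ole,ou=users", "Ole Ersoy");
        return store.get("uid=ole,ou=users");
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("store-sketch", ".properties");
        System.out.println(roundTrip(new InMemoryStore()));
        System.out.println(roundTrip(new FileStore(tmp)));
        Files.delete(tmp);
    }
}
```

The choice between backends then becomes a deployment decision rather than a code change, which is the trade-off the rest of this thread is about.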
> 
> >But if it's an authentication service where queries can
> >take their own sweet time, then maybe the IT dept
> >would rather just get 1 server with a gigantic drive
> >and have ADS query a persistent data source when it
> >needs stuff, with nothing cached in memory.
> >  
> >
> I think it's up to the service that implements ADS to determine whether
> they want an in-memory partition or not. Just imagine a client-oriented
> LDAP (I worked for a client who was storing its 70,000,000 users in an
> LDAP server): there, in-memory was out of the question, and a cache was
> just useless and even cost more than what we could gain. So, basically,
> yes, you are perfectly right.
> 
> >So I think we are thinking the same thing, the only
> >question is what is the best solution that minimizes
> >the in-memory footprint, regardless of the size of
> >the cache, maximizes maintenance ease and feature
> >development / modularity.
> >  
> >
> Yes, I think you are right that we are thinking the same thing :)
> 
> We have had this kind of discussion (best solution, etc.) a few times. I
> have a perception of what kinds of LDAP usage we can meet in the real world:
> - small LDAP database: typically small companies, or applications that
>   use LDAP to manage a limited number of users: up to a few thousand entries
> - medium LDAP database: medium to large companies that use LDAP as the IM
>   node. Around a few hundred thousand entries
> - client-centric LDAP database: very large LDAP database used to store
>   information about the clients (like whitePages/yellowPages). Could store
>   hundreds of millions of entries.
> - application-centric LDAP database: database with a lot of relations,
>   typically used by complex applications.
> 
> All those kinds of LDAP usage (and the list is not limited to these few)
> deserve a specific customization. Reliable database, in-memory
> database, fast disk storage, huge clusters of disks, etc... All those
> kinds of possibilities are to be addressed. But, well, we are barely at
> version 1.0-RC4 :) That leaves a *hell* of a lot of opportunities to add
> all of that stuff, starting right now :)
> 
> ><snip/>
> >
> >Yeah - let's just call them things...that need to be as
> >fast as possible to read and as fast as possible to
> >write.
> >
> >Of course ease of development and maintenance should be
> >considered vs. the speed considerations.
> >  
> >
> That's a very valid concern. I won't buy a 20% improvement if that
> means bloated code... Of course, if we are talking of an order of
> magnitude in
> 
=== message truncated ===

