directory-dev mailing list archives

From Alex Karasulu
Subject Re: [DISCUSSION] General API & SPI Concerns
Date Thu, 06 Jan 2011 18:16:46 GMT
On Thu, Jan 6, 2011 at 4:43 PM, Emmanuel Lecharny wrote:

SNIP ...

On 1/6/11 2:58 PM, Alex Karasulu wrote:
>> This is not to blame anyone. I am pointing out the problem, and pointing out
>> a solution to it so we're not screwed by it. The web of dependencies in
>> shared will f**k us down the line if we don't nix em now.
> I'm wondering what would be the best way to get rid of those couplings...
> Maybe creating many Maven modules (one per package) then we will
> immediately see the invalid coupling ? Or is there any tool we can use to
> detect the bad coupling ?
Yeah we could do a module per package but that might be too eager.

For now, let the dependencies that we cannot remove with a few simple tricks,
plus some common sense about what coherently goes together, loosely guide our
path. Let's be relaxed about it: no freaking out about module explosion, but
no exploding needlessly either.

I wish one simple equation solved these things but unfortunately none does.

> I must admit I have not investigated this area yet...
No worries. I got your back here and will give y'all an update about what
was done, how and why so we're on the same page at some point and can think
together on it to finally tidy up.

Just get this AP thing worked out without worrying yourself about these
details too much. What you're doing in AP land is much more important.

>> The work needed here is a joke really. The big issue with it is the impact
>> the changes in shared have all over the place in Studio and ApacheDS and
>> the
>> fact that we're better off waiting for AP work to complete to merge.
> Absolutely. I know that I'm a bottleneck here, but OTOH, there is little I
> can do to move faster :/
Please don't feel rushed. Again what you're doing is one of the most nasty
areas of the server and not trivial. It would be easier and simpler writing
a web server than this region of code. So just focus on doing it right so it
does not steal any more of your time.

I'll work on this stuff and update the list. I've got some stupid things to
take care of today so I will not be as aggressive until the weekend. Just a
heads up.

>   This might be due to eager reuse or the addition of
>>>> utility methods into codec classes for convenience. Some of these
>>>> dependencies can be removed by breaking out non-implementation specific
>>>> methods and constants in codec classes into utility methods outside of
>>>> the
>>>> package or the module all together. Furthermore the codec implementation
>>>> that handles [de]marshaling has to access package friendly (non-API)
>>>> methods
>>>> on implementation classes while encoding.
>>>>  Not sure that I get what you mean here. Can you be a bit more explicit
>>> ?
>> LdapEncoder accesses package friendly methods inside most message Impl
>> classes to encode them. This also pulls into message dependencies from
>> codec
>> which can be hidden. But these are really easy to fix. We just need to
>> know
>> that the situation is there and get rid of it.
> Get it now.
> Btw, I still have some issues with the codec classes
> (LdapEncoder/LdapDecoder). They could be simplified, as we still live with
> some mechanisms used years ago. The Client-API codec is way simpler.
> We can discuss this point in a separate thread.
Sure no problem. I have some ideas here too (nothing big) just to make it so
we can hide implementation better with the code making it more pluggable.
While you're working let me test the ideas out and post something about it.
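
A minimal sketch of the kind of hiding discussed here, using hypothetical names (BindRequest, BindRequestImpl, Messages and Encoder are illustrative and are not the actual shared-ldap classes). The encoder reads a message only through its public interface, so the implementation class can stay private and nothing has to reach into package-friendly methods. The toy encoder emits a readable string, not real BER.

```java
// Hypothetical names throughout: these are NOT the real shared-ldap types.
final class HiddenImplSketch {

    /** Public API: the only message type callers and the encoder need to see. */
    interface BindRequest {
        int getMessageId();
        String getName();
    }

    /** Factory, so callers never have to name the implementation class. */
    static final class Messages {
        private Messages() { }

        static BindRequest newBindRequest(int messageId, String name) {
            return new BindRequestImpl(messageId, name);
        }
    }

    /**
     * Implementation detail. Private here; in a real module it would be
     * package-private or live in a package that is not exported.
     */
    private static final class BindRequestImpl implements BindRequest {
        private final int messageId;
        private final String name;

        BindRequestImpl(int messageId, String name) {
            this.messageId = messageId;
            this.name = name;
        }

        @Override
        public int getMessageId() {
            return messageId;
        }

        @Override
        public String getName() {
            return name;
        }
    }

    /** Toy encoder: depends on the interface only, never on BindRequestImpl. */
    static final class Encoder {
        private Encoder() { }

        static String encode(BindRequest request) {
            return "BindRequest(id=" + request.getMessageId()
                    + ", name=" + request.getName() + ")";
        }
    }

    public static void main(String[] args) {
        BindRequest request = Messages.newBindRequest(7, "uid=admin,ou=system");
        System.out.println(Encoder.encode(request));
    }
}
```

Because the encoder compiles against the interface alone, a different implementation class could be swapped in without touching callers, which is the pluggability mentioned above.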

>   In the end, dependency upon further transitive dependencies are making us
>>>> expose almost all implementation classes in shared, and most can easily
>>>> be
>>>> decoupled and hidden. It's effectively making everything in shared come
>>>> together in one big heap exposing way more than we want to.
>>>>  It's quite impossible in Java to 'hide' all the classes that a user
>>> should
>>> not manipulate. Unless you use package protected classes, and it quickly
>>> has
>>> a limit, I would rather think in term of 'exposed' (ie documented) API.
>> OSGi bundles really helps in this respect. It fills in where Java left
>> off.
>> OSGi makes it so the (bundle) packaging coincides with module boundaries.
>> In
>> Java this is loose and there's leakage all over, as you say, it's very
>> hard
>> to hide all implementation classes.
> True. I ruled out OSGi, but that may help a lot.

Yeah but as Seelmann pointed out you only get that benefit in the OSGi
environment. We can do more like break things up better and use this
internal package name component.
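
One way the "internal" package name component could be enforced outside an OSGi container is a simple build-time check. The sketch below assumes a convention in which any package with an `internal` component is private to the packages under its parent. The org.example package names and the InternalPackageCheck class are hypothetical, not an existing tool.

```java
// Hypothetical checker for an assumed ".internal." naming convention.
final class InternalPackageCheck {

    private InternalPackageCheck() { }

    /**
     * Returns the part of the package name before its first "internal"
     * component, or null when the package has no such component.
     */
    static String ownerOf(String packageName) {
        StringBuilder owner = new StringBuilder();
        for (String part : packageName.split("\\.")) {
            if (part.equals("internal")) {
                return owner.toString();
            }
            if (owner.length() > 0) {
                owner.append('.');
            }
            owner.append(part);
        }
        return null;
    }

    /** True when the package name contains an "internal" component. */
    static boolean isInternal(String packageName) {
        return ownerOf(packageName) != null;
    }

    /**
     * Public packages may be used by anyone. Internal packages may be used
     * only by the owning package and the packages below it.
     */
    static boolean isAllowed(String importer, String imported) {
        String owner = ownerOf(imported);
        if (owner == null) {
            return true;
        }
        return importer.equals(owner) || importer.startsWith(owner + ".");
    }

    public static void main(String[] args) {
        String hidden = "org.example.ldap.codec.internal.ber";
        // A client package reaching into the codec internals: not allowed.
        System.out.println(isAllowed("org.example.ldap.client", hidden));
        // A package under the same parent as the internals: allowed.
        System.out.println(isAllowed("org.example.ldap.codec.api", hidden));
    }
}
```

Unlike an OSGi export list, a check like this is advisory only: it flags leakage at build time but does not stop a user from importing the class at runtime.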

>  That this documented API is gathered in one separate module for
>>> convenience
>>> is another aspect, but the user will still have to depend on all the
>>> other
>>> modules.
>>>  Certainly, you're right, dependencies will still exist. A codec will be
>> depended upon for its functionality even if we do hide the implementation
>> details under the hood.
>> The value add here is not from avoiding a dependency. It's from not
>> exposing
>> more than we have to and being able to hide the implementation. This way
>> we
>> can change the implementation at will across point releases without having
>> to bump up to a major revision.
> what is important here, as you say, is to avoid exposing things that the
> user does not have to manipulate. It's noise to him.
>>  LDAP Client API
>>>> ------------------------
>>>> Everyone agrees that this API is very important to get right with a 1.0.
>>>> Right now this API pulls in several public interfaces directly from
>>>> shared.
>>>> Those interfaces also pull in some implementation classes. The logical
>>>> API
>>>> extends into shared this way. Effectively the majority of shared is
>>>> exposed
>>>> by the client API. The client API does not end at its jar boundary.
>>>> All this exposure increases the chances of API change when all
>>>> implementation details are wide open and part of the client API.  And
>>>> this
>>>> is what I'm trying to limit. There are ways we can decouple these
>>>> dependencies very nicely with a mixed bag of refactoring techniques
>>>> while
>>>> breaking up shared-ldap into lesser more coherent modules. The idea is
>>>> to
>>>> expose the bare minimum of only what we need to expose. Yes the shared
>>>> code
>>>> has become very stable over time but the most stability is in the
>>>> interfaces
>>>> and if we only expose these instead of implementation classes then we'll
>>>> have an awesome API that may remain 1.X for a while and not require
>>>> deprecations as new functionality is introduced.
>>>>  How will you limit the visibility of the modules you don't want the
>>> user to
>>> be exposed to ?
>>>  A combination of refactoring techniques will be used to be able to
>> better
>> use standard Java protection mechanisms to hide implementation details
>> combined with using OSGi bundles instead of Jars to only export those
>> packages that we do want users to see.
> Let's see what it brings. I have the feeling that discussing about pros and
> cons ad nauseam will bring less light than a simple experiment. Let's be
> darwinist in this area, weak solutions will perish by lack of merit.
Perfect !

>  This is extremely painful to do such a cleanup without first decoupling
>>> all
>>> the pieces by creating separate jars, before regrouping the packages back
>>> again.
>>>  Why bother regrouping? We can regroup things for convenience if people
>> want
>> a single jar without deps like the shared-all thingy.
>> We should not be uncomfortable having multiple modules to better decouple
>> this big hunk of code, and isolate coherent pieces as units.
> My perception was that in Studio, having tens of modules were painful. We
> added the shared-all module, but most of the case, I think this creation of
> zillions of modules is quite artificial. However, at some point, it could
> help to have dedicated modules.
> We already have a separation by using packages, the question is how to
> correctly split the big ball of mud in smaller but useful modules.
> For instance, to me, it makes sense to have a separate ASN1 module, for the
> sake of hiding this detail to the user. Really, who cares about ASN.1 ? Why
> do we have to expose the ASN1 classes in the ldap-api ? So this is a valid
> reason.

We don't need to, yeah, but it's nice to pull it out to break up dependencies
and hide implementation classes. But somehow I see many ASN.1 things often
being needed in the server even in higher levels.

> Another example of good separation is the DSML module. It's not part of the
> core LDAP api, and if I, as a user, don't do DSML, why should I be forced to
> include it in my dependencies? This is also a valid reason for having DSML
> be a separate module.

> OTOH, and we probably went too far back in September, we don't force the
> user to declare a dependency on either a big shared-all with many parts he
> is not interested in, or many dependencies on many small jars.
> We have to find some balance here, and the suggested separation (in your
> first mail) is probably making a lot of sense (see below).

>  The question here is more to know how far we want to go, considering that
>>> shared contains 900 classes, more than 5600 methods and around 80
>>> packages.
>>>  Yep it's big but the problem here is not massive. It starts slowly
>> solving a
>> couple things and once you decouple a few things, decoupling others
>> becomes
>> much much easier and a layout to all of it starts falling out nicely,
>> which
>> shows even if we dumped here and created some cleanup issues for ourselves
>> the overall code really was written well.
> Tooling could help. I don't know which tool exactly, but this is an area we
> never really explored.
I use IDEA's stuff for refactoring and code analysis. IDEA does it much
better than Eclipse. Then I switch to Eclipse for regular coding. I might
just stay in IDEA from now on - digging IDEA 10 it's really fast.

>>       (2) Breaking up shared into multiple Maven modules so now there's
>>> the
>>>> following modules:
>>>>           o shared-util
>>>>           o asn1-api
>>>>           o asn1-ber
>>>>           o ldap-model
>>>>                  - name pkg
>>>>                  - message pkg (no impl classes)
>>>>                  - schema pkg
>>>>                  - cursor pkg
>>>>                  - filter pkg
>>>>                  - entry pkg
>>>>                  - constants pkg
>>>>           o ldap-codec (not complete)
>>>>  I would not have 2 maven modules for asn1. It's probably overkill. I
>>> would rather name the ldap-model ldap-api, because this is exactly what
>>> it
>>> is.
>>>  There are reasons for this to be able to get the codec to be separable.
>> Once
>> you get in and play with the little non-important details you'll probably
>> come to the same conclusion yourself.
> In fact, we have ber and der codec. I'm not sure I want to expose that in
> LDAP. If we have used an asn1 compiler to generate the codecs, then yes, we
> would have had 3 modules : asn1-api, asn1-ber and asn1-der. Plus the
> generated codec.
> I don't know if it is worth the effort here. Exposing a monolithic asn1
> module should not be a big issue. It won't change.
I'm going off how well I can break up dependencies here - that's why I
created asn1-api and asn1-ber (maybe should have called it asn1-impl). But
we have the option of consolidation later on once we can look at the
dependencies between modules when all the dependency cleanups are done.

I think then we'll have a better picture of what should be grouped together.
For now let's just clean up as best as possible then see how the independent
blocks can be put together.
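
To make "looking at the dependencies between modules" concrete, here is a small sketch that checks a module dependency graph for cycles. The module names are the ones listed above, but the edges in main are invented for illustration and do not describe the real state of the shared modules. ModuleCycleCheck is a hypothetical helper, not an existing tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: depth-first search for a cycle in a module graph.
final class ModuleCycleCheck {

    private ModuleCycleCheck() { }

    /** Returns one cycle as a list of module names, or an empty list if none exists. */
    static List<String> findCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();
        for (String module : deps.keySet()) {
            List<String> cycle = visit(module, deps, done, new ArrayList<>());
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return Collections.emptyList();
    }

    private static List<String> visit(String module, Map<String, List<String>> deps,
                                      Set<String> done, List<String> path) {
        if (done.contains(module)) {
            return Collections.emptyList(); // already fully explored, no cycle through it
        }
        int at = path.indexOf(module);
        if (at >= 0) {
            // The module is already on the current path: close the loop and report it.
            List<String> cycle = new ArrayList<>(path.subList(at, path.size()));
            cycle.add(module);
            return cycle;
        }
        path.add(module);
        for (String dep : deps.getOrDefault(module, Collections.emptyList())) {
            List<String> cycle = visit(dep, deps, done, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);
        done.add(module);
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Invented edges, for illustration only.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("shared-util", Collections.emptyList());
        deps.put("asn1-api", Arrays.asList("shared-util"));
        deps.put("asn1-ber", Arrays.asList("asn1-api", "shared-util"));
        deps.put("ldap-model", Arrays.asList("shared-util"));
        deps.put("ldap-codec", Arrays.asList("ldap-model", "asn1-ber"));
        System.out.println("cycle: " + findCycle(deps));

        // A back edge from the model to the codec, the kind of coupling to remove.
        deps.put("ldap-model", Arrays.asList("shared-util", "ldap-codec"));
        System.out.println("cycle: " + findCycle(deps));
    }
}
```

Maven itself refuses to build modules that depend on each other cyclically, so a check like this is mostly useful at the package level, before the packages have been split into separate modules.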

>  Let me propose a methodology we can follow here to speed things up without
>> needlessly arguing each point because in the end we do in fact come to the
>> same conclusions.
>> Let's just relax while decoupling about the number or name of modules. The
>> first pass should be about breaking up dependencies to hide the
>> implementation details so we're free to solve these aggressively without
>> inhibitions.
>> Then once we see a clear dependency between modules, we can take another
>> pass at consolidation as a separate concern and discussion. Until we see
>> the
>> real dependency picture fall out from refactoring it's moot over
>> discussing
>> it.
> Ok, I buy that. As I said earlier, discussion is good to have, but it's not
> as valid as action. We can move back and forth with the code as a base for a
> further discussion anyway.


>  They are helper classes. They certainly don't belong to
>>> ldap-model/ldap-api, and if they have to stay in shared, I would like to
>>> move it to utils.
>>>  We discussed this last night. Just wanted to point out the clarification
>> you
>> made to me.
>> Shared will have shared-utils for generic utility classes that can be used
>> by anything not just LDAP code. Then there may be a shared-ldap-util but I
>> think this might be overkill and not such a good idea: let me explain
>> technically why:
>> If we dump these ldap specific utility classes into an ldap-util, then a
>> dependency to one util class pulls in utility classes in the rest of the
>> module increasing footprint perhaps needlessly.
>> What is needed for minimal generic client operation should be kept
>> together
>> with as little dependencies as possible. No need to expose the plethora of
>> utility classes we have amassed in there.
>> Yes they are very useful utilities but it's not about packaging freebies
>> it's about keeping things small, tight, and minimizing exposure. We can
>> package these things into a separate jar and only use them in studio and
>> in
>> apacheds.
>> So in conclusion what I am saying is the general formula we have become
>> accustomed to where we throw all utility classes into one module no longer
>> works blindly for us. We need to think about what needless dependencies
>> and
>> classes this is including.
> Here, again, we need action now. Enough discussion, let's move to code. We
> can then discuss the pros and cons later, and iterate.
Excellent - we're on the same base.

Alex Karasulu
