cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [RT] Converging the repository concept in cocoon
Date Tue, 02 Dec 2003 16:34:16 GMT

On 30 Nov 2003, at 22:04, Unico Hommes wrote:

>
>
> Stefano Mazzocchi wrote:
>>
>>
>> I'm working on Doco and I finished my first phase: I have a repository
>> that I like and that does what I need. It's Slide, in case you haven't
>> noticed ;-)
>>
>
> I did :-) I must say I am very happy you are doing this. I was involved
> with Slide some time ago but got frustrated with it because there were
> no decent store implementations for it. I must still have an X-Hive DB
> store implementation lying around somewhere. (B.t.w. I got the
> impression that there were fundamental flaws in Slide's design that
> would make it impossible to create a performant store implementation. I
> thoroughly hope these issues have been taken care of now.)

No, there are still design issues, but we'll deal with those later on.

>> So, now I have a WebDAV/DeltaV/DASL/ACL repository and I want to
>> connect to it.
>>
>> There has been a lot of work in the area of "Repository API" lately,
>> both inside and outside cocoon.
>>
>> Cocoon currently hosts four different repository concepts:
>>
>>   1) two in the linotype block
>>   2) one in the slide block
>>   3) one in the repository block (which is a refactoring of the
>> SourceRepository in linotype)
>>
>> the linotype repository is a big time hack: it does what linotype
>> needed, but it's not reusable outside (concerns overlap in its
>> interface). The SourceRepository is an implementation of the linotype
>> Repository over a source instead of over a file system. While nicer,
>> it inherits all the problems of the original interface. It does
>> versioning but it doesn't do properties or property querying.
>>
>
> Yep, which is exactly where you can see the whole concept starting to
> break.
>
>> the repository in the slide block uses slide directly and, mostly, for
>> authentication purposes... it's based on an older version of slide,
>
> Hmm, really? I thought not much has changed in Slide's API since 1.16.

Uh, I looked again and you are right. Still, I don't like this since it 
prevents me from decoupling cocoon from the repository.

>> doesn't handle versioning, doesn't handle file properties. It's based
>> on actions, generators and transformers. To me, it looks old, and the
>> need to have the repository on the local machine (and keep it opaque
>> to the outside world) makes it impossible to use for what I need.
>>
>
> I had already been thinking of resurrecting some of the stuff given your
> recent activity on the slide dev list. I've been lurking on that list for
> more than two years and can definitely confirm that Slide was
> previously dead compared to recent activity.

Hmmm, not sure it's worth the effort... I think a contract to WebDAV is
much more future compatible than a contract to the Slide API... also
because we might change that contract to use JCR when it's ready.

>> the one in the repository block is the cleanest one, but IMO, its
>> design is backwards. I'll explain what I mean in a second.
>>
>> For now, I think it's a must that, just as we did with forms, we take a
>> look at the various approaches and choose one to follow and ignore the
>> other ones.
>>
>
> Total, definite +1.

Good

>
>> I think the repository block is the best effort, but it needs
>> substantial redesign.
>>
>>                                           - o -
>>
>> First of all, let me introduce what I mean with a "repository".
>>
>> A repository is a place where I store my content.
>>
>> Functionality I need is:
>>
>>   1) open/save document
>>   2) create collection of documents
>>   3) attach metadata to documents (externally to them!!)
>>   4) query the repository against document metadata
>>   5) versioning (autoversioning on saving and version update)
>>
>> how all these functionalities are implemented should *NOT* be my
>> concern, nor do I want it to be when I'm using the repository.
>>
>> The linotype repository uses this design, while the one in the
>> repository block does not.
>>
>> Why not? Well, it's fully based on sources and tries to obtain the
>> above functionalities from the source abstractions. This means that the
>> contract is not on the API but on the source URL.... but this also
>> means that we cannot fully separate concerns since it's the driver of
>> the repository who chooses which source the repository needs to write
>> on.
>>
>
> This is not entirely true. If you take a look at the SourceRepository
> interface you will see no reference to Source whatsoever. All public
> methods only deal with Strings. I guess this means its naming is really
> bad. I was trying to avoid name clash with the slide block Repository.
> Of course, if you are talking about SourceRepository's only
> implementation you are entirely right ;-)

Yes, I looked at the implementation and I see your point. I'll follow 
up with a discussion on the right interfaces for a Repository class.

>> I strongly dislike this design because I think it got it all backwards:
>> it should be the Repository to implement Source and give source access
>> to those components who want to access content (say a FileGenerator or
>> even a TraxTransformer)
>>
>
> I can only agree with this wholeheartedly. Thanks for speaking your
> opinion!
>
>> I looked into the repository block and I find a *lot* of things
>> (locking, permissions, properties) that look very much like a
>> duplication of effort.
>
> Historically, this is exactly how it happened. All these interfaces
> were "designed" in the slide block. We've only migrated them from there
> to the repository block looking for an opportunity like this to discuss
> just what the hell we are going to do with them.
>
>> The Slide project spent years optimizing and polishing issues like
>> transactionality and locking, do you really want to implement a layer
>> to "emulate" those things in case the given source is not capable of
>> handling it itself?
>>
>
> You talking to me? ;-) definitely no! If slide (for the moment still
> quite reserved about that if you don't mind), JCR, whatever can provide
> that for me, I'm game!

:-)

>> I think a much better approach would be to come up with a
>>
>>   Repository.java
>>
>> interface and a few implementations that I can choose when I install
>> cocoon. This implementation would also implement Source.java and
>> provide its functionality thru a URL protocol.
>>
>> This allows:
>>
>>   - clear separation of concerns: cocoon should *NOT* be doing
>> repository stuff, which is already big and complex enough
>>
>>   - complete IoC: you choose the implementation and the implementation
>> decides what to do and how to do it. Your contract remains the same
>> (thru the source-provided URL protocol and thru the component
>> interface)
>>
>>   - transparent polymorphism: you can have different implementations of
>> a repository... file system, webdav, CVS, JCR, ... without having to
>> change any code in your application
>>
>> Thoughts?
>
> Well, yeah. I thought JCR was supposed to be this "Repository.java"?
> Why not just use that? Do we really need another layer?

I think so, yes. JCR is incredibly powerful, but exactly because of 
this power, it feels a little "low level". JCR is sort of a virtual 
hypergranular file system with multidimensions. Think of it as a 
persistent DOM with enhanced serializing and query functionalities.

I think you will always need a sort of "application oriented API" on 
top of JCR... just like you need business objects on top of a 
relational database.

So, JCR is a sort of "JDBC for hierarchical databases". You could use 
that directly, sure, no problem, but you end up with the same troubles 
that you do with using JDBC directly.

This is why I think we need a higher level "repository" API that is
*much* easier for people to learn and use, gives immediate
gratification compared to using a relational database or a custom file
system approach, and solves 80% of the content storage needs.

For that remaining 20%, you will need to connect to JCR directly, but 
that's another story and, for now, JCR is not even there so...
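
To make the idea a bit more concrete, here is a rough sketch of what such
a higher level contract could look like. It only mirrors the five needs
listed above (open/save, collections, external metadata, metadata query,
autoversioning). Every name in it (Repository, InMemoryRepository,
createCollection, setProperty, query, ...) is hypothetical and made up
for illustration: none of it comes from Cocoon, Slide or JCR, and the
in-memory class is just a toy stand-in for a real webdav or JCR backed
implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical application-level contract (an assumption, not an existing API).
interface Repository {
    void createCollection(String path);                        // need 2
    boolean isCollection(String path);
    int save(String path, String content);                     // needs 1 and 5: returns the new version number
    String open(String path);                                  // need 1: latest version
    String open(String path, int version);                     // need 5: a specific version, counted from 1
    void setProperty(String path, String name, String value);  // need 3: metadata kept outside the document
    List<String> query(String name, String value);             // need 4: paths whose metadata matches
}

// Toy implementation that keeps everything in memory. A webdav or JCR backed
// class could implement the same interface without callers noticing.
class InMemoryRepository implements Repository {
    private final Set<String> collections = new TreeSet<>();
    private final Map<String, List<String>> versions = new TreeMap<>();
    private final Map<String, Map<String, String>> properties = new HashMap<>();

    @Override public void createCollection(String path) { collections.add(path); }

    @Override public boolean isCollection(String path) { return collections.contains(path); }

    @Override public int save(String path, String content) {
        // Autoversioning: every save appends to the document's history.
        List<String> history = versions.computeIfAbsent(path, p -> new ArrayList<>());
        history.add(content);
        return history.size();
    }

    @Override public String open(String path) {
        return open(path, history(path).size());
    }

    @Override public String open(String path, int version) {
        return history(path).get(version - 1);
    }

    @Override public void setProperty(String path, String name, String value) {
        history(path); // reject properties on documents that do not exist
        properties.computeIfAbsent(path, p -> new HashMap<>()).put(name, value);
    }

    @Override public List<String> query(String name, String value) {
        List<String> hits = new ArrayList<>();
        for (String path : versions.keySet()) { // TreeMap keeps paths sorted
            Map<String, String> props = properties.get(path);
            if (props != null && value.equals(props.get(name))) {
                hits.add(path);
            }
        }
        return hits;
    }

    private List<String> history(String path) {
        List<String> history = versions.get(path);
        if (history == null) {
            throw new IllegalArgumentException("No such document: " + path);
        }
        return history;
    }
}

class RepositoryDemo {
    public static void main(String[] args) {
        Repository repo = new InMemoryRepository(); // the only line that names an implementation
        repo.createCollection("/docs");
        repo.save("/docs/intro.xml", "<doc>draft</doc>");
        repo.save("/docs/intro.xml", "<doc>final</doc>");
        repo.setProperty("/docs/intro.xml", "status", "published");
        System.out.println(repo.open("/docs/intro.xml"));
        System.out.println(repo.query("status", "published"));
    }
}
```

The calling code only ever sees the Repository type, so swapping the
implementation touches a single line. A real implementation would also
expose itself as a Source behind a URL protocol, which this sketch
leaves out.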

--
Stefano.
