forrest-dev mailing list archives

From Antonio Gallardo <agalla...@agssa.net>
Subject Re: [RT] A new Forrest implementation?
Date Mon, 21 Aug 2006 09:39:37 GMT
Ross Gardler wrote:
> Tim Williams wrote:
>> On 8/14/06, Ross Gardler <rgardler@apache.org> wrote:
>> minimal requirement: an XMLConsumer/XMLProducer (easy and natural, sax
>> event handlers and a single method respectively) and some simple
>> lifecycle contract methods needed for being a part of the managed
>> environment.  
>
> I really should have been talking about the complexities of writing a 
> generator, as we very rarely need to write transformers. Try writing a 
> generator that, for example, uses hibernate to communicate with a 
> relational database.
A hibernate generator does not make sense at all. You should ask people 
who use hibernate with cocoon. BTW, what do relational databases have in 
common with forrest? Are we discussing cocoon's goals or forrest's? 
Please explain, because it is not clear to me.
>
>> I think being in some sort of managed environment (e.g.
>> Spring) is likely needed in any real-world approach.  So I'd turn this
>> around and ask where is the complexity?
>
> First complexity: building Cocoon
>
> Second complexity: building any component that has additional 
> dependencies
>
> Third complexity: deploying a new (non-trivial) component within a plugin
>
> Fourth complexity: a community that is pulling in many different 
> directions
>
> There are many more but I will leave it at that. If you don't agree 
> then I suggest you actually try it before arguing the case. You can 
> then tell me where I am going wrong.
>
> Of course, it can be argued that 1-3 are because Forrest was built 
> against a much older version of Cocoon and has failed to keep up (for 
> example, why are plugins not Cocoon blocks?). I would respond that this 
> is because of the fourth complexity.
>
> So, then it can be argued that we should be contributing to Cocoon and 
> helping resolve the fourth complexity. That may be the outcome of this 
> RT, it may not.
For the record, the cocoon community *recommended* that all related projects 
(including forrest) stay with 2.1 *until* 2.2 is stabilized. This is 
what lenya, hippo and daisy did, and I did not see this kind of crisis 
there. So is this our own fault? Should the solution be to claim that 
cocoon is not for us, or are there other reasons?
>
>
>>> Decide what output plugin to use
>>> --------------------------------
>>>
>>> This is done by examining the requested URL. The actual selection of 
>>> the
>>> output plugin is done within the Cocoon sitemap. I have all the same
>>> arguments here as I do for input plugins, this only needs to be a 
>>> simple
>>> lookup, not a complex pipeline operation.
>>
>>
>> I get the feeling you're basing this on the simplest use-case
>> imaginable.  The output plugin is about the format of the output not
>> the content of the output.  The sitemap benefits here allow for more
>> complex processing (e.g. user profiling, smart content delivery, etc.)
>
> I disagree. The sitemap is a way of *configuring* this complex 
> processing, it is not the processing itself. The sitemap has become an 
> XML programming language and I hate it for that reason.
>
> Have you ever dived into the implementation and tried to do anything 
> useful in there?
Why is this so important? Is this forrest business?
>
> The fact that the sitemap had become a programming language is one 
> reason why Cocoon came up with the flow engine (e.g. to get rid of 
> actions). But if you use the flow engine then you are programming with 
> Javascript, it's only a small step from there to Java. So are there 
> any benefits in using Javascript over Java?
You should know that javascript is not java. It is not even a language 
close to java. Netscape used the name just because java was a buzzword at 
that time. Hence, it is not just a small *jump* to java. BTW, 
javascript is a programming language that is supposed to be used by HTML 
developers, right?
>
> In my opinion the answer is a resounding no, at least for our use case.
>
>>> Generate the output format
>>> --------------------------
>>>
>>> This is typically done by an XSLT transformation and/or by a third 
>>> party
>>> library (i.e. FOP) I have the same arguments here as I do for the
>>> generation of internal format documents, in fact the parts of Cocoon we
>>> use are identical in both cases.
>>
>>
>> Yeah, output is just a transformer.  Same thoughts as above.
>
> OK, back to aggregation since I argued earlier that it belongs here.
>
> Aggregation is nothing more than the collation of a number of 
> resources in response to a single request. It turns a single request 
> to a number of requests. Each individual request is handled just like 
> any other request. So what you have is a locationmap something like 
> this:
>
> <map match="foo/bar/**">
>   <aggregate>
>     <location src="..." required="true"/>
>     <location src="..." required="false"/>
>   </aggregate>
> </map>
>
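To make the quoted aggregation idea concrete, here is a minimal Java sketch. It is not code from forrest or cocoon. The `Location` class, the `aggregate` method and the resolver function are hypothetical names, and the treatment of `required` (fail when a required source is missing, silently skip a missing optional one) is only an assumption about what the attribute is meant to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class AggregateSketch {
    /** One <location src="..." required="..."/> entry (hypothetical model). */
    static final class Location {
        final String src;
        final boolean required;

        Location(String src, boolean required) {
            this.src = src;
            this.required = required;
        }
    }

    /**
     * Turns one request into several: each location is resolved like any
     * other request, and the results are collated in declaration order.
     * Assumption: a missing required location is an error, a missing
     * optional one is skipped.
     */
    static List<String> aggregate(List<Location> locations,
                                  Function<String, Optional<String>> resolver) {
        List<String> parts = new ArrayList<>();
        for (Location loc : locations) {
            Optional<String> content = resolver.apply(loc.src);
            if (content.isPresent()) {
                parts.add(content.get());
            } else if (loc.required) {
                throw new IllegalStateException("Required location missing: " + loc.src);
            }
        }
        return parts;
    }
}
```

The resolver is passed in as a function so that the same loop works whether a source comes from the file system, a plugin or a cache.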
>>> Caching
>>> -------
>>>
>>> Cocoon's caching mechanism is pretty good, but it has its limitations
>>> within Forrest. In particular, we have discovered that the Locationmap
>>> cannot be cached efficiently using the Cocoon mechanisms.
>>
>>
>> This may be true. We had a novice working on LM caching at the time
>> and I've learned quite a bit since then.  I'd like to re-evaluate this
>> before I'm willing to agree with such a bold statement.
>
> This illustrates my point exactly. I looked at this too and also 
> failed to get a better solution.
>
> The reason I failed (and I guess the same for you) is that the code is 
> just so complex and jumbled that it's next to impossible to find one's 
> way around once one gets past the API.
Why did you not ask for help on the cocoon list? I am sure there are 
some people willing to help. Just because we were unable to implement a 
better cache mechanism, it does not mean the code is complex.
>
>>> This is now
>>> one of the key bottlenecks in Forrest.
>>
>>
>> Based on?  I'd like to see this profiling data.  Knowing that the LM
>> is our way ahead I've been worried about squeezing every ounce where
>> we could but I was still under the impression that it isn't a
>> consequential performance bottleneck.
>
> Try building the Cocoon docs. It's set up on a Forrestbot in our zone. 
> Even when co-located on the same physical machine as the source for 
> the content it takes over 30 minutes to build. It really is a horrible 
> solution.
>
> If you want to profile it then you can get the forrest site from the 
> Cocoon-Whiteboard.
>
> This is an extreme example case, but one that is quite common in my 
> experience using Forrest to do real document processing (as opposed to 
> web site generation).
>
>>> We could work with Cocoon on their caching mechanism but there seems
>>> little interest in this since our use case here is quite unusual. Of
>>> course, we can do the work ourselves and add it to Cocoon. But why not
>>> use a caching mechanism more suited to our needs?
>>
>>
>> So it's not 100% suitable so it's worthless?  It fits 98%
>> of our needs so I don't see this as a compelling argument.
>
> That's unfair. I'm saying it is not perfect, therefore it is not 
> necessary to use it. I did not say it is not perfect so let's get rid 
> of it. Please take this in the context of all the other problems I am 
> highlighting rather than considering it as a single point.
>
> Besides it doesn't work for the locationmap, so in fact it is not used 
> in some of the processing of every single request we make. That's 
> considerably more than "2%"
>
>>> Ready Made Transformations
>>> --------------------------
>>>
>
> ...
>
>> You seem to be
>> suggesting that Cocoon requires some big overhead to do transforms and
>> that's simply not the case.
>
> That's right, I call 40Mb of bloat a fairly big overhead for doing XSLT 
> transformations.
>
> This time I really am oversimplifying, but I hope you see my point - 
> certainly that is how my customers see it. As a result I ended up, in 
> most cases, writing a series of Java components that I wired together 
> manually and plugged directly into whatever framework they were using. 
> This RT is about doing this in a more flexible and reusable way.
Yes, it really is oversimplified: claiming that we use 40MB of 
cocoon just for XSLT transformations is like saying we should close 
forrest, because that is what forrest does too. Or perhaps we should 
distribute a couple of XSLT files instead, right? I am taking into 
account that support for XSLT transformations is already built into java 
1.5. BTW, the cocoon kernel is only 1.4 MB. :-)
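To illustrate the point about the JDK, here is a minimal sketch that runs an XSLT transformation with nothing but the `javax.xml.transform` API that ships with java 1.5 and later. The stylesheet, the input document and the class name `JdkXsltSketch` are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class JdkXsltSketch {
    // Hypothetical stylesheet: emits the text of /document/title.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' "
        + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/document'>"
        + "<xsl:value-of select='title'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the given stylesheet to the given XML using the JDK's built-in processor. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers need not deal with a checked exception.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<document><title>Hello Forrest</title></document>";
        System.out.println(transform(xml, XSLT));
    }
}
```

This covers only the transformation step itself. It says nothing about pipelines, caching or plugin selection, which is where the rest of this thread disagrees.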
>
>>> This complexity makes it difficult for newcomers to get started in 
>>> using
>>> Forrest for anything other than basic XSLT transformations.
>>
>
> ...
>
>>  My point is that newcomers are
>> going to find it difficult to deal with any framework that attempts to
>> achieve anything beyond the simplistic.
>
> Yes, but if the framework is designed to do one job (publishing in our 
> case) then it is simpler to understand than if it is designed to do 
> every job (as with Cocoon).
There is no need to use every cocoon block.
>
>>> The end result is that we have only one type of user - those doing XSLT
>>> transformations.
>>>
>>> Plugin Selection
>>> ----------------
>>>
>>> This is done through the sitemap. This is perhaps where the biggest
>>> advantage of Cocoon in our context can be found. The sitemap is a 
>>> really
>>> flexible way of describing a processing path.
>>>
>>> However, it also provides loads of stuff we simply don't need when all
>>> we are doing is transforming from one document structure to another. 
>>> This
>>> makes it complex to new users (although having our own sitemap
>>> documentation would help here).
>>>
>>> Finally, as discussed in the previous section, we don't need a complex
>>> pipeline definition for our processing, we just need to plug an input
>>> plugin to an output plugin via our internal format and that is it. We
>>> have no need for all the sitemap bells and whistles.
>>
>>
>> I'm struggling to figure out what you think is forcing us into our
>> current apparently overly complex solution.  Is it the sitemap grammar
>> that is complex?  
>
> Not the grammar itself (although I do hate the fact that we are now 
> programming using the sitemap). The complexity is in the processing of 
> that grammar, which results in the selection of the processing path to take.
>
> All we need to do is select the right plugins and make them work 
> together. Look at how many internal pipeline requests there are to do 
> this in Forrest now (it's even worse if we use the dispatcher).
>
> This is overly complex for what is ultimately a couple of lookups.
>
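What "a couple of lookups" could mean in code is sketched below. This is a hypothetical illustration, not how forrest selects plugins today: the `PluginLookupSketch` class, the idea of keying output plugins on the URL extension, and modelling a plugin as a plain function from the internal document to the output are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

class PluginLookupSketch {
    // Output plugins keyed by requested extension (assumed selection rule).
    private final Map<String, UnaryOperator<String>> outputPlugins = new HashMap<>();

    void register(String extension, UnaryOperator<String> plugin) {
        outputPlugins.put(extension, plugin);
    }

    /** Returns the extension of the last path segment, or "" if there is none. */
    static String extensionOf(String url) {
        int dot = url.lastIndexOf('.');
        int slash = url.lastIndexOf('/');
        return dot > slash ? url.substring(dot + 1) : "";
    }

    /** Looks up the output plugin for the URL and applies it to the internal document. */
    String render(String url, String internalDocument) {
        UnaryOperator<String> plugin = outputPlugins.get(extensionOf(url));
        if (plugin == null) {
            throw new IllegalArgumentException("No output plugin for " + url);
        }
        return plugin.apply(internalDocument);
    }
}
```

A lookup like this says nothing about the more complex cases raised earlier in the thread (user profiling, smart content delivery), which is exactly the trade-off under discussion.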
>> Learning curves aside, I'd rather sit on top of a framework that
>> supports a more complex solution than is my current problem because
>> experience has shown me that the initial requirements grow and I don't
>> want to have to port when that growth happens.
>
> This is exactly why I hate "catch all" frameworks. They try to be all 
> things to all people. I prefer to use what I need now and look at 
> expanding things when I find a use case that requires it. How can you 
> know in advance that the framework you choose is going to be adequate 
> for the job in hand? How do you know you won't need Struts, or Ruby On 
> Rails, or Wicket or SpringMVC or whatever?
>
> This is personal opinion and we should really leave it at the door. 
> Different people for different things. Our job is to decide what is 
> best for the project not for us as individuals. I'll just leave you 
> with one thought...
>
> If I'm going hiking I do not struggle carrying a family tent on my 
> back just because I may have some more children at some point in the 
> future.
Sure, and this is why you should use only the cocoon blocks that are 
needed for your work.
>
>>> Conclusion
>>> ----------
>>>
>>> Cocoon does not, IMHO, bring enough benefits to outweigh the 
>>> overhead of
>>> using it.
>>>
>>> That overhead is:
>>>
>>> - bloat (all those jars we don't need)
>>
>>
>> this is going to be addressed with maven (argghhh) and/or osgi someday
>> - it's a recognized issue by many cocooners.
>
> "someday" is the operative word there. I've been waiting too long.
>
> If we reject this RT based on this argument then I want to see Forrest 
> developers helping Cocoon sort this out rather than standing by 
> waiting for it to happen.
As I said initially, it was a bad decision to start using 2.2 too 
early, and this is the cause of all these problems. I think it is not 
cocoon's fault after all: as I said, the cocoon community gave advice 
that was rejected by forrest. It is our own "mea culpa".
>
>>> - complex code (think of your first attempt to write a transformer)
>>
>>
>> I've never written a transformer.  I suspect that I could do it in a
>> day or less though depending upon the requirements.  It's simply
>> implementing XMLConsumer by handling SAX events, not that
>> extraordinary for a SAX-stream-based framework.  How do the many other
>> pipeline frameworks do transforms if not by handling SAX events?
>
> Yes, transformers are simple. I should have picked non-trivial 
> generators as discussed above. Especially since this is a more common 
> requirement in the real world. That is, we need input plugins to 
> interface with existing corporate legacy code.
>
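For readers who have not written one, a transformer that works by handling SAX events can be sketched with the JDK alone. The sketch uses `org.xml.sax.helpers.XMLFilterImpl` as a stand-in. It is not cocoon's transformer contract, which adds lifecycle and setup methods on top. The class name `RenameElementFilter` and the rename rule are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

/** Passes every SAX event through unchanged, except that one element is renamed. */
class RenameElementFilter extends XMLFilterImpl {
    private final String from;
    private final String to;

    RenameElementFilter(String from, String to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (from.equals(qName)) {
            super.startElement(uri, to, to, atts);
        } else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (from.equals(qName)) {
            super.endElement(uri, to, to);
        } else {
            super.endElement(uri, localName, qName);
        }
    }

    /** Parses the XML, runs it through the filter and serializes the result. */
    static String apply(String xml, String from, String to) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            RenameElementFilter filter = new RenameElementFilter(from, to);
            filter.setParent(factory.newSAXParser().getXMLReader());

            // An identity transformer is used here only as a serializer.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(
                new SAXSource(filter, new InputSource(new StringReader(xml))),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `RenameElementFilter.apply("<doc><b>x</b></doc>", "b", "strong")` renames the `b` element and leaves everything else as it was. A generator that talks to legacy code has no such ready-made base class, which is the distinction being drawn in the quoted text.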
>>> - complex configuration (sitemap, locationmap, xconf)
>>
>>
>> Like component managers nowadays, we've failed to strike a good
>> balance between flexibility (configurability) and ease of use.
>
> I really can't agree with the "like component managers nowadays" part. 
> Have you actually worked with something like Spring? It is 
> unbelievably simple.
AFAIK, cocoon is migrating to spring.
>
>>> - based on Avalon which is pretty much dead as a project
>>
>>
>> They are at least partially migrated to Spring for management
>> purposes.   I understood that as a move to eventually migrate fully
>> from Avalon to Spring.
>
> Don't be fooled by the "headlines". Look into the code. Until the 
> Avalon jars are gone then my point stands. Until someone here gets 
> into the Cocoon code and starts trying to disentangle things then my 
> point stands.
The avalon jars will be there for a long time, because 
cocoon should stay backward compatible with all the components created 
in the last few years for our user base. Hence the presence of the avalon 
jars in the distribution does not support your claim.
>
> Why don't I do that? I have other things to do, I need Forrest to be 
> useful, I don't use, and have never used, Cocoon independently of 
> Forrest (at least not commercially).
My case is rather the opposite. And I feel comfortable with cocoon. :-)

Best Regards,

Antonio Gallardo

