cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [Bug] Default-reader depends on JTidy!!!
Date Thu, 08 May 2003 20:26:47 GMT

on 5/8/03 9:23 AM Bruno Dumon wrote:

> On Thu, 2003-05-08 at 15:26, Carsten Ziegeler wrote:
> 
>>Bruno Dumon wrote:
>>
>>>Hmm, for the xmlizer, it's still not like I expected it to be.
>>>
>>>The dependency moved out of the core, but the xmlizer will still be used
>>>transparently (if the html block is enabled). Would you mind if it is
>>>removed completely?
>>>
>>
>>Yes :)
>>
>>Now, the "trick" is that Cocoon provides a toSAX method in the
>>Cocoon source resolver interface. Internally, this method tests
>>against XMLizable and - if the object is not XMLizable - uses
>>the XMLizer component.
>>
>>I think this is a very useful feature, because whatever Source object
>>you have, you always get SAX events (ok, this is of course only
>>true for the supported mime-types).
>>And I personally really like this transparent handling.
>>
>>But I see the point, so one solution is to not use the above mentioned
>>toSAX() method whenever the transparent handling is not wanted.
>>
>>If, however, I'm the only one who thinks that the use of the xmlizer
>>is useful in some situations, we could discuss removing it completely.
>>But I personally doubt that I'm the only one :)
> 
> 
> Well, I don't know anymore. Now that I know about this behaviour, it
> doesn't bother me that much anymore. But for people who don't know about
> it, it will still cause confusion.
> 
> We should at least document this very well in the FileGenerator, and add
> a parameter to the FileGenerator to disable this behaviour. And maybe
> make it disabled by default, since you would probably be using the
> HTMLGenerator anyway if you want to parse HTML.
> 
> But let's hear if others have anything to say about this...

I really don't see Carsten's point.

We designed the cocoon generator concept *exactly* to adapt streams to
SAX. Since you *know* what kind of stream you are connecting to, you
know what generator to use to adapt things.

Having a transparent adaptation phase in the source that is not even
close to being as granularly controllable as our generation approach
seems wrong to me.

A transparent XMLizer is cool for the Excalibur component that doesn't
have the entire cocoon machinery around it, but here, well, it's not
only redundant, but potentially harmful because it does things
implicitly without a direct way to control them.

I personally see it as completely against the spirit of the sitemap's
declarativity.

Therefore, I vote +1 to remove it entirely: generators receive streams
of bytes and send streams of SAX events.
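
To make the contrast concrete, here is a minimal sketch of the two styles.
It is not Cocoon code: every type below (Source, XMLizable, Generator, the
two generator constants) is a simplified, hypothetical stand-in, and "SAX
events" are modelled as plain strings. It only shows where the conversion
decision is made in each approach.

```java
import java.util.ArrayList;
import java.util.List;

public class TransparentVsExplicit {

    /** Hypothetical stand-in for a source: a located stream plus its MIME type. */
    static final class Source {
        final String uri;
        final String mimeType;

        Source(String uri, String mimeType) {
            this.uri = uri;
            this.mimeType = mimeType;
        }
    }

    /** Hypothetical stand-in for an object that can emit its own events. */
    interface XMLizable {
        void toSAX(List<String> events);
    }

    /** Hypothetical generator contract: adapts one known kind of stream to events. */
    interface Generator {
        void generate(Source source, List<String> events);
    }

    /** Assumed behaviour: parses the stream as XML and nothing else. */
    static final Generator FILE_GENERATOR =
        (source, events) -> events.add("xml-parse " + source.uri);

    /** Assumed behaviour: cleans up HTML first, then parses the result. */
    static final Generator HTML_GENERATOR = (source, events) -> {
        events.add("html-cleanup " + source.uri);
        events.add("xml-parse " + source.uri);
    };

    /** Transparent style: the conversion is picked inside the call, unseen by the caller. */
    static List<String> transparentToSAX(Object source) {
        List<String> events = new ArrayList<>();
        if (source instanceof XMLizable) {
            ((XMLizable) source).toSAX(events);
        } else if (source instanceof Source) {
            Source s = (Source) source;
            if ("text/html".equals(s.mimeType)) {
                // Implicit step: the caller never asked for HTML cleanup.
                events.add("html-cleanup " + s.uri);
            }
            events.add("xml-parse " + s.uri);
        } else {
            throw new IllegalArgumentException("unsupported source: " + source);
        }
        return events;
    }

    /** Explicit style: the caller (think: the sitemap) names the generator to use. */
    static List<String> explicitToSAX(Generator generator, Source source) {
        List<String> events = new ArrayList<>();
        generator.generate(source, events);
        return events;
    }

    public static void main(String[] args) {
        Source page = new Source("page.html", "text/html");
        System.out.println("transparent: " + transparentToSAX(page));
        System.out.println("explicit:    " + explicitToSAX(FILE_GENERATOR, page));
    }
}
```

In the transparent variant the HTML cleanup happens because of the MIME
type, with nothing in the calling code to show it; in the explicit variant
the same cleanup only happens if HTML_GENERATOR is the one that was named.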

In fact, the source trying to do too much has been bugging me for a long
time: people look at the FileGenerator and it has become so thin that
they don't even know what's going on. This is, IMO, hurting the ability
of people to write their own generators and this is hurting us as a whole.

Too much cocoon machinery has been moved into the sourceresolving and
xmlutil avalon packages.

I would like to see it come back so that we, the cocoon community, can
control it better by seeing what's going on without forcing people to
subscribe to avalon-cvs.

How? One step at a time. For now, for example, give Generators back
their status and keep sources for what they were designed to be: stream
locators that have nothing to do with the stream they connect to.

I know this sounds rather radical, but we must take control back: source
resolving and xml utilities are *way too* fundamental for us.

Please, Carsten, don't get me wrong: I'm not saying that it's your
fault, but it's a simple fact that source resolving and XML utilities are
used in cocoon only (sorry, Peter, you count as part of Cocoon). The
fact that they are used here but developed there removes our ability
to oversee them and stop changes that we dislike as soon as they happen.

Subscribe to avalon-cvs, you say? Sure, but the signal/noise ratio for
a cocoon developer is simply too low.

So, maybe not right now, but in the future, I would like to see those
packages come back home and the part of cocoon that deals with those
packages be less factored-out, in favor of easier understanding of the
cocoon internals and a reduction of contracts.

Comments?

-- 
Stefano.


