cocoon-dev mailing list archives

From Sylvain Wallez <>
Subject Re: XMLByteStreamInterpreter
Date Fri, 18 Jan 2002 17:29:54 GMT
Carsten Ziegeler wrote:

> Torsten Curdt wrote:
>>While using the XMLByteStreamCompiler and XMLByteStreamInterpreter
>>I found that the compiler will store all SAX events just fine, no matter
>>whether it's a valid DocumentFragment or just a fragment of an XML stream.
>>But the interpreter assumes that there is always an enclosing
>>startDocument/endDocument event pair and will otherwise only stop
>>with a SAXException "reached end of input".
>>Since the compiler supports it, I'd like to change the interpreter
>>not to throw an exception but to end normally at the end of input.
>>AFAICS this should have no impact on Cocoon itself.
>>(Carsten, I know the caching system makes extensive use of the
>>XMLByteStreamInterpreter. Can you see any problem with that?)
> No. The input taken from the compiler is still usable with the
> interpreter after such a change. As whole documents are compiled in
> the case of caching, this should be no problem.
>>If no one objects I'd like to change the interpreter so we have
>>the compiler to really record the events and the interpreter to
>>recall the events as recorded. (Right now I can record something
>>that I cannot recall with the interpreter without getting this
>>exception.)
>>If we really want to observe what gets recorded and recalled I propose to
>>change the interpreter as mentioned and create two new classes 
>>extending the simple
>>compiler and interpreter that will make sure only DocumentFragments will
>>be recorded and can be recalled.
>>What do you guys think?
> On the one hand the usage of both components gets more flexible. You
> can compile and interpret arbitrary nodes of XML.
> But on the other hand we lose validity checking of the interpreted
> byte stream. Currently the byte stream must contain a valid XML document.
> With your proposed change, the byte stream can contain any block of
> XML and I think it is very hard to check if all opened elements are
> closed (or if for each startElement event an endElement event is sent).

I have a "WellFormednessCheckerPipe" on my TODO list for our projects.
It's an XMLPipe that - as its name implies - checks that all elements
are well balanced, namespaces are properly defined, etc.

This is something that could be placed in front of XMLByteStreamInterpreter.
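
To make the idea concrete, here is a minimal sketch of the element-balance part of such a checker. It uses only the standard org.xml.sax classes rather than Cocoon's XMLPipe interface, and the class name BalanceCheckingFilter and its assertBalanced() method are hypothetical. Namespace checks are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch, not an existing Cocoon class: a SAX filter that
// tracks open elements and rejects unbalanced start/endElement events.
class BalanceCheckingFilter extends XMLFilterImpl {
    // Qualified names of the elements that are currently open.
    private final Deque<String> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        open.push(qName);
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (open.isEmpty()) {
            throw new SAXException("endElement(" + qName + ") without startElement");
        }
        String expected = open.pop();
        if (!expected.equals(qName)) {
            throw new SAXException("expected end of " + expected + " but got " + qName);
        }
        super.endElement(uri, local, qName);
    }

    // To be called once the whole stream (document or fragment) has been replayed.
    void assertBalanced() throws SAXException {
        if (!open.isEmpty()) {
            throw new SAXException("unclosed element: " + open.peek());
        }
    }
}
```

Because it does not rely on start/endDocument(), a check like this works the same way for a fragment as for a whole document.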

> Another problem I see is that the interpreter is an XMLProducer
> and I think that the contract for an XMLProducer is to stream
> a whole document. So we shouldn't break that contract.

There's XMLByteStreamFragment just for that. Currently, it pipes the 
output of XMLByteStreamInterpreter through an EmbeddedXMLPipe that 
strips out start/endDocument().
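
For readers who haven't looked at it, the stripping step amounts to something like the sketch below. This is not the actual EmbeddedXMLPipe source. It is an illustration built on the standard XMLFilterImpl, and the class name DocumentEventStripper is hypothetical.

```java
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch of the stripping behaviour: every SAX event is
// forwarded to the downstream handler except start/endDocument().
class DocumentEventStripper extends XMLFilterImpl {
    @Override
    public void startDocument() {
        // Swallowed: the enclosing pipeline has already started its document.
    }

    @Override
    public void endDocument() {
        // Swallowed: the enclosing pipeline will end the document itself.
    }
}
```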

> So it seems better to enforce that the compiler only compiles complete
> documents.

This is limiting: the compiler is very useful for buffering some content,
be it a document or a fragment. It avoids the overhead of DOM when you
just want to hold the content but not look at what's inside. See for
example the "capture" logicsheet.

> Another possibility is to explicitly add methods to the XMLSerializer
> and XMLDeserializer which tell that not a whole document is processed
> but only a fragment.

We can also consider that the choice between document and fragment
depends on the context where the data is deserialized, but isn't known
when the XML is serialized.

We could then say:
- XMLDeserializer is for documents (as it extends XMLProducer) and 
*always* calls start/endDocument(),
- XMLByteStreamFragment is for fragments and *never* calls
start/endDocument().

In that case, the compiler doesn't need to store start/endDocument 
events, because this is determined by the deserialization context.
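
A rough sketch of what this split could look like, using an in-memory event list in place of the real byte stream. The names EventBuffer, replayAsDocument and replayAsFragment are hypothetical, and only elements and character data are recorded to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the compiler/interpreter pair: records SAX
// events without start/endDocument and lets the caller decide on replay
// whether the content is a document or a fragment.
class EventBuffer extends DefaultHandler {
    interface Event {
        void replay(ContentHandler handler) throws SAXException;
    }

    private final List<Event> events = new ArrayList<>();

    // Document boundaries are deliberately not recorded.
    @Override public void startDocument() { }
    @Override public void endDocument() { }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        Attributes copy = new AttributesImpl(atts); // the caller may reuse atts
        events.add(h -> h.startElement(uri, local, qName, copy));
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        events.add(h -> h.endElement(uri, local, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        char[] copy = Arrays.copyOfRange(ch, start, start + length);
        events.add(h -> h.characters(copy, 0, copy.length));
    }

    // Fragment context: never emits start/endDocument().
    void replayAsFragment(ContentHandler handler) throws SAXException {
        for (Event e : events) {
            e.replay(handler);
        }
    }

    // Document context: always emits start/endDocument().
    void replayAsDocument(ContentHandler handler) throws SAXException {
        handler.startDocument();
        replayAsFragment(handler);
        handler.endDocument();
    }
}
```

The same recorded content can then be replayed either way, whether or not the original source sent start/endDocument().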

What do you think?

Sylvain Wallez
Anyware Technologies -
