xml-general mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: SAX
Date Thu, 11 May 2000 13:53:04 GMT
Michal Mosiewicz wrote:

Hi Michal, nice to see you around here :)
 
> I've sent it already to sax@megginson.com, but I thought that it might
> be worth mentioning here. I think that SAX could be much more robust
> if it were extended a little bit, and this list could be a good place to
> discuss it.
> 
> First, what bothers me about SAX is that it lacks bidirectional
> communication. I.e. there is a data producer that sends the whole
> document as events, and there is a consumer that is only expected to
> listen for those events and follow them.

yes, that's the main design decision behind SAX and I must say I find it
perfect for its job.
 
> IMHO, SAX event handlers should also return some informative codes,
> similarly to how it is implemented in Sun's taglibs, where tag handlers
> may return things like SKIP_BODY or SKIP_PAGE.

I don't think there is any need for this (read below)
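
To make the quoted proposal concrete, here is a rough sketch of what a handler return code could look like. Nothing in it is part of SAX: the CONTINUE and SKIP_BODY constants, the TitlesOnly consumer and the drive() producer are all hypothetical names, and an in-memory tree stands in for a real event source.

```python
import xml.etree.ElementTree as ET

# Hypothetical return codes; SAX itself defines nothing like this.
CONTINUE, SKIP_BODY = 0, 1


class TitlesOnly:
    """Hypothetical consumer that asks the producer to skip <body> subtrees."""

    def __init__(self):
        self.seen = []

    def start_element(self, name):
        self.seen.append(name)
        return SKIP_BODY if name == "body" else CONTINUE

    def end_element(self, name):
        pass


def drive(elem, handler):
    """Hypothetical producer that honours the handler's return code."""
    if handler.start_element(elem.tag) != SKIP_BODY:
        for child in elem:
            drive(child, handler)
    handler.end_element(elem.tag)


doc = ET.fromstring("<article><title/><body><p/><p/></body></article>")
consumer = TitlesOnly()
drive(doc, consumer)
# The two <p> children of <body> are never visited by the producer.
```

The producer only saves work here because it can walk its source selectively; a parser reading a file sequentially would still have to scan the skipped part.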
 
> All the parts of the producer-transformer-serializer path could possibly
> benefit from this:
> 
> 1. Some content producers may send more data than is needed to
> get the final document, so the transformations applied may shorten the
> source content, or even keep only small parts of it. While this may not
> be so necessary in the case of simple sequentially read files, it may
> enhance performance if the producer can omit some parts - for example,
> if you wrap database data in your XML or, more generally, if you can
> access the source document randomly.

You are clearly identifying a SAX producer as a parser or an XML adapter.

If you think of a SAX producer as an XPointer implementation, then you
ask for

 file.xml#xpointer(/news/articles[@author='foo'])

or even more powerful

 file.xml#xql(whatever-XQL-will-look-like)

and what it produces is exactly what you need for XML random access.
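
As a rough illustration of a producer that selects first and emits events second, here is a minimal sketch. It is not an XPointer implementation: it assumes ElementTree's limited XPath subset as a stand-in for the pointer expression, and produce(), emit() and the sample NEWS document are hypothetical names made up for the example.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def emit(elem, handler):
    """Replay one ElementTree element and its subtree as SAX events."""
    handler.startElement(elem.tag, AttributesImpl(dict(elem.attrib)))
    if elem.text:
        handler.characters(elem.text)
    for child in elem:
        emit(child, handler)
        if child.tail:
            handler.characters(child.tail)
    handler.endElement(elem.tag)


def produce(source, path, handler):
    """Hypothetical selective producer: only nodes matching `path` become events."""
    root = ET.parse(source).getroot()
    handler.startDocument()
    for match in root.findall(path):
        emit(match, handler)
    handler.endDocument()


# Sample document, invented for the example.
NEWS = """<news>
<articles author="foo"><title>One</title></articles>
<articles author="bar"><title>Two</title></articles>
</news>"""

out = io.StringIO()
# The path is relative to the <news> root, i.e. /news/articles[@author='foo'].
produce(io.StringIO(NEWS), "articles[@author='foo']", XMLGenerator(out, "utf-8"))
result = out.getvalue()
```

The consumer is an ordinary SAX ContentHandler (here the stdlib XMLGenerator) and never needs to talk back to the producer; the selection happens before the first event is sent.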
 
> 2. We can optimize transformers, especially if you pipeline several
> transformations, you can backtrace the events that don't produce any
> result, and optimise the whole transformation pipeline.

??? I don't get it. You spend time analyzing the event stream to mark
each event that doesn't trigger any output, and then you want to
reuse this information for other calls? This is useless, since the
producer may generate different events next time and you have to do
the same all over again.

> 3. There is also a potentially much larger gain in the serializer part,
> because this could allow for structure-level caching of the result, i.e.
> you could decide to cache some fragments of the output, not
> necessarily whole documents. Currently it's theoretically possible, but
> the document producer has no way of knowing that it is not required to
> provide some data because the cached output is still valid.

In the Cocoon project we carefully evaluated the requirements for
fragment caching and agreed that it's much better to improve XInclude
functionality and cache entire documents than to cache document
fragments.
 
> Does anybody know if there is any mailing list related strictly to SAX?

It should be xml-dev, but I don't remember where it is hosted now.
(anyone? Norman?)

Anyway, I don't see the need for what you ask.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------
