xml-general mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: SAX
Date Sat, 13 May 2000 10:50:09 GMT
Michal Mosiewicz wrote:
> 
> Stefano Mazzocchi wrote:
> > [...]
> > You are clearly identifying a SAX producer as a parser or an XML adapter.
> >
> > If you think of a SAX producer as an XPointer implementation then you
> > ask for
> >
> >  file.xml#xpointer(/news/articles[@author='foo'])
> >
> > or even more powerful
> >
> >  file.xml#xql(whatever-XQL-will-look-like)
> >
> > and what it produces is exactly what you need for XML random access.
> 
> Ok - say, there are those article documents. Each of them has some
> /article/title, /article/img, /article/intro, and /article/body. Then,
> you want to make an index page, so the most obvious solution would be to
> pass them through some transformation selecting non-body content. Note
> that this not-visible body content may be a larger part of the document.
> The transformation doesn't generate any events, either on the
> /article/body element or on its subelements.
> 
> How could using XPointer or XQL help here? How would you prevent the
> parser from doing the useless job of generating unnecessary events?

That's the key point! You assume that whatever generates this data is a
parser. I never said so.

When you want to store XML data and do a bunch of querying on top of it,
you need a real DBMS which is able to index its XML content in such a
way that XPath or XQL queries are resolved _without_ the overhead of
parsing the entire XML structure.

It's exactly the same with RDBMS and primary key indexes.
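
As a minimal sketch of that idea (a toy, not a real XML DBMS): the
document is parsed once when it is stored, and a lookup such as
/news/articles[@author='foo'] is then answered from an index without
touching the parser again. The class name ArticleIndex and its methods
are hypothetical, made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class ArticleIndex:
    """Toy stand-in for an XML DBMS index (hypothetical, illustration only)."""

    def __init__(self):
        # author attribute value -> list of serialized <articles> fragments
        self._by_author = defaultdict(list)
        self.parse_count = 0  # how many times a document was actually parsed

    def store(self, xml_text):
        # The parsing cost is paid once, at store time.
        root = ET.fromstring(xml_text)
        self.parse_count += 1
        if root.tag != "news":
            return
        # Index the direct children matching /news/articles by @author.
        for article in root.findall("articles"):
            author = article.get("author")
            if author is not None:
                fragment = ET.tostring(article, encoding="unicode")
                self._by_author[author].append(fragment)

    def query_by_author(self, author):
        # Plays the role of /news/articles[@author='...']:
        # a dictionary lookup, no parsing involved.
        return list(self._by_author.get(author, []))


index = ArticleIndex()
index.store(
    "<news>"
    "<articles author='foo'><title>A</title></articles>"
    "<articles author='bar'><title>B</title></articles>"
    "</news>"
)
hits = index.query_by_author("foo")
```

The lookup plays the same role as a primary key index: the query never
sees the parts of the document it did not ask for.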

> What is the scenario here to reduce the computational cost of a
> transformation that is known to omit a large portion of a source
> document?

The above, or, in case you have a large number of documents and no XML DB,
pregeneration.
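
A minimal sketch of pregeneration, assuming the /article/title,
/article/img, /article/intro and /article/body layout described
earlier: a SAX filter drops the body once, at publish time, and the
index page is later built from the stored summary. The names DropBody
and pregenerate_summary are hypothetical.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class DropBody(xml.sax.ContentHandler):
    """Forwards SAX events downstream, except those inside /article/body."""

    def __init__(self, downstream):
        super().__init__()
        self._down = downstream
        self._path = []        # element names from the root to the current node
        self._skip_depth = 0   # > 0 while inside the skipped subtree

    def startElement(self, name, attrs):
        self._path.append(name)
        if self._skip_depth or self._path == ["article", "body"]:
            self._skip_depth += 1
            return
        self._down.startElement(name, attrs)

    def endElement(self, name):
        self._path.pop()
        if self._skip_depth:
            self._skip_depth -= 1
            return
        self._down.endElement(name)

    def characters(self, content):
        if not self._skip_depth:
            self._down.characters(content)


def pregenerate_summary(article_xml):
    # Run once per article when it is published; store the result and
    # build the index page from it instead of from the full document.
    out = io.StringIO()
    serializer = XMLGenerator(out, encoding="utf-8")
    xml.sax.parseString(article_xml.encode("utf-8"), DropBody(serializer))
    return out.getvalue()


summary = pregenerate_summary(
    "<article><title>T</title><intro>I</intro>"
    "<body><p>a long body</p></body></article>"
)
```

The parser still walks the body here, but only once per article rather
than on every request for the index page.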
 
> Also, this argument seems schizophrenic to me. First we all agree that
> an active API like SAX could do better for us, but then we use a passive
> API to prove that improvements to the active API are not necessary,
> because the above XPointer syntax is nothing but a return to the old
> passive API.

I don't understand this rant.
 
> > In the Cocoon project we did careful estimation of the requirement for
> > fragment caching and we agreed that it's much better to improve XInclude
> > functionalities and to cache entire documents, rather than having
> > document fragment caching.
> 
> Better than a careful estimation, I like a correct one. I cannot
> understand how XInclude would be a much better improvement. I'm talking
> about an improvement that is possible along the whole processing path -
> from content generators, which may sometimes not be required to generate
> the full content, to translators, and finally to the serializer, which
> is able to get the information about cacheable parts of the document and
> remember them in serialized form. XInclude can only be used to improve
> one side of the transformation (i.e. the producing side), and it
> requires that you operate on separate documents that may be included or
> not, but you cannot mark some content cacheable while passing it through
> some transformation.

I'm sorry but I don't understand the usefulness of this. Can you provide
an example?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------


