cocoon-dev mailing list archives

From "Jimmy Zhang" <crack...@comcast.net>
Subject Re: [ANN] VTD-XML Version 1.5 Released
Date Mon, 20 Feb 2006 19:53:02 GMT
Random access is defined by DOM, like navigating from an element
to one of its child elements, or to one of its attributes...
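
To make that notion of random access concrete, here is a minimal sketch using the standard JAXP/DOM API rather than VTD-XML itself. The class name `DomRandomAccess`, the method `describe`, and the sample document are hypothetical, chosen only for illustration. It loads the whole tree into memory, then steps from an element to its first child element and to one of its attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration class; not part of VTD-XML or Cocoon.
public class DomRandomAccess {

    // Builds an in-memory DOM tree, then returns
    // "<name of first child element>:<value of attrName on the root>".
    static String describe(String xml, String attrName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element root = doc.getDocumentElement();

        // Navigate from the element to its first child element,
        // skipping text and comment nodes.
        Node child = root.getFirstChild();
        while (child != null && child.getNodeType() != Node.ELEMENT_NODE) {
            child = child.getNextSibling();
        }
        String childName = (child == null) ? "" : child.getNodeName();

        // Navigate from the same element to one of its attributes
        // (getAttribute returns "" when the attribute is absent).
        String attr = root.getAttribute(attrName);
        return childName + ":" + attr;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe("<order id=\"42\"><item/><note/></order>", "id"));
    }
}
```

Both navigation steps work only because the full hierarchy is already in memory, which is the tradeoff under discussion.
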
Interesting that you sound quite negative... if I were you, I would
reserve any judgement and try to understand VTD-XML as well
as I can, before making claims that it cuts corners...
Cheers,
Jz

----- Original Message ----- 
From: "Stefano Mazzocchi" <stefano@apache.org>
To: <dev@cocoon.apache.org>
Sent: Monday, February 20, 2006 11:21 AM
Subject: Re: [ANN] VTD-XML Version 1.5 Released


> Jimmy Zhang wrote:
>> Hi, Thanks for the email.
>> My answers to your questions:
>> 1. It is a tradeoff: VTD-XML consumes more memory, but
>> is easy to use and more powerful. Any random-access-capable XML 
>> processing API *needs* to at least load the entire hierarchical structure 
>> in memory. My take is that among SAX, StAX, DOM
>> and JDOM, VTD-XML is the least likely one to choke, and the best one
>> to handle peak loads...
>
> whatever
>
> most XSLT cases *DO NOT* need to load the XML in memory to be able to 
> process it. Unless you abuse xsl:sort or xpaths with .., most things can 
> be done with pure event-driven pipeline style, and only a small buffer 
> needs to be kept in memory.
>
> Xalan XSLTC is able to pre-process xslt stylesheets and compile them into 
> code that will know how much buffer to keep because it knows what kind of 
> xpath events will be called on the incoming stream.
>
>> 2. Agreed: benchmarking against a dummy SAX parser is unfair to VTD-XML;
>> a real-life scenario would make VTD-XML look prettier.
>
> whatever #2, playing smartass (and avoiding the issue that I mentioned) is 
> unlikely to make your points more solid.
>
>> 3. Look at all the vertical-industry XML-related vocabularies, SOAP,
>> REST, XML Schema, and the infoset data model: DTD seems a bit
>> deprecated, and VTD-XML doesn't support external entities... other than that
>> VTD-XML is equally capable.
>
> I agree that DTDs should be deprecated and seem like an SGML vestigial 
> feature.
>
> My point is that it's unfair to pit a fully compliant XML parser 
> against a parser that knows how to cut corners (and therefore doesn't have to 
> scan the text for entities to expand!).
>
> If Xerces were allowed to get away without parsing entities and 
> didn't have to create strings, it would be just as fast as yours.
>
> BTW, you have not answered these questions:
>
>>> You claim XPath random access, but what is the algorithmic complexity 
>>> of that? O(1), O(log(n)), O(n), O(n*log(n))? If one were to store the 
>>> parsed tree index on disk, how many pages would one need to page in 
>>> before reaching the required xpath?
>
> -- 
> Stefano.
>
> 
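
For contrast, the event-driven style described in the quoted message can be sketched with the standard JAXP SAX API. This is only an illustration under stated assumptions: the class name `StreamingCount` and the method `countElements` are hypothetical, and the task (counting elements with a given name) was picked because it needs no lookback at all. The handler keeps a single counter and never builds a tree.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Xalan, Xerces, or Cocoon.
public class StreamingCount {

    // Counts elements called `name` while the parser streams through the
    // input. The only state kept is one integer.
    static int countElements(String xml, String name) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // The default factory is not namespace-aware, so qName
                // carries the element name as written in the document.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                       handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<list><item/><item/><other/></list>", "item"));
    }
}
```

Queries that look backwards or sideways in the document (the `..` and `xsl:sort` cases mentioned above) do not fit this pattern and need a larger buffer.
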


