cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [RT:Long] Initial Results and comments (was Re: Compiling XML, and its replacement)
Date Fri, 04 Apr 2003 10:59:06 GMT
Berin Loritsch wrote:
> Stefano Mazzocchi wrote:
> 
>> I'll also be interested to see how different the performance gets on 
>> hotspot server/client and how much it changes with several subsequent 
>> runs.
> 
> 
> Well, with HotSpot client and a 15.4 KB (15,798 bytes) test document
> (my build.xml file), I got the following results:
> 
>      [junit] Parsed 873557 times in 10005ms
>      [junit] Average of 0.011453173633775472ms per parse
> 
> Compare that to a much smaller 170-byte test document:
> 
>      [junit] Parsed 16064210 times in 10004ms
>      [junit] Average of 6.227508231030347E-4ms per parse
> 
> 
> The two documents are of completely different complexity,
> but the ratio of the results is:
> 
>      170b      .000623ms
>   --------- = -----------
>    15,800b      .0115ms
> 
> That's a size increase of 92.9 times
> 
> compared to a time increase of 18.5 times
> 
> 
> Times were comparable to Server Hotspot for this solution--although it
> was only run for 10 seconds.
> 
> Considering we have a 5:1 size-to-time scaling ratio, it would be
> interesting to see whether it carries over to a much larger XML file--
> if only I had one.  If that scaling held, a 1,580,000-byte file
> (100 times larger, so roughly 20 times slower) should only take
> about .23 ms to parse.

Are you aware that no Java method can be larger than 64KB of bytecode? 
And I'm fairly sure there is also a limit on how many methods a Java 
class can have.

So, in the end, there is a hard upper limit on how big your 
compiled-in-memory object can be.
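
To make that concrete, here is a rough sketch of what a naively compiled 
document tends to look like (purely illustrative, not your actual 
generated code): every SAX event becomes one more straight-line statement 
inside a single method, so the method's bytecode grows with the document.

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical shape of naively "compiled" XML: each element and text
// node adds a few more statements to emit(), so this one method's
// bytecode grows with the document and eventually hits the 64KB ceiling.
public class CompiledDocument {

    public void emit(ContentHandler handler) throws SAXException {
        handler.startDocument();
        handler.startElement("", "doc", "doc", new AttributesImpl());

        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "name", "name", "CDATA", "foo");
        handler.startElement("", "entry", "entry", attrs);
        char[] text = "bar".toCharArray();
        handler.characters(text, 0, text.length);
        handler.endElement("", "entry", "entry");

        // ...one block like the above for every element in the source
        // document, all inside this same ever-growing method...

        handler.endElement("", "doc", "doc");
        handler.endDocument();
    }
}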

> I also tried the test with the -Xint (interpreted mode only) option
> set, and there was no appreciable difference.  As best I can tell,
> this is largely because the code is already as optimized as it
> possibly can be.  This is in line with your observations of unrolled
> "loops".

Yep.

> In this instance, though, I believe we are dealing with more than
> just "unrolled loops".  We are dealing with file-reading overhead and
> interpretation overhead.  Your *compressed* XML addresses the second
> issue, but in the end I believe it will behave very similarly to my
> solution.

Good point. But you are ignoring the fact that all modern operating 
systems have cached file systems. And, if that were not the case, it 
would be fairly trivial to implement one underneath a source resolver.
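
For what it's worth, a minimal sketch of such a cache (the resolver 
interface below is just a placeholder for whatever we would actually 
wrap, not the real resolver API, and the cache is deliberately naive: 
no expiry, no validity checks):

import java.util.HashMap;
import java.util.Map;

// Placeholder interface standing in for whatever resolver we wrap.
interface ByteSourceResolver {
    byte[] resolve(String uri) throws Exception;
}

// Naive read-through cache layered underneath a resolver: the first
// lookup hits the real resolver (and the file system), later lookups
// are served straight from memory.
public class CachingResolver implements ByteSourceResolver {

    private final ByteSourceResolver delegate;
    private final Map cache = new HashMap(); // uri -> byte[]

    public CachingResolver(ByteSourceResolver delegate) {
        this.delegate = delegate;
    }

    public synchronized byte[] resolve(String uri) throws Exception {
        byte[] content = (byte[]) cache.get(uri);
        if (content == null) {
            content = delegate.resolve(uri);
            cache.put(uri, content);
        }
        return content;
    }
}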

> Also keep in mind that improvements in the compiler design (far future)
> can allow for repetitive constructs to be moved into a separate method.
> For instance, the following XML is highly repetitive:
> 
> <demo>
>    <entry name="foo">
>      bar
>    </entry>
>    <entry name="foo">
>      bar
>    </entry>
>    <entry name="foo">
>      bar
>    </entry>
>    <entry name="foo">
>      bar
>    </entry>
>    <entry name="foo">
>      bar
>    </entry>
>    <entry name="foo">
>      bar
>    </entry>
> </demo>
> 
> As documents become very large, it becomes critical to do something
> other than my very simplistic compilation.  However, there are plenty
> of opportunities to optimize the XML compiler.  For example, we could
> easily reduce the above XML to something along the lines of:
> 
> startElement("demo");
> 
> for (int i = 0; i < 6; i++)
> {
>      outputEntry();
> }
> 
> endElement("demo");
> 
> Even if the attribute and element values were different, as long as
> the same structure remained, the compiler would (theoretically) be
> able to reduce it to a method with parameters:
> 
> startElement("demo");
> 
> outputEntry("foo", "bar");
> outputEntry("ego", "centric");
> outputEntry("gas", "bag");
> outputEntry("I", "am");
> outputEntry("just", "kidding");
> outputEntry("my", "peeps");
> 
> endElement("demo");
> 
> Still allowing for some level of hotspot action.

I see, and that would also help overcome the 64KB method limitation.
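
Just to check that I'm reading you right, here is a rough sketch of what 
that parameterized form might compile down to, assuming plain SAX (the 
class shape and the outputEntry signature are illustrative, not your 
compiler's actual output):

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Illustrative shape of the factored-out generated code: the repeated
// <entry name="...">...</entry> structure becomes one small method
// that is called with different data, so every generated method stays
// small and HotSpot gets a hot body it can actually optimize.
public class CompiledDemo {

    private static final AttributesImpl NO_ATTRS = new AttributesImpl();

    public void emit(ContentHandler handler) throws SAXException {
        handler.startElement("", "demo", "demo", NO_ATTRS);

        outputEntry(handler, "foo", "bar");
        outputEntry(handler, "ego", "centric");
        outputEntry(handler, "gas", "bag");

        handler.endElement("", "demo", "demo");
    }

    private void outputEntry(ContentHandler handler, String name, String text)
            throws SAXException {
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "name", "name", "CDATA", name);
        handler.startElement("", "entry", "entry", attrs);
        char[] chars = text.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endElement("", "entry", "entry");
    }
}

Each generated method then stays well under the 64KB limit, no matter how 
large the document gets.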

> However, I believe the true power of Binary XML will lie in its
> support for XMLCallBacks and (in the mid-term future) decorators.

Can you elaborate more on this?

> The decorator concept will allow us to set a series of SAX events
> for a common object.  This will render the XSLT stage a moot point
> as we can apply pre-styled decorators to the same set of objects.

Isn't this what a translet (an xsltc-compiled XSLT stylesheet) was 
supposed to be?
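
For reference, this is roughly how a translet is driven through plain 
JAXP today; a sketch, assuming Xalan's XSLTC factory class and some 
made-up file names:

import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// The stylesheet is compiled once into a translet (a Templates object
// backed by generated bytecode) and then reused for many transformations.
public class TransletDemo {

    public static void main(String[] args) throws Exception {
        System.setProperty("javax.xml.transform.TransformerFactory",
                "org.apache.xalan.xsltc.trax.TransformerFactoryImpl");

        TransformerFactory factory = TransformerFactory.newInstance();

        // Compilation to bytecode happens here, once.
        Templates translet = factory.newTemplates(new StreamSource("style.xsl"));

        // Each transformation reuses the compiled translet.
        Transformer transformer = translet.newTransformer();
        transformer.transform(new StreamSource("input.xml"),
                new StreamResult(System.out));
    }
}

It is the same "compile the repetitive part once, reuse it many times" 
idea, just driven from the stylesheet side rather than the document side.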

Anyway, I'm happy to see new approaches to XML generation being researched.

Stefano.

