cocoon-users mailing list archives

From "Ed Staub" <est...@mediaone.net>
Subject RE: 64k limit (was: new version of the sql logicsheet under development)
Date Mon, 28 Aug 2000 14:58:43 GMT
I looked at one of Ulrich's .class files, and was surprised to find that the
problem is the code, not the string constants.  The populateDocument()
method was over 80K all by itself.

I think the best thing to do is probably to write methods that
encapsulate the common idioms found in the generated code.  For example, it
looks like there's a
	createElement()
	appendChild()
	normalize()
sequence all over the place, which could possibly be bundled into an
appendNewElement() or something similar.

I think these methods probably should be declared final.
This will allow better compiler and JVM optimization, and it's hard to
picture a need to override them.
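
A rough sketch of what such a helper might look like, assuming the
generated code works against the W3C DOM API.  The class name DomHelpers,
the newDocument() convenience method, and the choice to call normalize()
on the parent are illustrative assumptions, not existing Cocoon code:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class; the name DomHelpers is not from Cocoon.
final class DomHelpers {
    private DomHelpers() {}

    // Bundles createElement() + appendChild() + normalize() into one call.
    // Assumption: normalize() is applied to the parent after the append.
    public static final Element appendNewElement(Document doc, Node parent, String name) {
        Element child = doc.createElement(name);
        parent.appendChild(child);
        parent.normalize();
        return child;
    }

    // Convenience for this sketch only: an empty DOM document from the
    // default JAXP parser.
    public static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each three-statement sequence in the generated code would then shrink to
a single call such as appendNewElement(doc, page, "title"), which is where
the savings in method size would come from.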

-Ed Staub

-----Original Message-----
From: Robin Green [mailto:greenrd@hotmail.com]
Sent: Monday, August 28, 2000 11:29 AM
To: cocoon-users@xml.apache.org
Subject: Re: 64k limit (was: new version of the sql logicsheet under
development)


Ricardo Rocha <ricardo@apache.org> wrote:

>As I see it, the whole 64k limit problem is an easily avoidable
>consequence of the approach we chose for code generation. I tend
>to think we can improve on this without resorting to additional
>optimizations.
>
>I don't think any "reasonably sized" XSP document could generate
>64k bytes of code as such.

Indeed - I have my suspicions that the error message being generated is the
wrong one.

>Rather, we're paying a high price for
>the decision to inline constant strings throughout the generated
>code. Our problem is not code: it's data mismanagement. So, the
>solution lies in coming up with mechanisms to separate data from
>code.

Exactly!

>
>Just to illustrate the point, let's imagine our code generator
>separates string constants from code in such a way that constant
>data is serialized in a separate file that is loaded by the
>generated class upon instantiation. Thus, given the input XSP
>document "page.xml" the [Java] code generator would produce 2
>files: "page.java" and "page.ser" where the first contains
>node-generation and user-supplied code only and the second one
>contains the serialization of a string array used to supply
>actual tag names, attribute names/values, inlined text and the
>like.
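
The "page.ser" idea quoted above could be sketched roughly as follows,
using standard Java serialization of a String[].  The class name
StringTable and both method names are hypothetical, and the sketch works
on in-memory bytes where a real generator would write and read a file:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical sketch: string constants live in a serialized String[]
// instead of being inlined as literals in the generated source.
final class StringTable {
    private StringTable() {}

    // Code-generator side: produce the bytes that would be written to "page.ser".
    public static byte[] serialize(String[] constants) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(constants);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Generated-class side: load the table once, upon instantiation.
    public static String[] load(InputStream in) {
        try (ObjectInputStream objects = new ObjectInputStream(in)) {
            return (String[]) objects.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The generated code would then refer to table[0], table[1] and so on
wherever it currently contains a tag name, attribute value or text literal.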

I was thinking of something similar - the .ser file in my vision would
contain complete XML tree fragments, one for each portion of literal content
(dynamic content would still be generated by the .class file of course) in a
form optimised for fast reading in (unlike normal XML). So you would
probably have 0x01 for "startDocument()", 0x02 for "startElement()" etc so
it would simply be a matter of a switch statement implementing a Finite
State Machine, rather than the complexity of an XML parser like Xerces.

In essence an "XML bytecode"!
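
To make the idea concrete, here is a minimal sketch of such an
interpreter.  Only 0x01 and 0x02 come from the description above; the
other opcode values, the use of writeUTF()/readUTF() for names and text,
the SAX2 ContentHandler interface, and the class name XmlBytecode are all
assumptions made for illustration.  Attributes are left out, and the loop
is a plain opcode dispatch rather than a full state machine:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical "XML bytecode" reader: one opcode byte per SAX event.
final class XmlBytecode {
    static final int START_DOCUMENT = 0x01;
    static final int START_ELEMENT  = 0x02;
    static final int END_ELEMENT    = 0x03; // assumed value
    static final int CHARACTERS     = 0x04; // assumed value
    static final int END_DOCUMENT   = 0x05; // assumed value

    private XmlBytecode() {}

    // Reads opcodes until end of stream and fires the matching SAX events.
    public static void replay(DataInputStream in, ContentHandler handler)
            throws IOException, SAXException {
        int op;
        while ((op = in.read()) != -1) {
            switch (op) {
                case START_DOCUMENT:
                    handler.startDocument();
                    break;
                case START_ELEMENT: {
                    String name = in.readUTF();
                    handler.startElement("", name, name, new AttributesImpl());
                    break;
                }
                case END_ELEMENT: {
                    String name = in.readUTF();
                    handler.endElement("", name, name);
                    break;
                }
                case CHARACTERS: {
                    char[] text = in.readUTF().toCharArray();
                    handler.characters(text, 0, text.length);
                    break;
                }
                case END_DOCUMENT:
                    handler.endDocument();
                    break;
                default:
                    throw new IOException("Unknown opcode: " + op);
            }
        }
    }

    // Encodes <p>hi</p>, replays it, and returns the events as a string.
    public static String demo() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(START_DOCUMENT);
            out.writeByte(START_ELEMENT); out.writeUTF("p");
            out.writeByte(CHARACTERS);    out.writeUTF("hi");
            out.writeByte(END_ELEMENT);   out.writeUTF("p");
            out.writeByte(END_DOCUMENT);
            final StringBuilder trace = new StringBuilder();
            DefaultHandler handler = new DefaultHandler() {
                @Override public void startElement(String uri, String local, String qName, Attributes atts) {
                    trace.append('<').append(qName).append('>');
                }
                @Override public void endElement(String uri, String local, String qName) {
                    trace.append("</").append(qName).append('>');
                }
                @Override public void characters(char[] ch, int start, int length) {
                    trace.append(ch, start, length);
                }
            };
            replay(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())), handler);
            return trace.toString();
        } catch (IOException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```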

>
>This alone might well avoid the 64k-limit problem. Yet, we can
>apply more aggressive optimizations such as one we discussed
>some months ago for SAX: we can drastically reduce string size
>by collecting all common substrings so that SAX "characters()"
>method calls just pass a string array element, an offset and
>a length. This should decrease data size even more.
>
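
The substring idea quoted above might look something like this.  It is a
sketch only: the class name TextPool and its methods are hypothetical, a
single pooled char[] stands in for the "string array element", and the
linear indexOf() search is the simplest possible way to find shared text:

```java
// Hypothetical pool of literal text. Each chunk is stored once and
// addressed by (offset, length), the shape SAX characters() expects.
final class TextPool {
    private final StringBuilder pool = new StringBuilder();

    // Returns {offset, length}. Text already present in the pool, even as
    // part of a longer chunk, is reused instead of being stored again.
    public int[] intern(String text) {
        int offset = pool.indexOf(text);
        if (offset < 0) {
            offset = pool.length();
            pool.append(text);
        }
        return new int[] { offset, text.length() };
    }

    // The shared buffer to pass as the first argument of characters().
    public char[] toCharArray() {
        return pool.toString().toCharArray();
    }
}
```

A generated call would then read handler.characters(chars, span[0],
span[1]), where chars is the pooled buffer and span is the pair that
intern() returned at generation time.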
...
>In principle, I oppose the notion of writing our own bytecode
>compiler. Even if we had to fragment larger-than-64k classes
>(which, as I said above, can almost always be avoided) I feel
>we should do so at the source level.
>
>Existing Java compilers do a terrific job of generating
>bytecodes; why reinvent the wheel when we can generate better
>source code and avoid the problems we've introduced by our
>own approach?

I think there is room for a better, open source Java compiler written in
Java, but this is not a job for the Cocoon project.

>Please, my friends, enlighten me about problems and options I
>may not be seeing right now. In particular, I may be being too
>simplistic with the 64k-limit problem: all I see is a problem
>of separating data from code. Is there something I'm missing?

Well I would like to look at the actual XSP that is causing this, so we can
be sure what the problem really is (maybe the error message is misleading).
Uli, or anyone, could you please send me an actual example, the .class file
AND all the files necessary to generate it in Cocoon. I don't care about the
.class file being invalid - I can use xxd or something.

