cocoon-users mailing list archives

From Ricardo Rocha <rica...@apache.org>
Subject Re: Aha! got it! 64k limit(was: new version of the sql logicsheet under development)
Date Mon, 28 Aug 2000 17:58:20 GMT
Robin Green wrote:
> 
> No wait - I have a better idea!
> 
> There is no need to write out .ser files. It is inefficient to write them
> out and read them back in again.

Storing constant data (whatever its format) outside the class seems
a good way to overcome the 64k-limit problem. Serialization will add
overhead, granted, but probably not so much as to make XSP pages
noticeably slower. This assumption needs real testing, though...

I feel we need a way to store constant data externally, whatever the
format (serialized string array, serialized DOM, Ozone's persistent
DOM, etc).
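
A minimal sketch of the "serialized string array" variant, assuming a
hypothetical ConstantStore helper (not part of Cocoon) and plain Java
serialization. The generated page would then refer to literals by index
instead of inlining them:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Cocoon: keeps literal strings out of the
// generated class by storing them as one serialized String[].
class ConstantStore {

    // Code-generation time: turn the page's literals into bytes that could be
    // written to a file next to the generated class.
    static byte[] save(String[] literals) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(literals);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Page-load time: read the literals back once, before first execution.
    static String[] load(byte[] data) {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (String[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The sketch stays in memory to remain self-contained; writing the bytes to
disk and measuring the load cost is the part that still needs the real
testing mentioned above.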

> And declaring extra methods for common tasks will not help that much.

I like Ulrich's idea of declaring extra methods. We haven't measured
things yet, but I'm sure such generalization would indeed help: Ed
Staub has reported a class that reaches 80k in code alone.
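
One possible reading of "extra methods", sketched under assumptions: the
PageHelpers class and the addElement name are hypothetical, not Cocoon
API. The point is only that a single call in the generated code can
replace a repeated createElement / createTextNode / appendChild sequence:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical shared helpers; the names are illustrative, not Cocoon API.
class PageHelpers {

    // Creates an empty DOM document using the JDK's default parser factory.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stands in for the three-call sequence that would otherwise be inlined
    // for every literal element in the generated method body.
    static Element addElement(Document doc, Node parent, String name, String text) {
        Element element = doc.createElement(name);
        if (text != null) {
            element.appendChild(doc.createTextNode(text));
        }
        parent.appendChild(element);
        return element;
    }
}
```

Each call site shrinks to one method invocation, so the saving grows with
the number of literal elements on the page. How much that buys in
practice is exactly what has not been measured yet.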

Of course, there's no single factor that will solve our problem: we
need a combination of optimizations.

> Even if you're in C2 and using SAX, the overhead of SAX->DOM->SAX is
> probably less than SAX->filesystem->SAX. Just pass in an array or Vector of
> literal XML fragments as DOM objects (Elements, DocumentFragments,
> TextNodes, and/or Attributes) to a one-time initialization method in the
> XSPPage class, which then stores them as a field (array or Vector). This is
> thread safe because it is only called once, before first execution. Then the
> populateDocument method can use <xsp:expr>-type code to insert these DOM
> objects directly into the output (cloning them first).

Where would we store such a DOM pool? Wouldn't we need to serialize it
somehow?

> You'd have an optimization process that would attempt to group together
> literal XML into as large as possible clumps that could be inserted at once.
> For example, with this fragment:
> 
>   <a><b><c>hello</c> <xsp:logic>... </xsp:logic> </b></a>
> 
> you could have just one Element in the literal-vector, like this:
> 
>   <a><b><c>hello</c></b></a>
> 
> with an associated "instruction" to move the currentNode marker to just
> after </c> for the xsp:logic, and then another instruction to pop it back
> out to after </a> after the xsp:logic block had completed.

Sounds cool!
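
A rough sketch of the literal-plus-instruction idea, under assumptions:
LiteralPool and its methods are hypothetical names, and the "instruction"
is reduced to a path of child indices leading to the insertion point.
The pooled literal is copied with importNode before use, so the pool
itself is never modified:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch; class and method names are illustrative only.
class LiteralPool {

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the pooled literal <a><b><c>hello</c></b></a> once.
    static Element buildLiteral(Document pool) {
        Element a = pool.createElement("a");
        Element b = pool.createElement("b");
        Element c = pool.createElement("c");
        c.appendChild(pool.createTextNode("hello"));
        b.appendChild(c);
        a.appendChild(b);
        return a;
    }

    // Copies the literal into the output document, follows the path of child
    // indices to the insertion point, and appends the dynamic content there.
    // Returning the copied root leaves the caller positioned "after </a>".
    static Element render(Document output, Element literal, int[] path,
                          String dynamicText) {
        Element copy = (Element) output.importNode(literal, true);
        output.appendChild(copy);
        Node current = copy;
        for (int index : path) {
            current = current.getChildNodes().item(index);
        }
        current.appendChild(output.createTextNode(dynamicText));
        return copy;
    }
}
```

For the fragment above, the path {0} descends from <a> to <b>, so the
dynamic content lands just after </c>. The dynamic text is a stand-in
for whatever the xsp:logic block would produce.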

> - so thinking about it a bit more, the literal-vector would not just contain
> DOM Objects, but also little instructions to move the currentNode. But this
> would be easy enough to implement - it's just the optimisation that would
> require a bit of thought.
> Unfortunately it wouldn't be this easy with SAX, which doesn't allow random
> access - with SAX I think you could only have well-formed sequential
> "chunks", not overlapping chunks as in the DOM example above.

A very important distinction is in order here: the generated class will
use SAX, but we're free to use DOM for code generation.

This is the approach used now. Since DOM allows for random access, it's
possible for the code generator to locate and collect sequential chunks
(both String and XML) for later optimization.
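
To illustrate the collection step only, here is a simplified sketch, not
the actual generator code: ChunkCollector is a hypothetical name, and the
only rule it applies is that an xsp:logic child ends the current run of
adjacent literal nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of the collection step; not the real generator.
class ChunkCollector {

    // Parses a source fragment into a DOM tree and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Groups the children of parent into runs of adjacent literal nodes
    // (text and elements alike). An xsp:logic child closes the current run.
    static List<List<Node>> collect(Element parent) {
        List<List<Node>> runs = new ArrayList<>();
        List<Node> current = new ArrayList<>();
        for (Node child = parent.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if ("xsp:logic".equals(child.getNodeName())) {
                if (!current.isEmpty()) {
                    runs.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(child);
            }
        }
        if (!current.isEmpty()) {
            runs.add(current);
        }
        return runs;
    }
}
```

Each run is a candidate for being emitted as a single chunk. A fuller
version would also recurse into child elements and recognize the other
xsp tags.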

> This will help a lot with large chunks of literal data. But there is still a
> theoretical limit, even then - it's just that with this, I would have
> thought that limit would be far less likely to be reached.

We've hit the 64k limit mostly because of the way we inline strings,
but if we separate data from code (and if we use Ulrich's common-case
"synthetic" methods), we should be able to fit considerably larger
pages under this limit.

Ricardo
