xml-general mailing list archives

From Arnaud Le Hors <leh...@us.ibm.com>
Subject Re: [spinnaker] Announce
Date Sat, 08 Jul 2000 17:56:58 GMT
James Duncan Davidson wrote:
>     * Crimson isn't so optimized, yet it runs about as fast as Xerces
>       does on modern VMs such as HotSpot. The HotSpot team told us
>       that heavily optimized code for 1.1 would not benefit under
>       HotSpot. We have the proof now. In fact, there's cases where
>       it seems that Xerces slows down.

So far the only proof I've got is that HotSpot fails miserably on
Xerces. This means to me that HotSpot has a problem, not Xerces.

>     * However, because Xerces was heavily pre-optimized, it's
>       extremely complex to understand and delve into. I think
>       that this is best reflected in that most of the bits that
>       go into Xerces come from IBM Cupertino.

Not so. What you're referring to as "IBM Cupertino" is hardly a fixed set
of people. We've actually had a lot of turnover and we keep getting new
people involved in this project all the time. This hasn't prevented any
of them from contributing significantly. The only reason most bits come
from IBM is that nobody else has committed as many resources to this project.

>     * In our analysis of the Xerces code base, we can't use it for
>       future inclusion in the JDK. The pre-optimization is a killer.
>       The code-complexity is a killer. And the memory consumption is
>       a problem.

There definitely are choices that have been made that could be
revisited. But you make it sound like we never took into account memory
consumption. That is hardly the case. As you know there is always a
trade-off between memory consumption and performance. You may have
different requirements here, but they'd have to be laid out and agreed
upon.

> So, in the best of Apache traditions, we're gonna do something about it. I'm
> creating a tree in the xml-contrib area in which to do a lot of code work to
> explore how such a new parser could come to be. It's called Spinnaker.

Is it really in the Apache traditions to start new things like that over
a weekend without having any discussion beforehand? Looking at Sun's
record I guess I can see a trend for sure...

>     * Smallest possible size. This means small distribution size (JAR file)
>       and small memory footprint.

These two requirements are in direct conflict. Interestingly enough the
DOM implementation used to be designed to produce the smallest byte code
possible. The complete reorg I made to it (just a few months after
I got involved in this project btw) had a very different goal: making a
DOM instance smaller in memory. This led me to create many new classes
and sometimes duplicate some code.

> So, to close a few thoughts...
> Q. Isn't this a slam on the Xerces guys?

I say yes. Looks like a "coup d'etat" to me.

Arnaud Le Hors - IBM Cupertino, XML Technology Group
