axis-java-dev mailing list archives

From "James M Snell" <jasn...@us.ibm.com>
Subject Re: process (was: Re: The Great Debate: Xml Parsers)
Date Thu, 22 Mar 2001 16:45:08 GMT
Sanjiva,

That's why I opened up the discussion.  There was too much non-public 
discussion between the majority of committers regarding this issue so I 
decided to make it public. 

- James Snell
     Software Engineer, Emerging Technologies, IBM
     jasnell@us.ibm.com (online)
     jsnell@lemoorenet.com (offline)

Please respond to axis-dev@xml.apache.org 
To:     <axis-dev@xml.apache.org>
cc: 
Subject:        process (was: Re: The Great Debate: Xml Parsers)



I'd like to register a concern with the process that the Axis team
is using. I am not an active participant, but I do have much interest
in Axis.

There was a long discussion early on about streaming / non-streaming
and then the impl was done with DOM. Then, after the face-to-face involving
a small group, suddenly it was switched to JDOM. Now James' note says
that we've pretty much decided to go to something else. Then Glen says
"um, I didn't agree with that". Given the recent vote for Glen as the
project manager (which I fully support with a +100), that's a major
breakdown in the process .. the note even went to Xerces folks!

ARGH.

Speaking for myself, I don't view this process as being very open
or very constructive. The DOM -> JDOM stuff was done without much open
discussion. Now there seems to be an effort to move from JDOM without
much (any?) open discussion.

Can we please open up a bit more?

Architecturally, I'd like to see us using SAX (and DOM, where needed).
I know using SAX is hard as hell for this, but one of the main reasons
for the total re-write is to address hard issues.

Sanjiva.

----- Original Message -----
From: <gdaniels@allaire.com>
To: <axis-dev@xml.apache.org>
Sent: Wednesday, March 21, 2001 5:32 PM
Subject: RE: The Great Debate: Xml Parsers


>
> >    1  Axis must not force the entire message object model to be in
> > memory at one time.  In other words, DOM is out.
>
> OK, hang on a sec.  There are some pretty massive concerns around dealing
> with any kind of streaming model, concerns which I don't believe have been
> adequately addressed yet.  Until we resolve how we're building this, and
> what the object model for the messages really looks like, I am not
> personally ruling out using DOM or something much like it.
>
> From my point of view, it is MUCH more important to get v1.0 out the door
> than it is to get *all* of the requirements met.  In particular I've been
> thinking about this one, and frankly I'm willing to give it up and just use
> JDOM/DOM internally if that gets us a working engine in the nearer term.
> This is not to say I don't support the goal, I just don't see it happening
> yet and I'm more leaning towards an "extreme programming" type viewpoint on
> this project; get v1.0 out, collect feedback, refactor for v2.  I'm willing
> to be convinced otherwise, if we can make good progress.
>
> >    2  Axis must be very fast and very scalable in order to be widely
> > adopted over other Web Service implementation platforms
>
> Yes, although what "fast enough" and "scalable enough" mean is somewhat open
> to debate.
>
> >    3  We must be able to independently parse individual elements of the
> > message either as raw bits, SAX, the Axis defined Message API, DOM or
> > whatever else the user wants.
>
> OK, yes.  +1!
>
> >    4  We must be able to fully support SOAP semantics (i.e. multiref
> > elements, id/href, etc) without an overly negative impact on performance
> > (see numbers 1 and 2)
>
> Yeah baby!
>
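To make the tension between requirements 1 and 4 concrete, here is a minimal sketch (not Axis code) of resolving SOAP id/href multiref elements in a single streaming pass. It uses the JDK's SAX API. The class name MultiRefSketch, the method resolve, and the sample element names are hypothetical, and the sketch assumes every referenced element holds plain text with no child elements. Because an href may point forward to an id that has not been parsed yet, the references have to be buffered and can only be resolved once the target has been seen.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; none of these names come from Axis.
final class MultiRefSketch {

    /** Maps each referencing element's name to the text of the element its href targets. */
    static Map<String, String> resolve(String xml) {
        Map<String, String> textById = new HashMap<>();
        List<String[]> pending = new ArrayList<>(); // {referencing element, target id}

        DefaultHandler handler = new DefaultHandler() {
            private String currentId;   // id of the target element being read, if any
            private StringBuilder text; // text collected for that target

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                String href = atts.getValue("href");
                if (href != null && href.startsWith("#")) {
                    // The target may not have been seen yet, so remember the reference.
                    pending.add(new String[] {qName, href.substring(1)});
                }
                String id = atts.getValue("id");
                if (id != null) {
                    currentId = id;
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (currentId != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                // Assumption: targets have no child elements, so this end tag closes the target.
                if (currentId != null) {
                    textById.put(currentId, text.toString());
                    currentId = null;
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse message", e);
        }

        // Resolve the buffered references now that every id has been seen.
        Map<String, String> resolved = new LinkedHashMap<>();
        for (String[] ref : pending) {
            resolved.put(ref[0], textById.get(ref[1]));
        }
        return resolved;
    }

    public static void main(String[] args) {
        String xml = "<Body><a href='#id0'/><b href='#id0'/>"
                + "<multiRef id='id0'>42</multiRef></Body>";
        System.out.println(resolve(xml)); // prints {a=42, b=42}
    }
}
```

In this sample both references come before their target, so nothing can be handed to the application until the multiRef element arrives. That buffering is the cost a purely streaming design has to account for.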
> > We've looked at Xerces, we've looked at JDOM, and most recently I've been
> > doing some work with a new Xml Pull Parser developed originally by
> > Aleksander Slominski as part of a research project for Indiana Univ. Below
> > is a basic summary of our thoughts thus far:
> >
> > Xerces 1.x ->  Our concern with Xerces 1.x DOM is that it is slow, huge,
> > and complicated.  These are the standard complaints with DOM that we've
> > all heard (note to the Xerces guys:  I eagerly await the release of
> > Xerces2 ! :-) ....)  It just won't scale well in the types of environments
> > that we foresee Axis being deployed in (which include limited capacity
> > devices such as handhelds, in which case it probably wouldn't work at all
> > due simply to its size).
> >
> > We also looked at SAX as an alternative but quickly determined that SAX
> > just was not adequate for proper SOAP processing that also met the
> > requirements mentioned above.  (For those of you who weren't part of that
> > discussion, I will not rehash it here; ping me later and I'll give you the
> > rundown.)
> >
> > JDOM -> While JDOM is smaller and faster than Xerces and DOM, which is
> > nice, it still does not meet our requirements listed above.  An additional
> > issue raised internally at IBM was that JDOM is nowhere near being a
> > standard yet.  (As some of you may know, the current Axis codebase uses
> > JDOM for its message processing).  We've all pretty much decided already
> > that JDOM should be removed from the core and should be replaced with a
> > lightweight XML parser that meets the requirements.
>
> Just speaking for myself, I haven't decided that yet.
>
> > Xml Pull Parser (XPP) -> XPP is a lightweight (23k) pull parser that is
> > completely namespace aware and XML 1.0 compliant.  Its interface needs
> > quite a bit of work so I've been working with the author on getting it
> > cleaned up.  XPP has two advantages: 1. it's small, 2. it's fast.  The
> > parser was originally implemented as part of a research project comparing
> > the performance of various parsers in relation to SOAP-deserialization.
> > I'll have to try to dig up the results of their tests again, but XPP
> > outperformed nearly everything else available.   XPP would meet each of
> > our requirements once the interface redesign is complete.  This interface
> > redesign includes building a SAX layer over the parser's primary
> > interface.
> >
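As a rough illustration of the pull style discussed above, here is a minimal sketch in which the application asks the parser for the next event and can stop reading as soon as it has what it needs. This is not XPP's interface, which the message describes as still being redesigned. It uses StAX (javax.xml.stream), a pull API that was standardized for Java well after this thread, purely as a stand-in. The class name PullSketch, the method bodyChildNames, and the sample message are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only; StAX stands in for a pull parser such as XPP.
final class PullSketch {

    /** Returns the local names of the direct children of the first Body element. */
    static List<String> bodyChildNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int depth = 0;
            int bodyDepth = -1; // depth of the Body element once it has been found
            while (reader.hasNext()) {
                int event = reader.next(); // the caller pulls each event on demand
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (bodyDepth < 0 && "Body".equals(reader.getLocalName())) {
                        bodyDepth = depth;
                    } else if (bodyDepth > 0 && depth == bodyDepth + 1) {
                        names.add(reader.getLocalName());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (depth == bodyDepth) {
                        break; // Body is finished: stop without reading the rest
                    }
                    depth--;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse message", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<e:Envelope xmlns:e='http://schemas.xmlsoap.org/soap/envelope/'>"
                + "<e:Header><h/></e:Header>"
                + "<e:Body><getQuote><symbol>IBM</symbol></getQuote><other/></e:Body>"
                + "</e:Envelope>";
        System.out.println(bodyChildNames(xml)); // prints [getQuote, other]
    }
}
```

Only one event is held at a time and the loop exits at the end of Body, which is the property requirement 1 asks for. With a push API like SAX the parser drives the loop instead, and the handler must keep its own state between callbacks.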
> > Now, here's what we need to decide:
> >
> > Which is more important: Performance/Scalability or Standards support?
>
> My opinion - if you can get the same product out, and it meets the goals
> outlined above, with either but not both of these things, I'd certainly pick
> performance/scalability.  However, as mentioned above, getting the product
> out is priority 1.
>
> > From earlier decisions, I believe that we have agreed that performance and
> > scalability in the case of Axis far outweigh standards support within the
> > core engine itself as long as there are hooks specifically designed into
> > the engine that allow full standards support if the developer wishes it.
> > Thus the reason we were going to provide our own Axis Message API with
> > hooks for optionally processing the message with SAX or DOM.  (i.e. if the
> > developer wants to tank their performance by using DOM, so be it)
>
> +1
>
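One way to picture the "hooks" idea above is an element that keeps its raw XML and builds a DOM tree only when a caller explicitly asks for one, so the cost of DOM is paid only by developers who opt in. This is a minimal sketch under that assumption, not the Axis Message API. The class name LazyElementSketch and its methods asRaw, asDom and isDomBuilt are hypothetical, and it relies only on the JDK's JAXP DocumentBuilder.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical illustration only; not part of any Axis API.
final class LazyElementSketch {
    private final String rawXml; // the element exactly as received
    private Document dom;        // built on first request, then cached

    LazyElementSketch(String rawXml) {
        this.rawXml = rawXml;
    }

    /** Cheap access: hands back the unparsed text. */
    String asRaw() {
        return rawXml;
    }

    /** Reports whether the DOM tree has been built yet. */
    boolean isDomBuilt() {
        return dom != null;
    }

    /** Expensive access: parses into a DOM tree the first time it is called. */
    Document asDom() {
        if (dom == null) {
            try {
                dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .parse(new InputSource(new StringReader(rawXml)));
            } catch (ParserConfigurationException | SAXException | IOException e) {
                throw new IllegalStateException("could not build DOM", e);
            }
        }
        return dom;
    }

    public static void main(String[] args) {
        LazyElementSketch element =
                new LazyElementSketch("<getQuote><symbol>IBM</symbol></getQuote>");
        System.out.println(element.isDomBuilt()); // prints false
        System.out.println(element.asDom().getDocumentElement().getTagName()); // prints getQuote
        System.out.println(element.isDomBuilt()); // prints true
    }
}
```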
> > I would like to invite the Xerces guys to join this discussion so that we
> > may figure out how to resolve this issue.  I understand now that Xerces 2
> > includes a Pull Parser interface of its own along with a low level
> > interface that enables modularization, but many of us here either haven't
> > heard of it yet or aren't quite sure what it could mean for Axis.  Could
> > anybody on the Xerces team explain this in greater depth for us?
> >
> > - James Snell
> >      Software Engineer, Emerging Technologies, IBM
> >      jasnell@us.ibm.com (online)
> >      jsnell@lemoorenet.com (offline)
> >



