xml-general mailing list archives

From Andy Clark <an...@apache.org>
Subject XML Pull-Parsing (was: "How to start writing a non-blocking SAX parser")
Date Sun, 05 May 2002 13:55:24 GMT
Alek wrote:
> > First, I think I would prefer a different API for pull parsing.
> > Just from an object oriented standpoint, I don't like having
> > all of the accessor methods on the XmlPullParser interface. I
> > would have chosen to return different event objects. Then the
> > event object would have public fields for its data (to avoid
> > method calls) and specific methods for added functionality.
> that looks like a compromise but is it a good one?
> in XMLPULL API the choice was to avoid creation
> of event objects to minimize memory footprint in J2ME
> environments, making the XmlPullParser interface
> work as a specialized iterator.

I would take the same approach that we use in XNI which is
that objects are never orphaned by their creator. For example,
when a handler receives a struct like QName or XMLString in a
callback, it must make a copy of the contents because the
object will be re-used by the component that created it and
passed it along to the handler.

Applying that choice to my pull-parsing API would mean that 
only one event object (of each type) would ever be created. So
the memory footprint is not really an issue.
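
To illustrate the reuse idea, here is a minimal sketch in Java. The names
TextEvent and ReusingPullParser are hypothetical and are not part of XNI or
XMLPULL; the public ch/offset/length fields only mirror the general shape of
a struct like XMLString. The parser creates a single event object and a
single buffer, so a consumer has to copy the characters out before calling
next() again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event type with public fields (assumption, not a real XNI class).
final class TextEvent {
    public char[] ch;
    public int offset;
    public int length;
}

// Hypothetical parser that owns one event object and one buffer and reuses both.
final class ReusingPullParser {
    private final String[] chunks;
    private final char[] buffer;
    private final TextEvent event = new TextEvent(); // the only instance ever created
    private int index = 0;

    ReusingPullParser(String... chunks) {
        this.chunks = chunks;
        int max = 0;
        for (String c : chunks) {
            max = Math.max(max, c.length());
        }
        this.buffer = new char[max];
    }

    /** Returns the shared event refilled with the next chunk, or null at end of input. */
    TextEvent next() {
        if (index >= chunks.length) {
            return null;
        }
        String chunk = chunks[index++];
        chunk.getChars(0, chunk.length(), buffer, 0); // overwrites the previous contents
        event.ch = buffer;
        event.offset = 0;
        event.length = chunk.length();
        return event;
    }
}

final class ReuseDemo {
    /** Collects all text, copying each chunk out before the next call overwrites it. */
    static List<String> collectText(ReusingPullParser parser) {
        List<String> out = new ArrayList<>();
        for (TextEvent e = parser.next(); e != null; e = parser.next()) {
            out.add(new String(e.ch, e.offset, e.length)); // copy contents, not the reference
        }
        return out;
    }
}
```

Holding on to the TextEvent reference instead of copying would show only the
most recent chunk, which is the cost of keeping allocation constant.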

> better to keep event objects similar to all Java API and
> expose get/set methods instead of public fields.

But if you assume that the accessor method is going to be inlined
and it's not, then you lose some performance, because pushing and
popping the method call stack takes time. If the data fields
are public on the object returned from "next", then it's
just a field access.

> > But I would also like
> > to have a method that allows me to skip to a start element's
> > end tag, returning all of the text within that element.
> all of those functions can be easily built with XMLPULL API
> and exposed as a utility class instead of requiring too detailed
> a description of method implementation in the interface ...

It's just a question of deciding what functionality is the
most useful. If 90% of the users end up using this convenience
method, then it should be part of the core API. And I think
that this functionality (and some others) is that useful.
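
As a rough sketch of the convenience method under discussion, the Java below
skips from a start element to its matching end tag and returns all of the
text in between. SimplePullParser, EventType, PullUtil and ScriptedPullParser
are hypothetical names made up for this example; they are not the XMLPULL
or XNI interfaces, and the scripted parser exists only so the sketch can run
without a real XML parser.

```java
// Hypothetical event kinds for a minimal pull API (assumption for illustration).
enum EventType { START_ELEMENT, END_ELEMENT, TEXT, END_DOCUMENT }

// Hypothetical minimal pull interface, not the real XmlPullParser.
interface SimplePullParser {
    EventType next();   // advance and return the type of the new current event
    String getText();   // text of the current TEXT event
}

final class PullUtil {
    /**
     * Assumes the parser is positioned on a START_ELEMENT. Consumes events up to
     * and including the matching END_ELEMENT and returns the concatenated text,
     * including text found inside nested elements.
     */
    static String skipToEndTag(SimplePullParser parser) {
        StringBuilder text = new StringBuilder();
        int depth = 1; // we are inside the element we started on
        while (depth > 0) {
            switch (parser.next()) {
                case START_ELEMENT:
                    depth++;
                    break;
                case END_ELEMENT:
                    depth--;
                    break;
                case TEXT:
                    text.append(parser.getText());
                    break;
                case END_DOCUMENT:
                    throw new IllegalStateException("document ended before end tag");
            }
        }
        return text.toString();
    }
}

// Test double: replays a fixed script where a String stands for a TEXT event.
final class ScriptedPullParser implements SimplePullParser {
    private final Object[] script;
    private int pos = 0;
    private String text;

    ScriptedPullParser(Object... script) {
        this.script = script;
    }

    public EventType next() {
        if (pos >= script.length) {
            return EventType.END_DOCUMENT;
        }
        Object item = script[pos++];
        if (item instanceof String) {
            text = (String) item;
            return EventType.TEXT;
        }
        return (EventType) item;
    }

    public String getText() {
        return text;
    }
}
```

Whether this lives in the core interface or in a utility class like the one
above is exactly the design question; the logic is the same either way.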

> > event queueing. Due to the pipeline nature of the XNI parser
> that sounds like a good engineering decision and i will
> try to implement it in xni2xmlpull - and it will make the
> implementation more robust :-)

Yep. Let me know if you need any pointers on understanding
how the Xerces2 components work within the XNI framework.

> > the character buffers. That way I would not have to copy
> > any characters at all because I would know that the contents
> > of the char buffers would not be over-written.
> that sounds like a great addition to Xerces 2  - i had something similar

I may try to hack up my pull-parsing ideas just to see
how they work. And if I do, then I'll definitely be adding
this feature to Xerces2. In fact, I should add it anyway
because I know other people would use it as well. For
example, way back when, the Xalan folks were talking about
having better control over the char buffers so that they
didn't constantly have to copy chars around.

It's fun to be able to program what I want again. :)

Andy Clark * andyc@apache.org

