xml-general mailing list archives

From Aleksander Slominski <as...@cs.indiana.edu>
Subject Re: XML Pull-Parsing (was: "How to start writing a non-blocking SAX parser")
Date Sun, 05 May 2002 17:03:45 GMT
Andy Clark wrote:

> I would take the same approach that we use in XNI which is
> that objects are never orphaned by their creator. For example,
> when a handler receives a struct like QName or XMLString in a
> callback, it must make a copy of the contents because the
> object will be re-used by the component that created it and
> passed it along to the handler.
>
> Applying that choice to my pull-parsing API would mean that
> only one event object (of each type) would ever be created. So
> the memory footprint is not really an issue.

agreed, as then those objects are simply containers to pass in/out
arguments, and it is very similar to C/C++: the user will need to make a
copy to keep object values longer than the duration of one callback.
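
A minimal sketch of that copy-on-keep contract, assuming a producer that reuses a single container object. All names here (ReusedEventDemo, QNameStruct, Handler, parse) are hypothetical and are not the XNI or XMLPULL API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the XNI or XMLPULL API.
final class ReusedEventDemo {
    /** Mutable container that the producer overwrites for every event. */
    static final class QNameStruct {
        String prefix;
        String localName;

        /** Snapshot for callers that need the values after the callback returns. */
        QNameStruct copy() {
            QNameStruct c = new QNameStruct();
            c.prefix = prefix;
            c.localName = localName;
            return c;
        }
    }

    interface Handler {
        void startElement(QNameStruct name);
    }

    /** Producer that allocates exactly one QNameStruct and reuses it. */
    static void parse(String[] localNames, Handler handler) {
        QNameStruct shared = new QNameStruct();
        for (String n : localNames) {
            shared.prefix = "";
            shared.localName = n;
            handler.startElement(shared); // same object instance on every call
        }
    }

    /** Collects element names, either copying each struct or keeping the shared reference. */
    static List<String> collect(boolean copyFirst) {
        List<QNameStruct> kept = new ArrayList<>();
        parse(new String[] {"a", "b", "c"},
              name -> kept.add(copyFirst ? name.copy() : name));
        List<String> out = new ArrayList<>();
        for (QNameStruct q : kept) {
            out.add(q.localName);
        }
        return out;
    }
}
```

With copyFirst set to true the handler keeps three distinct snapshots; with false, every kept reference points at the one shared object, so only the last values survive.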

the interesting thing with immutable objects (such as element name
represented as interned String) is that they can be kept indefinitely
and shared very efficiently between parser and the user code
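
As a rough illustration of the interned-name idea, assuming the parser builds names from a character buffer (InternedNameDemo and nameFromBuffer are hypothetical; only String.intern() is a real JDK call):

```java
// Hypothetical helper; only String.intern() is standard JDK API.
final class InternedNameDemo {
    /** Builds an element name from a parser buffer and returns the canonical interned instance. */
    static String nameFromBuffer(char[] buf, int off, int len) {
        return new String(buf, off, len).intern();
    }

    /** True when both slices of the buffer yield the very same String object. */
    static boolean sameInstance(char[] buf, int off1, int off2, int len) {
        return nameFromBuffer(buf, off1, len) == nameFromBuffer(buf, off2, len);
    }
}
```

Because the String is immutable, user code can hold on to the returned reference for as long as it likes, and equal names can be compared by reference.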

however for objects at other levels, such as the start tag event, the
benefits, as you describe, are not that clear ...


> > better to keep event objects similar to all Java API and
> > expose get/set methods instead of public fields.
>
> But if you assume that the method is going to be inlined and
> it's not, then you lose some performance. Because pushing and
> popping the method call stack takes time. If the data fields
> are public on the object returned from "next", then it's
> just an object access.

that makes Java programming slightly lower level
and feel more like C/C++ :-)
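
The two styles being compared can be sketched side by side. Both classes are hypothetical event shapes for illustration, not taken from either API, and whether the getter version is actually slower depends on whether the JVM inlines the calls:

```java
// Hypothetical event shapes, for comparison only.
final class EventAccessDemo {
    /** C-struct style: the caller reads public fields directly. */
    static final class FieldEvent {
        public int type;
        public String name;
    }

    /** JavaBean style: the same data behind get/set methods. */
    static final class BeanEvent {
        private int type;
        private String name;

        public int getType() { return type; }
        public void setType(int type) { this.type = type; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    static String describe(FieldEvent e) {
        return e.type + ":" + e.name;           // plain field access
    }

    static String describe(BeanEvent e) {
        return e.getType() + ":" + e.getName(); // method calls the JIT may or may not inline
    }
}
```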

> > all of those functions can be easily built with XMLPULL API
> > and exposed as a utility class instead of requiring too detailed
> > description of method implementation in interface ...
>
> It's just a question of deciding what functionality is the
> most useful. If 90% of the users end up using this convenience
> method, then it should be part of the core API. And I think
> that this functionality (and some others) are that useful.

the problem is resisting the temptation to add too much too fast.
we think that in XMLPULL API we have some useful methods
(like nextText/nextTag) and i personally think that more are needed,
but it is good to wait a bit and see what is _really_ needed.
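
To illustrate how a convenience method can sit on top of a small core, here is a sketch of a nextText-style helper written as a utility over a deliberately tiny pull interface. MiniPull, its event constants and fromEvents are hypothetical stand-ins, not the real XMLPULL interface, and the helper's exact behaviour is an assumption:

```java
// Hypothetical minimal pull interface; not the real XMLPULL API.
final class NextTextDemo {
    static final int START_TAG = 1, TEXT = 2, END_TAG = 3, END_DOCUMENT = 4;

    interface MiniPull {
        int next();        // advance and return the new event type
        String getText();  // text of the current TEXT event, else null
    }

    /** Utility built only on next()/getText(); call while positioned on a START_TAG. */
    static String nextText(MiniPull p) {
        int ev = p.next();
        if (ev == END_TAG) {
            return ""; // empty element
        }
        if (ev != TEXT) {
            throw new IllegalStateException("expected TEXT or END_TAG, got " + ev);
        }
        String text = p.getText();
        if (p.next() != END_TAG) {
            throw new IllegalStateException("expected END_TAG after text");
        }
        return text;
    }

    /** Array-backed stand-in for a parser, for illustration only. */
    static MiniPull fromEvents(final int[] types, final String[] texts) {
        return new MiniPull() {
            private int i = -1;

            public int next() {
                return i + 1 < types.length ? types[++i] : END_DOCUMENT;
            }

            public String getText() {
                return i >= 0 && types[i] == TEXT ? texts[i] : null;
            }
        };
    }
}
```

The helper needs nothing beyond the two core methods, which is the argument for keeping it in a utility class until usage shows it belongs in the interface.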

> > > event queueing. Due to the pipeline nature of the XNI parser
> >
> > that sounds like a good engineering decision and i will
> > try to implement it in xni2xmlpull - and it will make the
> > implementation more robust :-)
>
> Yep. Let me know if you need any pointers understanding
> how the Xerces2 components work within the XNI framework.

thanks! i have read the Xerces2 code and have a general grasp of how
it works (in general ...)

> > > the character buffers. That way I would not have to copy
> > > any characters at all because I would know that the contents
> > > of the char buffers would not be over-written.
> >
> > that sounds like a great addition to Xerces 2  - i had something similar
>
> I may try to hack up my pull-parsing ideas just to see
> how they work. And if I do, then I'll definitely be adding
> this feature to Xerces2. In fact, I should add it anyway
> because I know other people would use it as well. For
> example, way back when the Xalan folks were talking about
> having better control over the char buffers so that they
> didn't constantly have to copy chars around.

i think it is good for performance and will make writing code
that gathers text for element content much easier...

> It's fun to be able to program what I want again. :)

what can i say :-))))

alek





