axis-java-dev mailing list archives

From Aleksander Slominski <as...@cs.indiana.edu>
Subject Re: Parsing stuff
Date Thu, 19 Apr 2001 05:26:05 GMT
hi,

that is very interesting - you have implemented:

    {[ push (XML tokenizer, low level parser)
    -> SAX driver]
    -> pushed events (in separate thread)
    -> restored pull table (SAX2EventRecorder)}
    -> user application can pull events through Message and wait until they arrive.

so, in my opinion, this is a very sophisticated way of converting a push parser into a
pull parser, coupled to a DOM-like tree structure represented by the list of recorded
events in each MessageElement. it is pull parsing in the sense that you need to wait
until all events are available before moving to the next header or body element :-)
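
to make the pipeline above concrete, here is a minimal sketch of the general push-to-pull
idea: a background thread runs a push-style parser and records its events into a queue,
and the caller pulls them and blocks until they arrive. all names here (PushToPullSketch,
EventSink, PullAdapter) are made up for illustration and events are plain strings; this
is not the code from the zip file, and it uses java.util.concurrent, which is an
assumption about the available JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PushToPullSketch {
    /** Hypothetical marker that the producer appends after the last event. */
    static final String END = "<end-of-document>";

    /** Stand-in for a SAX-style handler: the parser pushes events into it. */
    interface EventSink {
        void event(String name);
    }

    /** Stand-in for a push parser: it drives the sink, the caller has no control. */
    static void pushParse(String[] events, EventSink sink) {
        for (String e : events) {
            sink.event(e);
        }
    }

    /** Pull side: a background thread records pushed events, next() waits for them. */
    static class PullAdapter {
        private final BlockingQueue<String> recorded = new LinkedBlockingQueue<>();
        private boolean done;

        PullAdapter(String[] events) {
            Thread producer = new Thread(() -> {
                pushParse(events, recorded::add); // record every pushed event
                recorded.add(END);                // then signal end of input
            });
            producer.setDaemon(true);
            producer.start();
        }

        /** Blocks until the next event has been recorded; returns null at the end. */
        String next() throws InterruptedException {
            if (done) {
                return null;
            }
            String e = recorded.take();
            if (END.equals(e)) {
                done = true;
                return null;
            }
            return e;
        }
    }

    /** Pulls every event in order; used here only to show the calling pattern. */
    static List<String> pullAll(String[] events) throws InterruptedException {
        PullAdapter parser = new PullAdapter(events);
        List<String> out = new ArrayList<>();
        for (String e = parser.next(); e != null; e = parser.next()) {
            out.add(e);
        }
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(pullAll(new String[] {
            "startElement Envelope", "startElement Header", "endElement Header",
            "startElement Body", "endElement Body", "endElement Envelope"}));
    }
}
```

note that the producer keeps recording whether or not anybody pulls, so every event ends
up in memory; the queue only decouples the two sides.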

James M Snell wrote:

> This code gives us:
>
>     1. Streaming support

i wonder what is "streaming" in this context? does it mean that content is available
as fast as it is arriving?

or do you mean that the whole message does not need to be in memory? in that case it
is not really streaming, as the recorded event stream is _completely_ stored in the
various MessageElement objects, so it is not possible to parse a message of unlimited
size in limited memory (and that is my personal definition of streaming :-))

>
>     2. Lazy parsing

i am not sure what you mean by lazy parsing - to access a soap body element you
still need to get all headers parsed - right?

and as it is happening in a separate thread that has _already_ started parsing, user
code does not control what gets parsed - even if it does not care about the content of
particular headers, they will still contain recorded events... (it is still much better
than requiring all content to be available as DOM).

there is one issue that needs special care if a MessageElement's recorded SAX event
stream is to be converted into DOM2: all prefix mappings must be available to DOM2, so
you basically have to replay every prefix mapping in scope, including those declared on
parent elements (for example, for a Header you need to supply the SOAP-ENV prefix
mapping or DOM2 will fail..).
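
a minimal sketch of what "all prefix mappings in scope" means, using only the standard
SAX2 callbacks (startPrefixMapping, startElement, endElement): it collects the mappings
declared on an element and on all of its ancestors, which is the set that would have to
be replayed in front of a recorded fragment. the class name, the method inScopeAt and
the sample header namespace are hypothetical, and how MessageElement would store or
replay these mappings is not something i know from the code.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class InScopePrefixSketch {

    /** Returns prefix -> URI for every mapping in scope at the first element named localName. */
    static Map<String, String> inScopeAt(String xml, final String localName) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final Map<String, String> result = new LinkedHashMap<>();

        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            // one map of declarations per open element, innermost on top
            private final Deque<Map<String, String>> scopes = new ArrayDeque<>();
            // declarations reported just before the next startElement
            private Map<String, String> pending = new LinkedHashMap<>();
            private boolean captured;

            @Override
            public void startPrefixMapping(String prefix, String uri) {
                pending.put(prefix, uri);
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                scopes.push(pending);
                pending = new LinkedHashMap<>();
                if (!captured && local.equals(localName)) {
                    captured = true;
                    // walk from the outermost element inwards so inner declarations win
                    Iterator<Map<String, String>> it = scopes.descendingIterator();
                    while (it.hasNext()) {
                        result.putAll(it.next());
                    }
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                scopes.pop();
            }
        });
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<SOAP-ENV:Envelope xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/'>"
          + "<SOAP-ENV:Header><h:tx xmlns:h='urn:example:headers'>1</h:tx></SOAP-ENV:Header>"
          + "<SOAP-ENV:Body/></SOAP-ENV:Envelope>";
        // the header entry needs both its own prefix and the one declared on Envelope
        System.out.println(inScopeAt(xml, "tx"));
    }
}
```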

>     3. Much better performance (for a simple SOAP envelope with 2 headers
> and a single body element, I was noticing speeds averaging about 10
> milliseconds for each loop in the code contained in the Main.java class
> (see the zip file) compared to an average of 40-50 milliseconds per loop
> using a DOM approach (I'm using Xerces2 in nonvalidating mode, btw).

did you compare the speed of this approach with bare-bones SAX2?

thanks,

alek

> I wrote this last night between 11:00pm and 2:00am so please be gentle
> with your flames ;-)

ps.  just a few remarks (you mentioned that it was quickly implemented and put
together...):

* it seems that the XML input file you used for testing in Main is missing from the zip file

* i think that in Main those two lines should be inverted:
            Message m = new Message(fis, smf);
            s = System.currentTimeMillis();
as in the Message constructor you start a new thread that will already be parsing input
before you call System.currentTimeMillis()

also, as this system timer is very coarse-grained, it may be recording 0 milliseconds...

* just one other small comment: in QName, when comparing strings you need to use equals()
unless you guarantee that all strings will always be interned or uniquely kept in your
SymbolTable (and i think in Main you use new QName("body", "body") :-) )

(...)
        return (p1.getNamespaceURI() == namespaceURI &&
                p1.getLocalPart() == localPart);
(...)
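
a minimal sketch of the equals()-based comparison i mean. QNameSketch is a hypothetical
stand-in for your QName (same two fields, nothing else assumed), and java.util.Objects is
an assumption about the JDK; the point is only that value comparison does not depend on
whether the strings happen to be interned.

```java
import java.util.Objects;

public final class QNameSketch {
    private final String namespaceURI;
    private final String localPart;

    public QNameSketch(String namespaceURI, String localPart) {
        this.namespaceURI = namespaceURI;
        this.localPart = localPart;
    }

    public String getNamespaceURI() { return namespaceURI; }
    public String getLocalPart() { return localPart; }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof QNameSketch)) {
            return false;
        }
        QNameSketch p1 = (QNameSketch) o;
        // compare by value, not by reference; Objects.equals also tolerates null
        return Objects.equals(p1.getNamespaceURI(), namespaceURI)
            && Objects.equals(p1.getLocalPart(), localPart);
    }

    @Override
    public int hashCode() {
        // must agree with equals() so the name works as a hash key
        return Objects.hash(namespaceURI, localPart);
    }

    public static void main(String[] args) {
        // strings built at runtime (as a parser would build them) are not interned
        String parsed = new String("body");
        QNameSketch fromParser = new QNameSketch(parsed, parsed);
        QNameSketch fromLiteral = new QNameSketch("body", "body");

        System.out.println("reference comparison: " + (parsed == "body"));
        System.out.println("equals() comparison:  " + fromParser.equals(fromLiteral));
    }
}
```

if you do want to keep ==, then every string reaching QName would have to go through
String.intern() or the SymbolTable first.
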
--
Aleksander Slominski, LH 316, IU, http://www.extreme.indiana.edu/~aslom
As I look afar I see neither cherry Nor tinted leaves Just a modest hut
on the coast In the dusk of Autumn nightfall - Fujiwara no Teika(1162-1241)


