Message-ID: <3ADDBB78.7DD005D@cs.indiana.edu>
Date: Wed, 18 Apr 2001 11:06:16 -0500
From: Aleksander Slominski
To: axis-dev@xml.apache.org
CC: soap-dev@xml.apache.org, xerces-j-dev@xml.apache.org
Subject: Re: Timings for processing of small messages

Sam Ruby wrote:

> Aleksander Slominski wrote:
> >
> > i have re-run your tests and added new tests for Xml
> > Pull Parser (modified test sources are available at
> > http://www.extreme.indiana.edu/~aslom/echosoap/)
>
> I must say that I am impressed by how fast you were able to do that!

thanks :-)

> Is that with xpp or xpps?

Xml Pull Parser or just XPP (hopefully there are not too many XPPs, only
XPP Parses Perl, XPML Page Parser, X Printing Panel, X-Windows Phase Plan
plus Auto, OASIS XML for Publishers and Printers (XPP), Xyvision's
Production Publisher, and probably some others that i missed...)

> I'm going to take a look into merging the concept of your
> FixedLengthInputStream with my NonBlockingBufferedInputStream - the result
> should be a savings of a copy of the message in all tests except char and
> byte.

i initially tried to do this but it needs much more work (see
*Streaming.java). i think that it should be improved with keep-alive
connections, as in our experience socket opening/closing is quite an
overhead. it may also be worthwhile to try DataInputStream for reading
header lines (it may be more optimized than reading byte by byte).

> The name "char" is a misnomer.  If you look closely at that test, you will
> see that it contains a number of dubious practices that seem commonplace
> throughout much of the current xml-soap code base, such as concatenating
> strings into intermediates, and then concatenating the results into larger
> strings.  Also the XML "parsing" in that test is hardly robust.  ;-)

then you should probably give it a different name, such as 'typical', and
still have a char test to check the speed of converting a byte stream into
a reader (character stream).

i noticed that in the char example you read the length from the header and
use it to read that number of _characters_ - that works correctly only with
8 bit encodings. the header length counts bytes, so it can fail for UTF-8
or UTF-16, where a character may take more than one byte...

> My only criticism of your tests is that it hardly seems fair to grab the
> sixth result from pp.next() - in order to compare apples to apples, I'd
> prefer that start tags were read, and it be the text matching the "input"
> tag that was extracted, and that all the tags are read.  This would make
> the xpp results parallel the xni, sax, and dom results.  I realize that a
> case could be made that with a pull parser there is no requirement to read
> to the end, but I suspect that in real soap stacks the entire message will
> be relevant.

hey, you are also grabbing just the value from the SAX event stream!
however XPP is _reading_ all tags and maintaining namespaces!
it is just that in this test this information is silently discarded (as it
is in the DOM test). i think that adding readStartTag() or running XPP/SAX
1.0 (xpp2sax) will not have much influence on the test results.

BTW xpp2sax allows for both pull and push parsing with the same input - you
can start SAX parse() on any start tag (and you could even build a real DOM
from it or some more specialized tree representation like electric xml - i
also plan to add a SAX 2.0 driver to XPP...).

> > i am suspecting that tests are spending too much time
> > doing buffering and memory IO including socket connection
> > and disconnection that heavily affects performance - it
> > would be interesting to see tests that use HTTP keep-alive
> > and allow for streaming of input into parsers (and not
> > buffering it). however it requires very careful coding that
> > will not introduce unnecessary buffering and delays... (it
> > would also be interesting to do some testing with chunked
> > encoding but this is even more difficult...).
>
> If the goal were to simply compare parsers, then I would actually eliminate
> all HTTP and socket overhead.  I guess my question is: is a steady stream
> of tiny messages from a single client actually what we want to optimize
> for?  My reason for closing the socket and getting a new one each time was
> to mirror what I presume would be closer to real world usage - namely a
> server that gets messages from a large number of clients.

if messages are small, I/O efficiency becomes paramount and keep-alive makes
a lot of sense; otherwise any application code (such as XML parsing) will be
only a fraction of the actual tested time ... when you have many clients it
is a fairer test, as IO becomes less of a bottleneck (having opened multiple
persistent/keep-alive input streams, the server can multiplex IO waits).

> My hope was that given the large amount of overhead, the time to parse this
> tiny message should approach the noise range.
> From the looks of things, I would say that xpp achieves this (modulo the
> concerns above), and hopefully eliminating the need for the
> ByteArrayInputStream by incorporating the FixedInputStream concept might
> make things look even better.

please make sure that you have very efficient HTTP header reading or it
will consume more time than XML envelope reading...

> And then we could move on to more substantial messages...

that would be really interesting (especially checking with streaming on top
of HTTP/1.1 chunking).

alek

ps. if you want to time just small parts of the test you may want to use a
high-resolution timer to determine the exact time for each phase. i have
one that i adapted from some JavaWorld article that works both on Windows
and Solaris, so if you are interested i can pack it and put it on the web
(it uses JNI to access small C routines that tap into the system
high-resolution clock).

--
Aleksander Slominski, LH 316, IU, http://www.extreme.indiana.edu/~aslom
As I look afar I see neither cherry
Nor tinted leaves
Just a modest hut on the coast
In the dusk of Autumn nightfall
- Fujiwara no Teika (1162-1241)
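
The remark above about reading the header length as a number of _characters_ can be illustrated with a minimal sketch. It is not taken from the test sources: the class name ContentLengthDemo and the method readBody are hypothetical, StandardCharsets postdates this message (Java 7), and the sketch assumes the header length counts bytes and the body is UTF-8. It reads exactly that many bytes first and only then decodes them into characters.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of the echosoap test sources.
final class ContentLengthDemo {

    /** Reads exactly contentLength bytes (not characters), then decodes them as UTF-8. */
    static String readBody(InputStream in, int contentLength) throws IOException {
        byte[] buf = new byte[contentLength];
        int off = 0;
        // read() may return fewer bytes than requested, so loop until the buffer is full
        while (off < contentLength) {
            int n = in.read(buf, off, contentLength - off);
            if (n < 0) {
                throw new IOException("stream ended after " + off + " of "
                        + contentLength + " bytes");
            }
            off += n;
        }
        return new String(buf, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // \u00e9 (e with acute accent) is one char but two bytes in UTF-8
        String body = "<input>h\u00e9llo</input>";
        byte[] wire = body.getBytes(StandardCharsets.UTF_8);
        System.out.println("chars=" + body.length() + " bytes=" + wire.length);

        String decoded = readBody(new ByteArrayInputStream(wire), wire.length);
        System.out.println("round trip ok: " + decoded.equals(body));
    }
}
```

Because the character count is smaller than the byte count here, a reader asked for "length" characters would block or run past the end of the message on a kept-alive connection, which is why the byte-bounded read comes first in this sketch.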