Return-Path:
Mailing-List: contact axis-dev-help@xml.apache.org; run by ezmlm
Delivered-To: mailing list axis-dev@xml.apache.org
Received: (qmail 44454 invoked from network); 31 Jan 2001 18:10:17 -0000
Received: from palrel1.hp.com (156.153.255.242) by h31.sny.collab.net with SMTP; 31 Jan 2001 18:10:17 -0000
Received: from omgw1.boi.hp.com (omgw1.boi.hp.com [15.56.8.101]) by palrel1.hp.com (Postfix) with ESMTP id 518FF27A8 for ; Wed, 31 Jan 2001 10:10:20 -0800 (PST)
Received: from xrosebh3.rsvl.itc.hp.com (xrosebh3.rsvl.itc.hp.com [15.34.240.67]) by omgw1.boi.hp.com (8.9.3 (PHNE_18979)/8.9.3 SMKit6.0.6 OpenMail) with ESMTP id LAA15802 for ; Wed, 31 Jan 2001 11:10:18 -0700 (MST)
Received: by xrosebh3.rsvl.itc.hp.com with Internet Mail Service (5.5.2653.19) id <1A0L0XY5>; Wed, 31 Jan 2001 10:10:15 -0800
Message-ID:
From: "MURRAY,BRYAN (HP-FtCollins,ex1)"
To: "'axis-dev@xml.apache.org'"
Subject: RE: [AXIS ARCH] - Message Internals
Date: Wed, 31 Jan 2001 10:10:11 -0800
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain; charset="iso-8859-1"
X-Spam-Rating: h31.sny.collab.net 1.6.2 0/1000/N

I agree that a pull parser is easier to use than either DOM or SAX,
because it leaves control in the hands of the parser invoker rather than
handing it over to the parser. I also believe it is the only way to
achieve the streaming message approach, mostly because of that handing
over of control. SAX has a chance at streaming only if you are willing
to call handlers from the event callbacks - this sounds really difficult
to control. It is true that multiref arguments will be difficult to
handle, but these are likely to occur primarily in the Body, and a Body
processor will need to read the remainder of the message anyway. Header
checking and mustUnderstand validation can be done at the time the
headers are parsed - long before the message Body is processed.
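
As a rough illustration of the header check described above, here is a
minimal sketch that assumes a StAX-style pull parser
(javax.xml.stream). The thread does not name a pull API, so that choice,
the HeaderCheck class, and the unknownMustUnderstand method are
assumptions made for illustration only. The sketch ignores the actor
attribute and stops pulling as soon as it reaches the Body start tag.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Axis.
class HeaderCheck {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Returns the mustUnderstand="1" header entries that are not in 'understood'. */
    static List<QName> unknownMustUnderstand(String xml, Set<QName> understood) {
        List<QName> unknown = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int depth = 0;            // 1 = Envelope, 2 = Header/Body, 3 = header entry
            boolean inHeader = false;
            while (r.hasNext()) {
                int event = r.next(); // the caller decides when to pull the next event
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    QName name = r.getName();
                    boolean soap = SOAP_NS.equals(name.getNamespaceURI());
                    if (depth == 2 && soap && "Header".equals(name.getLocalPart())) {
                        inHeader = true;
                    } else if (depth == 2 && soap && "Body".equals(name.getLocalPart())) {
                        break;        // stop here: the Body content has not been pulled
                    } else if (inHeader && depth == 3) {
                        String mu = r.getAttributeValue(SOAP_NS, "mustUnderstand");
                        if ("1".equals(mu) && !understood.contains(name)) {
                            unknown.add(name);
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (depth == 2) {
                        inHeader = false; // leaving the Header element
                    }
                    depth--;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("malformed envelope", e);
        }
        return unknown;
    }

    public static void main(String[] args) {
        String xml =
            "<s:Envelope xmlns:s='" + SOAP_NS + "' xmlns:t='urn:example:tx'>"
          + "<s:Header><t:Transaction s:mustUnderstand='1'>5</t:Transaction></s:Header>"
          + "<s:Body><t:Op/></s:Body></s:Envelope>";
        // No headers are declared as understood, so Transaction is reported.
        System.out.println(unknownMustUnderstand(xml, Collections.<QName>emptySet()));
    }
}
```

A non-empty result would be the point at which a mustUnderstand fault
could be raised, before any Body processing has started.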

Some support for delayed processing may need to exist in order to fully
support this structure - it does not have to be the mainline for all
messages. One way the digital signature verifier could work with the
streaming approach is to handle the header indicating the digital
signature, save away the information needed to perform the signature
verification later, and insert another handler immediately before the
body processing, which will actually perform the signature verification
as it streams the body to the body processor.

In order to achieve optimal performance I think we should strive to:

    read the message bytes no more than once
    parse the message bytes no more than once
    traverse the message no more than once
    keep as little of the message in memory at one time as possible

Bryan Murray

-----Original Message-----
From: James Snell [mailto:jmsnell@intesolv.com]
Sent: Tuesday, January 30, 2001 12:31 PM
To: 'axis-dev@xml.apache.org'
Subject: RE: [AXIS ARCH] - Message Internals

Sam,

I do think the pull style parser model is best, but I do not think that
the streaming message approach will work for SOAP messages for several
key reasons:

1. The SOAP specification requires that a determination be made whether
   or not a message can be processed before it is actually processed.
   This determination includes checking all of the headers for
   mustUnderstand and actor attributes.

2. SOAP's use of accessor multireferencing (id/href) allows for
   forward/backwards/external references that may not be resolvable in
   the stream, since the target of a reference may not have been
   received into the stream yet. An obvious example of this would be an
   XML signature verifier where the signature is in the header and the
   data signed is in the body. If we use the streaming approach, then
   there is the potential that the signed data won't be available by
   the time the digital signature verifier is invoked.

The only way that I can see to properly support these two items is to
defer processing until the entire message is received.

- James

> -----Original Message-----
> From: Sam Ruby [mailto:rubys@us.ibm.com]
> Sent: Tuesday, January 30, 2001 6:22 AM
> To: axis-dev@xml.apache.org
> Subject: RE: [AXIS ARCH] - Message Internals
>
>
> Yuhichi Nakamura wrote:
> >
> > I just read through this thread. However, I am not sure
> > how SAX is useful in the context of SOAP message processing.
> > In order to process SOAP messages, we need to "manipulate"
> > XML documents in such a way that header entries are removed,
> > inserted, and potentially modified. (Body entries might be
> > manipulated in the same manner, but at least header entries
> > MUST be processed by the Axis engine.)
>
> It is my intuition, experience, and reading of the current
> literature that retaining the message in memory is not a scalable
> solution. I've cited as an example the recent cocoon rewrite, and
> pointed out a reference to Microsoft documentation that indicates
> that they have hit upon a similar problem and outlined their
> solution.
>
> Feel free to disagree with the above. It is my point of view, perhaps
> there are others out there.
>
> But if you do see the potential for this being a problem, and you
> have any hope for Axis to be successful and therefore deployed in
> enterprise configurations, an alternative must be found. If not now,
> it will certainly be done in the *next* rewrite.
>
> Avoiding discussions of a specific API for a moment, what is needed
> is a streaming model. Headers need to be made available to handlers
> as they are being received. A given handler could choose to do
> various things with this information - pass it along unmodified,
> choose NOT to pass it along (effectively deleting it), or create a
> new header based on information in the original. In fact, a handler
> could easily insert a new header into the output stream.
>
> There are two basic approaches to streaming: a PUSH model, which SAX
> represents, or a PULL model, which some of the APIs that have been
> submitted to ECMA for standardization represent. Between these two
> alternatives, James seems to favor a pull model. I'm inclined to
> agree.
>
> - Sam Ruby
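
The streaming handler model Sam describes (pass a header along, drop
it, derive a new one from it, or insert an extra one) can be sketched
roughly as follows. This is only an illustration under stated
assumptions: the Header and Handler types and the drop/rewrite/insert
helpers are hypothetical, and java.util.stream is used purely as a
convenient lazy, pull-driven pipeline, not as anything proposed in this
thread. Each header is pulled through the whole chain one at a time;
no handler holds the full header list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical types for illustration; not an Axis API.
class StreamingHeaders {
    /** Stand-in for one parsed SOAP header entry. */
    static final class Header {
        final String name;
        final String value;
        Header(String name, String value) { this.name = name; this.value = value; }
        @Override public String toString() { return name + "=" + value; }
    }

    /** A handler lazily transforms the header stream it is given. */
    interface Handler extends UnaryOperator<Stream<Header>> {}

    /** Does not pass the named header along (effectively deleting it). */
    static Handler drop(String name) {
        return in -> in.filter(h -> !h.name.equals(name));
    }

    /** Creates a new header based on the original one. */
    static Handler rewrite(String name, String newValue) {
        return in -> in.map(h -> h.name.equals(name) ? new Header(name, newValue) : h);
    }

    /** Inserts an extra header after all incoming headers. */
    static Handler insert(Header extra) {
        return in -> Stream.concat(in, Stream.of(extra));
    }

    /** Wires the handlers together; nothing runs until the result is consumed. */
    static Stream<Header> run(Stream<Header> incoming, List<Handler> chain) {
        Stream<Header> s = incoming;
        for (Handler h : chain) {
            s = h.apply(s);
        }
        return s;
    }

    static List<String> demo() {
        Stream<Header> incoming = Stream.of(
                new Header("Transaction", "5"),
                new Header("Routing", "hop1"),
                new Header("Trace", "on"));
        List<Handler> chain = Arrays.asList(
                drop("Trace"),
                rewrite("Routing", "hop2"),
                insert(new Header("Audit", "seen")));
        return run(incoming, chain).map(Header::toString).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [Transaction=5, Routing=hop2, Audit=seen]
    }
}
```

In a real engine the incoming stream would be fed by the parser as
header entries arrive rather than by a fixed list.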