Subject: RE: [AXIS ARCH] - Message Internals
To: axis-dev@xml.apache.org
From: "Yuhichi Nakamura"
Date: Thu, 1 Feb 2001 11:02:26 +0900

Hi Bryan,

I just want to make sure I understand what "pull parser" means. Do you mean a particular parser implementation, or just a concept? I thought that DOM is a pull parser and SAX is a push parser in this context. Maybe I am wrong. Please correct me.

For Digital Signature, there exists a DOM-based tool (very stable) on IBM alphaWorks (actually, it comes from our team). Do you really want to develop yet another dig-sig tool in this project? I think that we need to adopt existing "stable" modules as much as possible.

Your performance goals are reasonable. I would ask: do we have such a parser, or do we develop such a parser in this project? IMHO, we should not assume things that do not exist. The Axis engine should be developed on top of existing technologies, therefore we should not reinvent similar things in this project.
At this moment, I feel that Xerces is the most appropriate for the parser stuff.

Regards,

Yuhichi Nakamura
IBM Research, Tokyo Research Laboratory
Tel: +81-46-215-4668
FAX: +81-46-215-7413

From: "MURRAY,BRYAN (HP-FtCollins,ex1)" on 2001/02/01 03:10
Please respond to axis-dev@xml.apache.org
To: "'axis-dev@xml.apache.org'"
cc:
Subject: RE: [AXIS ARCH] - Message Internals

I agree that a pull parser is easier to use than either DOM or SAX, because it leaves control in the hands of the parser's caller rather than handing it over to the parser. I also believe it is the only way to achieve the streaming message approach, mostly because of that handing over of control. SAX has a chance at streaming only if you are willing to call handlers from the event callbacks - this sounds really difficult to control.

It is true that multiref arguments will be difficult to handle, but these are likely to occur primarily in the Body, and a Body processor will need to read the remainder of the message anyway. Header checking and mustUnderstand validation can be done at the time the headers are parsed - long before the message Body is processed. Some support for delayed processing may need to exist in order to fully support this structure - it does not have to be the mainline for all messages.

One way the digital signature verifier could be implemented using the streaming approach is to handle the header indicating the digital signature, save away the information needed to perform the signature verification later, and insert another handler immediately before the body processing which will actually perform the signature verification as it streams the body to the body processor.
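
The point above about checking headers and mustUnderstand before the Body is processed can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the javax.xml.stream (StAX) pull API, which postdates this thread and is not something anyone here proposed, and the class name `MustUnderstandCheck`, the method `findUnsupportedHeaders`, and the example header names are all hypothetical. It assumes the SOAP 1.1 envelope namespace and treats a header as "understood" purely by its local name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: not Axis code.
class MustUnderstandCheck {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /**
     * Pulls events until the Body start tag and returns the local names of
     * header entries marked mustUnderstand="1" that are not in `understood`.
     */
    static List<String> findUnsupportedHeaders(String envelope, Set<String> understood) {
        List<String> unsupported = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // SOAP messages carry no DTD
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(envelope));
            int depth = 0;
            boolean inHeader = false;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    boolean soapElement = SOAP_NS.equals(reader.getNamespaceURI());
                    if (depth == 2 && soapElement && "Header".equals(reader.getLocalName())) {
                        inHeader = true;
                    } else if (depth == 2 && soapElement && "Body".equals(reader.getLocalName())) {
                        break; // the caller decides to stop: no further events are requested
                    } else if (inHeader && depth == 3) {
                        // A direct child of Header is a header entry.
                        String flag = reader.getAttributeValue(SOAP_NS, "mustUnderstand");
                        if ("1".equals(flag) && !understood.contains(reader.getLocalName())) {
                            unsupported.add(reader.getLocalName());
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (depth == 2) {
                        inHeader = false; // leaving Header (or any other child of Envelope)
                    }
                    depth--;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed envelope", e);
        }
        return unsupported;
    }

    public static void main(String[] args) {
        String envelope =
            "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\">"
            + "<soap:Header>"
            + "<t:Transaction xmlns:t=\"urn:example:t\" soap:mustUnderstand=\"1\">5</t:Transaction>"
            + "<s:Security xmlns:s=\"urn:example:s\" soap:mustUnderstand=\"1\"/>"
            + "</soap:Header>"
            + "<soap:Body><m:Echo xmlns:m=\"urn:example:m\"/></soap:Body>"
            + "</soap:Envelope>";
        System.out.println(findUnsupportedHeaders(envelope, Collections.singleton("Transaction")));
    }
}
```

The loop stops asking for events once it sees the Body start tag, which is the kind of caller-held control described above. Multiref targets and signature verification would still need the deferred handling discussed in this thread.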
In order to achieve optimal performance I think we should strive to:

- read the message bytes no more than once
- parse the message bytes no more than once
- traverse the message no more than once
- keep as little of the message in memory at one time as possible

Bryan Murray

-----Original Message-----
From: James Snell [mailto:jmsnell@intesolv.com]
Sent: Tuesday, January 30, 2001 12:31 PM
To: 'axis-dev@xml.apache.org'
Subject: RE: [AXIS ARCH] - Message Internals

Sam,

I do think the pull style parser model is best, but I do not think that the streaming message approach will work for SOAP messages for several key reasons:

1. The SOAP specification requires that a determination be made whether or not a message can be processed before it is actually processed. This determination includes checking all of the headers for mustUnderstand and actor attributes.

2. SOAP's use of accessor multireferencing (id/href) allows for forward/backwards/external references that may not be possible in the stream considering the fact that the target of a reference may not have been received into the stream yet. An obvious example of this would be an XML signature verifier where the signature is in the header and the data signed is in the body. If we use the streaming approach, then there is the potential that the signed data won't be available by the time the digital signature verifier is invoked.

The only way that I can see to properly support these two items is to defer processing until the entire message is received.

- James

> -----Original Message-----
> From: Sam Ruby [mailto:rubys@us.ibm.com]
> Sent: Tuesday, January 30, 2001 6:22 AM
> To: axis-dev@xml.apache.org
> Subject: RE: [AXIS ARCH] - Message Internals
>
>
> Yuhichi Nakamura wrote:
> >
> > I just read through this thread. However, I am not sure
> > how SAX is useful in the context of SOAP message processing.
> > In order to process SOAP messages, we need to "manipulate"
> > XML documents in such a way that header entries are removed,
> > inserted, and potentially modified. (Body entries might be
> > manipulated in the same manner, but at least header entries
> > MUST be processed by the Axis engine.)
>
> It is my intuition, experience, and reading of the current literature that
> retaining the message in memory is not a scalable solution. I've cited as
> an example the recent Cocoon rewrite, and pointed out a reference to
> Microsoft documentation that indicates that they have hit upon a similar
> problem and outlined their solution.
>
> Feel free to disagree with the above. It is my point of view, perhaps
> there are others out there.
>
> But if you do see the potential for this being a problem, and you have any
> hope for Axis to be successful and therefore deployed in enterprise
> configurations, an alternative must be found. If not now, it will
> certainly be done in the *next* rewrite.
>
> Avoiding discussions of a specific API for a moment, what is needed is a
> streaming model. Headers need to be made available to handlers as they are
> being received. A given handler could choose to do various things with
> this information - pass it along unmodified, choose NOT to pass it along
> (effectively deleting it), create a new header based on information in the
> original. In fact, a handler could easily insert a new header into the
> output stream.
>
> There are two basic approaches to streaming: a PUSH model, which SAX
> represents, or a PULL model, which some of the APIs that have been
> submitted to ECMA for standardization represent. Between these two
> alternatives, James seems to favor a pull model. I'm inclined to agree.
>
> - Sam Ruby
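
The PUSH versus PULL distinction discussed in this thread can be illustrated with a small sketch. It is an assumption-laden example, not a proposal from any of the participants: the pull side uses the javax.xml.stream (StAX) API, which did not exist when these messages were written, and the class and method names (`PushVsPull`, `namesViaPush`, `namesViaPull`) are hypothetical. Both methods collect element names; the difference is who drives the loop.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: not Axis code.
class PushVsPull {

    /** PUSH (SAX): the parser owns the loop and calls back for every element. */
    static List<String> namesViaPush(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(localName);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return names;
    }

    /** PULL (StAX): the caller owns the loop and can stop after `limit` elements. */
    static List<String> namesViaPull(String xml, int limit) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (names.size() < limit && reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<a><b/><c><d/></c></a>";
        System.out.println(namesViaPush(xml));
        System.out.println(namesViaPull(xml, 2));
    }
}
```

In the push version, stopping early would mean throwing an exception out of the callback. In the pull version, the caller simply stops requesting events, which is the property the thread attributes to the pull model.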