axis-java-dev mailing list archives

From axis-...@ws.apache.org
Subject [jira] Commented: (AXIS-1498) AXIS2: pluggable XML transformations (JavaBeans2XML, XSD with JaxMe/JAXB, Castor, XmlBeans, RelaxNG, ...)
Date Wed, 11 Aug 2004 04:06:20 GMT
The following comment has been added to this issue:

     Author: Srinath Perera
    Created: Tue, 10 Aug 2004 9:05 PM
       Body:
This is a discussion I had with Jochen from JaxMe on the jaxme-dev list. It is about Axis
and (pull parser vs. SAX).

Ideally, inside a SOAP engine the encoding work (serialization and
deserialization of the web service parameters) should be done by the
Java/XML data binding tool.

1) How is the encoding (Java<->XML) inside JaxMe implemented?
2) Does it load the complete XML into memory?
3) E.g., one ideal case would be a JaxMe implementation that uses XML pull
for encoding. (Is the JaxMe encoding extensible? I mean, if a different
encoding is needed, can the user extend the existing classes and provide
his own encoding?)

I am looking at the possibility of merging the two technologies, and any
thoughts are welcome.

Robert, I saw you looking in the same direction on the Axis dev mailing
list, so I cc'd you :) .
Thanks
Srinath
----------------------------------------------------------
>Ideally, inside a SOAP engine the encoding work (serialization and
>deserialization of the web service parameters) should be done by the
>Java/XML data binding tool.
>
>1) How is the encoding (Java<->XML) inside JaxMe implemented?
>
Reading and writing is based on SAX. The design was with SOAP request 
handling and similar stuff in mind.
>2) Does it load the complete XML into memory?
>  
>
It needs the object being serialized in memory. However, the design was 
written with streaming in mind: When parsing SAX events, you need 
nothing in memory, except the object being generated.

>3) E.g., one ideal case would be a JaxMe implementation that uses XML pull
>for encoding. (Is the JaxMe encoding extensible? I mean, if a different
>encoding is needed, can the user extend the existing classes and provide
>his own encoding?)
>  
>
XML pull parsers are a no. While it could technically be done, it
would impose the burden of supporting two *very* different technologies
(SAX and XML Pull) in one application. (The JAXB specification mandates
that SAX be supported.) However, I see no problem if Axis uses XML
pull: simply convert the tokens being pulled into SAX events.
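
The conversion of pulled tokens into SAX events might look roughly like the
following. This is only a minimal sketch under stated assumptions: it uses
StAX (javax.xml.stream) as the pull API rather than the XML Pull API discussed
in this thread, the class PullToSaxBridge and its methods are hypothetical
names, and namespace prefix mapping events, comments and processing
instructions are not forwarded.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical bridge: pulls StAX tokens and replays them as SAX events. */
final class PullToSaxBridge {

    /** Forwards the element the reader is positioned on, including its subtree. */
    static void forwardElement(XMLStreamReader in, ContentHandler out)
            throws XMLStreamException, SAXException {
        if (!in.isStartElement()) {
            throw new IllegalStateException("reader must be on a start tag");
        }
        int depth = 0;
        for (int event = in.getEventType(); ; event = in.next()) {
            switch (event) {
                case XMLStreamConstants.START_ELEMENT: {
                    AttributesImpl attrs = new AttributesImpl();
                    for (int i = 0; i < in.getAttributeCount(); i++) {
                        String local = in.getAttributeLocalName(i);
                        attrs.addAttribute(orEmpty(in.getAttributeNamespace(i)), local,
                                qName(in.getAttributePrefix(i), local), "CDATA",
                                in.getAttributeValue(i));
                    }
                    out.startElement(orEmpty(in.getNamespaceURI()), in.getLocalName(),
                            qName(in.getPrefix(), in.getLocalName()), attrs);
                    depth++;
                    break;
                }
                case XMLStreamConstants.END_ELEMENT:
                    out.endElement(orEmpty(in.getNamespaceURI()), in.getLocalName(),
                            qName(in.getPrefix(), in.getLocalName()));
                    if (--depth == 0) {
                        return; // the element we started on is closed
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    out.characters(in.getTextCharacters(), in.getTextStart(),
                            in.getTextLength());
                    break;
                default:
                    break; // other token types are ignored in this sketch
            }
        }
    }

    private static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    private static String qName(String prefix, String local) {
        return (prefix == null || prefix.isEmpty()) ? local : prefix + ":" + local;
    }

    /** Demo: collects the 'value' attribute of every 'string' element. */
    static List<String> collectValues(String xml) {
        List<String> values = new ArrayList<>();
        ContentHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes a) {
                if ("string".equals(local)) {
                    values.add(a.getValue("value"));
                }
            }
        };
        try {
            XMLStreamReader in = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            in.nextTag(); // advance to the root element
            forwardElement(in, handler);
        } catch (XMLStreamException | SAXException e) {
            throw new IllegalStateException(e);
        }
        return values;
    }
}
```

Because forwardElement returns as soon as the element it started on is
closed, the caller keeps control of the stream and can decide what to do with
the tokens that follow.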
Jochen
-----------------------------------------------------------------
>But with SAX inside Axis that is not true. Axis has two parts of data
>(Header/Body), and other processing has to happen between parsing the
>two. Since with SAX, once you have started, you have to go to the end, we
>have to record the SAX events ... which can be expensive. That is a reason
>for looking at pull.
>  
>

Sorry if I disagree. It is a straightforward and standard technique to
create (for example) one SAX parser for the SOAP header and the outer part
of the body, which creates other SAX parsers on demand and delegates
events to them. For example, JaxMe uses this technique to have one SAX
parser that detects the object type and another SAX parser, which is
invoked by the first and parses the actual XML. The same technique is
also used to have a common root object for an array of XML documents
being parsed.

I do not know whether Axis supports this. It definitely should, IMO.
However, given the situation with Axis and streaming output, I doubt it.
Jochen
---------------------------------------------
Thanks for pointing it out. I am quite sure Axis does not do this. (I have
not checked the code to verify it yet.)

1) If you do not mind, please direct me to a link or code that gives a bit
more info about the technique.
2) Don't these techniques involve other overheads? Just checking :)

And yet the pull parser is much more natural and gives complete control
of the parsing to Axis. Plus, it can start parsing even before the end
tag is available, which I believe to be a big plus in a network environment.

I believe it is faster than SAX, where we have to give control to the
parser in an artificial way, whereas a pull parser has it naturally.

Thanks
Srinath

------------------------------------------------------
>1) If you do not mind, please direct me to a link or code that gives a bit
>more info about the technique.
>  
>
I think an example serves best. See below.

>2) Don't these techniques involve other overheads? Just checking :)
>
>And yet the pull parser is much more natural and gives complete control
>of the parsing to Axis. Plus, it can start parsing even before the end
>tag is available, which I believe to be a big plus in a network environment.
>
>I believe it is faster than SAX, where we have to give control to the
>parser in an artificial way, whereas a pull parser has it naturally.
>  
>
In general, you have the same memory profile, the same overhead, and the
same speed as when using a pull parser. Pull parsers are said to be
slightly faster (at least Aleksander Slominski says so), but I am not
sure whether this isn't just because of the additional features that
mainstream SAX parsers like Xerces have.

SAX has a serious disadvantage: Writing a SAX parser is definitely more
complex than writing a pull parser. In particular, this applies to
manually written parsers. The situation is quite comparable to LL(k)
parsers vs. LR(k) parsers in compiler theory. On the other hand, SAX has
a serious advantage: It is *the* standard and a lot of tools (JaxMe being
just one of them) support it.

In general, I would say that the decision for pull parsers doesn't
forbid supporting SAX, at least not at the user interface level. For
example, one could very well parse the SOAP body and the SOAP header's
root tag using a pull parser. On demand of the user, one could let the
pull parser create a SAX handler internally and convert all following
tokens into SAX events, which are delegated to the handler, until the
SOAP header's end tag is found.

I'll try to sketch the idea with the following: Suppose a SOAP message
like this:

    <env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
        <env:Header>
            <t:message xmlns:t="..." handler="com.foo.StringArrayHandler"/>
        </env:Header>
        <env:Body>
            <m:string value="1"/>
            <m:string value="2"/>
            <m:string value="3"/>
            ...
            <m:string value="1000000"/>
        </env:Body>
    </env:Envelope>

Obviously, one would not like to put the whole message into memory.  The 
main idea of the whole thing is, that the StringArrayHandler is an 
implementation of org.xml.sax.ContentHandler. Let's have a look at the 
StringArrayHandler first:

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class StringArrayHandler extends DefaultHandler {
       @Override
       public void startElement(String pNamespaceURI, String pLocalName,
                                String pQName, Attributes pAttr) throws SAXException {
          if ("string".equals(pLocalName)) {
             // Print the 'value' attribute
             System.out.println("Value = " + pAttr.getValue("value"));
          }
       }
    }


A SAX handler for processing the message could look like this:

    import org.xml.sax.Attributes;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class SoapEnvelopeHandler extends DefaultHandler {
       private int level = 0;            // nesting depth of the current element
       private boolean inHeader, inBody;
       private ContentHandler handler;   // delegate named in the SOAP header

       @Override
       public void startElement(String pNamespaceURI, String pLocalName,
                                String pQName, Attributes pAttr) throws SAXException {
          switch (level++) {
             case 0: // Envelope tag, ignore it here
                break;
             case 1: // Header or Body tag
                if ("Header".equals(pLocalName)) {
                   inHeader = true;
                } else if ("Body".equals(pLocalName)) {
                   if (handler == null) {
                      throw new SAXException("No handler declared in the SOAP header");
                   }
                   inBody = true;
                   handler.startDocument();
                   handler.startElement(pNamespaceURI, pLocalName, pQName, pAttr);
                }
                break;
             case 2: // Direct child of Header or Body
                if (inHeader) {
                   if ("message".equals(pLocalName)) {
                      try {
                         // Instantiate the delegate named by the 'handler' attribute
                         handler = (ContentHandler) Class.forName(pAttr.getValue("handler"))
                               .getDeclaredConstructor().newInstance();
                      } catch (Exception e) {
                         throw new SAXException(e);
                      }
                   }
                } else if (inBody) {
                   handler.startElement(pNamespaceURI, pLocalName, pQName, pAttr);
                }
                break;
             default: // Deeper descendants: forward only inside the body
                if (inBody) {
                   handler.startElement(pNamespaceURI, pLocalName, pQName, pAttr);
                }
                break;
          }
       }

       @Override
       public void endElement(String pNamespaceURI, String pLocalName,
                              String pQName) throws SAXException {
          if (inBody) {
             handler.endElement(pNamespaceURI, pLocalName, pQName);
          }
          if (--level == 1) { // Header or Body has just been closed
             if (inBody) {
                handler.endDocument();
             }
             inHeader = inBody = false;
          }
       }

       @Override
       public void characters(char[] pBuffer, int pOffset, int pLen) throws SAXException {
          if (inBody) {
             handler.characters(pBuffer, pOffset, pLen);
          }
       }
    }

The example is quite typical, using state variables and the like.
Writing an equivalent pull parser is trivial, much shorter and more
readable. However, it *can* be done with SAX. And once you have got it
working with SAX, you're the winner.

Finally, note again, that the use of the "handler" can very well be done 
by a pull parser as well.


Jochen
--------------------------------------------------
I need to ask one question. I need to
1) parse the SOAP headers
2) pause the parser
3) process the SOAP headers
4) parse the SOAP body
5) process the SOAP body

To me, step #2 (pause) is the critical point. I do not grasp how it
happens in the example you have given. (This is needed because headers and
body need to be processed at two places, and if we cannot do it we need to
record the SAX events, and then we are in trouble.)
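
For comparison, the five steps above could be sketched with a pull parser
roughly as follows. This is a minimal sketch under stated assumptions, not
what Axis or JaxMe does: it uses StAX (javax.xml.stream) as the pull API, the
class TwoPhaseSoapReader and its methods are hypothetical, and it assumes the
envelope contains a Header followed by a Body. The "pause" is simply the
caller not asking for further tokens.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical two-phase reader: the caller decides when parsing resumes. */
final class TwoPhaseSoapReader {
    private final XMLStreamReader in;

    TwoPhaseSoapReader(String xml) throws XMLStreamException {
        in = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        in.nextTag(); // now positioned on the Envelope start tag
    }

    /** Steps 1 and 2: read the Header, then stop right after its end tag. */
    List<String> readHeaders() throws XMLStreamException {
        return readSection("Header", "handler");
    }

    /** Step 4: resume where readHeaders() stopped and read the Body. */
    List<String> readBody() throws XMLStreamException {
        return readSection("Body", "value");
    }

    /** Collects the given attribute from each direct child of the section. */
    private List<String> readSection(String section, String attribute)
            throws XMLStreamException {
        in.nextTag(); // advance to the section's start tag
        in.require(XMLStreamConstants.START_ELEMENT, null, section);
        List<String> values = new ArrayList<>();
        int depth = 1;
        while (depth > 0) {
            int event = in.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                if (depth == 1) {
                    values.add(in.getAttributeValue(null, attribute));
                }
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        return values; // the reader now rests on the section's end tag
    }

    /** Demo of the five steps; returns a log of what was processed and when. */
    static List<String> demo(String xml) {
        try {
            TwoPhaseSoapReader reader = new TwoPhaseSoapReader(xml);
            List<String> log = new ArrayList<>();
            List<String> headers = reader.readHeaders(); // steps 1 and 2
            log.add("headers processed: " + headers);    // step 3, no tokens pulled here
            List<String> values = reader.readBody();     // step 4
            log.add("body processed: " + values);        // step 5
            return log;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Nothing has to be recorded between the two phases, because the reader keeps
its position in the stream until readBody() is called.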

Thanks for thoughts :)
Srinath




---------------------------------------------------------------------
View this comment:
  http://issues.apache.org/jira/browse/AXIS-1498?page=comments#action_37156

---------------------------------------------------------------------
View the issue:
  http://issues.apache.org/jira/browse/AXIS-1498

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: AXIS-1498
    Summary: AXIS2: pluggable XML transformations (JavaBeans2XML, XSD with JaxMe/JAXB, Castor,
XmlBeans, RelaxNG, ...)
       Type: Wish

     Status: Unassigned
   Priority: Major

    Project: Axis
 Components: 
             Basic Architecture

   Assignee: 
   Reporter: Aleksander Slominski

    Created: Fri, 6 Aug 2004 10:36 AM
    Updated: Tue, 10 Aug 2004 9:05 PM
Environment: ALL

Description:
Allow plugging in different XML<->Java object model transformations (and maybe also
allow them to be pipelined!).

That would make XML processing much more flexible and would allow users to choose how their
XML messages should be processed.



---------------------------------------------------------------------
JIRA INFORMATION:
This message is automatically generated by JIRA.

If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa

If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira

