axis-java-dev mailing list archives

From "Eran Chinthaka" <chinth...@opensource.lk>
Subject RE: [Axis2] Binary Serialisation
Date Sun, 31 Jul 2005 07:05:22 GMT
Hi XML Gurus, 

Can I ask a novice question about binary XML?

Are there any existing implementations of binary protocols with StAX support?
In other words, are there any implementations that provide StAX events on
a binary stream?
I remember Mark Pimentel once mentioned something like that (was it Fast
Infoset?).
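
To make the question concrete, here is a minimal sketch of what "StAX events
on a binary stream" would mean for consuming code. It uses only the standard
javax.xml.stream API on text XML; the class name StaxEventDump and the sample
document are made up for illustration. The assumption is that a binary
implementation would hand back its own XMLStreamReader, so that
collectEvents(XMLStreamReader) would work on it unchanged. No real binary
parser is shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
class StaxEventDump {

    // Walks any XMLStreamReader and records the events it reports.
    // Nothing here depends on whether the reader decodes text or binary input.
    static List<String> collectEvents(XMLStreamReader reader) throws XMLStreamException {
        List<String> events = new ArrayList<>();
        while (reader.hasNext()) {
            int event = reader.next();
            switch (event) {
                case XMLStreamConstants.START_ELEMENT:
                    events.add("START " + reader.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                    if (!reader.isWhiteSpace()) {
                        events.add("TEXT " + reader.getText());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    events.add("END " + reader.getLocalName());
                    break;
                default:
                    break; // other events (comments, end of document) are ignored
            }
        }
        return events;
    }

    // Text XML entry point; a binary format would need its own way
    // of creating the XMLStreamReader (assumption, not shown).
    static List<String> collectEvents(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Report adjacent character data as a single CHARACTERS event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                return collectEvents(reader);
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String event : collectEvents("<order><qty>42</qty></order>")) {
            System.out.println(event);
        }
    }
}
```

If a binary reader plugged in at that level, the rest of the stack would only
ever see the event loop above.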

If there is one, can we integrate it into Axis2 and see how much of a
performance improvement we gain from it? I'd love to integrate at least one
of the existing implementations before Mark finishes his project.

-- Eran Chinthaka 

> -----Original Message-----
> From: Dennis Sosnoski [mailto:dms@sosnoski.com]
> Sent: Friday, July 29, 2005 10:56 AM
> To: axis-dev@ws.apache.org
> Subject: Re: [Axis2] Binary Serialisation
> 
> Looks like we've got a thread going, Eran!
> 
> Dan, I don't think anyone has done a performance analysis for a typed
> parser as such. It'd really need to be done in the context of some sort
> of data binding framework to be meaningful. The only thing which has
> been done along these lines that I'm aware of is Sun's "FAST Web
> Services", which merged mutant forms of JAXB and JAX-RPC so that they
> could do binary input/output. In their case they used ASN.1
> encoding/decoding of the binary data, with the ASN.1 representation
> generated from an XML Schema.
> 
> They saw much faster performance than the conventional JAX-RPC code.
> But, my own JibxSoap (a subproject of JiBX, http://www.jibx.org)
> delivers performance that appears to be about as good while still using
> standard text XML. I say "appears to be" because at the time I did the
> web services performance comparisons
> (http://www.sosnoski.com/presents/cleansoap/comparing.html) the Sun
> stuff was all proprietary. They've since opened it up on java.net, I
> think, though I don't know what kind of license restrictions might apply.
> 
> My own gut feeling is that if I used a typed parser interface for binary
> input/output with JiBX/JibxSoap I could probably get 2-2.5 X the
> processing speed of text (vs. probably about 1.4-1.8 X with my XBIS
> binary XML format, which still keeps values as text and can be
> translated to and from the text representation).
> 
> There are actually some other areas where parser usability could be
> improved, though, besides implementing a typed interface. I think
> implementing a parser that supplied element and attribute names as
> singleton QName objects of some form (rather than separate namespace
> URI, local name, and qualified name text values) would be a big gain,
> for instance. The text APIs could also be better designed; in the case
> of the StAX XMLStreamReader, rather than returning an array plus start offset
> plus length for element content, all using separate method calls, it'd
> be cleaner to just return the equivalent of a JDK 1.5 CharSequence
> (which could be reusable). Likewise on the attribute values, where StAX
> returns Strings. Returning CharSequence-equivalents would not only avoid
> unnecessary String creation (in the case of attribute values), it would
> also eliminate the need to translate the raw byte stream to character
> arrays for common encodings (especially the UTF-8 and UTF-16 used in
> BP-compliant web services).
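
The text-access point above can be sketched roughly as follows. The three
StAX calls (getTextCharacters, getTextStart, getTextLength) are the real
XMLStreamReader methods; TextSlice and SliceDemo are hypothetical names
invented here, not part of StAX, JiBX or Axis2. The sketch only shows the
shape of a reusable CharSequence view and a value being read from it without
an intermediate String. It still runs on a text parser, so it does not show
the byte-to-character savings mentioned in the paragraph.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: a reusable CharSequence view over a parser's character buffer.
final class TextSlice implements CharSequence {
    private char[] buf;
    private int start;
    private int len;

    // Re-point the same object at a new region instead of allocating a new one.
    TextSlice reset(char[] buf, int start, int len) {
        this.buf = buf;
        this.start = start;
        this.len = len;
        return this;
    }

    @Override public int length() { return len; }

    @Override public char charAt(int i) {
        if (i < 0 || i >= len) throw new IndexOutOfBoundsException();
        return buf[start + i];
    }

    @Override public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > len || from > to) throw new IndexOutOfBoundsException();
        return new TextSlice().reset(buf, start + from, to - from);
    }

    // Only this call creates a String.
    @Override public String toString() { return new String(buf, start, len); }
}

// Hypothetical demo class, for illustration only.
class SliceDemo {

    // Parses a non-negative decimal int straight from the view.
    static int parseInt(CharSequence cs) {
        if (cs.length() == 0) throw new NumberFormatException("empty");
        long value = 0;
        for (int i = 0; i < cs.length(); i++) {
            char c = cs.charAt(i);
            if (c < '0' || c > '9') throw new NumberFormatException("not a digit: " + c);
            value = value * 10 + (c - '0');
            if (value > Integer.MAX_VALUE) throw new NumberFormatException("overflow");
        }
        return (int) value;
    }

    // Reads the first non-whitespace text content and parses it as an int.
    static int firstIntValue(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader r = factory.createXMLStreamReader(new StringReader(xml));
            TextSlice slice = new TextSlice();
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.CHARACTERS && !r.isWhiteSpace()) {
                    // Three separate calls today; one CharSequence for the consumer.
                    slice.reset(r.getTextCharacters(), r.getTextStart(), r.getTextLength());
                    return parseInt(slice);
                }
            }
            throw new IllegalStateException("no text content found");
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, SliceDemo.firstIntValue("<qty>42</qty>") returns 42 without
calling getText() on the reader.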
> 
> Unfortunately, I think developers sometimes misapply Knuth's (or Hoare's
> - I'm not sure who got this started) "premature optimization is the root
> of all evil" aphorism by designing APIs without any thought to
> performance. Once performance bottlenecks have been built into the APIs
> it's very difficult to get around them without scrapping things and
> starting over.
> 
>   - Dennis
> 
> Dan Diephouse wrote:
> 
> > Has anyone done any performance tests (binary or just plain text) with
> > the typed stax stuff? Does it really make a difference?
> > - Dan
> >
> > Eran Chinthaka wrote:
> >
> >> Hi Dennis,
> >>
> >> You commented on the typed pull parser in the wiki. Shall we start a
> >> thread about it here?
> >>
> >> -- EC
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Apache Wiki [mailto:wikidiffs@apache.org]
> >>> Sent: Thursday, July 28, 2005 10:31 PM
> >>> To: general@ws.apache.org
> >>> Subject: [Ws Wiki] Update of
> >>> "FrontPage/Axis2/Tasks/BinarySerialization"
> >>> by DennisSosnoski
> >>>
> >>> Dear Wiki user,
> >>>
> >>> You have subscribed to a wiki page or wiki category on "Ws Wiki" for
> >>> change notification.
> >>>
> >>> The following page has been changed by DennisSosnoski:
> >>> http://wiki.apache.org/ws/FrontPage/Axis2/Tasks/BinarySerialization
> >>>
> >>> --------------------------------------------------------------------------
> >>>
> >>> ----
> >>>  decoding the binary into an int, converting to a string for the parser
> >>>  API and then back to an int in the deserialisation code.
> >>>
> >>> + I (DennisSosnoski) would personally disagree with the above assessment.
> >>> A typed pull parser would definitely be nice, but even without this you
> >>> can get substantial size and performance gains from a binary format. See
> >>> my articles on devWorks at
> >>> http://www-128.ibm.com/developerworks/xml/library/x-trans1.html and
> >>> http://www-128.ibm.com/developerworks/xml/library/x-trans2/index.html
> >>> for examples.
> >>> +
> >>>
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >



