axis-java-dev mailing list archives

From "Ias" <>
Subject RE: UTF8Encoder question...
Date Tue, 28 Dec 2004 16:53:34 GMT

	From: Jongjin Choi [] 
	Sent: Tuesday, December 28, 2004 11:56 AM
	Subject: UTF8Encoder question...
	Dims and all, 
	UTF8Encoder writes an escaped string when a character is above 0x7F. 
	The escaping does not seem to be necessary because 
	a Writer (not an OutputStream) is used. 
	I think this could be just : (line 86)
	instead of : (line 86 ~ 88)
	The escaping just increases the message size.

Yes, it does. However, I think representing a character whose code point
is above 0x7F as an &#x XML character reference is one of the aims of the
encoder, because some systems can't display such characters properly when
they have no Unicode-capable fonts installed. If it is 100% certain that
every node in a messaging system can handle characters represented "as is"
in an XML instance, it would be much more efficient to use a compact
encoder, as you pointed out, instead of UTF8Encoder. Interestingly,
AbstractXMLEncoder (which is not instantiable) works that way.
Consequently, it would be a good idea to create a new encoder that
optimizes message size and is easy to configure. (Yes, we could recommend
it to users working with non-Latin character sets :-)
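
To make the size trade-off concrete, here is a minimal sketch (not the
actual UTF8Encoder code). The class EscapeSketch and its methods
escapeNonAscii and utf8Length are hypothetical names made up for
illustration, and the sketch ignores the usual escaping of &, < and >.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not code from Axis.
public class EscapeSketch {

    /** Replaces every code point above 0x7F with an &#x...; reference. */
    public static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by one or two chars, depending on surrogate pairs.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    /** Number of bytes the text occupies once encoded as UTF-8. */
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00E9";              // "cafe" with e-acute
        String escaped = escapeNonAscii(text);  // "caf&#xE9;"
        System.out.println(escaped);
        System.out.println(utf8Length(text) + " bytes as-is, "
                + utf8Length(escaped) + " bytes escaped");
    }
}
```

For this input the escaped form is pure ASCII but takes 9 bytes instead of
5, and the gap grows with the share of non-ASCII characters in a message.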

Happy new year,


P.S. I'm going to switch to (soon,
very soon).

	If the OutputStream is used, the escaping or UTF-8 conversion (which
existed in old will be needed.
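
The OutputStream case can be sketched as follows, assuming the bytes are
meant to be UTF-8. StreamSketch and toUtf8 are hypothetical names, not
Axis code; the point is only that characters must be converted to bytes
somewhere, here by an OutputStreamWriter with an explicit charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not code from Axis.
public class StreamSketch {

    /** Writes text to a byte stream through a UTF-8 encoding Writer. */
    public static byte[] toUtf8(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            out.write(text); // no escaping: the Writer does the conversion
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] encoded = toUtf8("caf\u00E9");
        System.out.println(encoded.length + " bytes written");
    }
}
```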
