db-derby-dev mailing list archives

From Knut Anders Hatlen <Knut.Hat...@Sun.COM>
Subject Re: Client decoding of server Fdoca data, which direction do we want to go?
Date Fri, 10 Feb 2006 09:04:59 GMT
Kathey Marsden <kmarsdenderby@sbcglobal.net> writes:

> For DERBY-900, Remove use of String(byte[]) and String(byte[], int, int)
> constructors in network client leading to non-portable behaviour
> I am looking at  this method in org.apache.derby.client.am.Sqlca to
> create the string with the proper encoding.
> private String bytes2String(byte[] bytes, int offset, int length)
>             throws java.io.UnsupportedEncodingException {
>         return new String(bytes, offset, length);
>     }
> In this case the Sqlca has read a ccsid that it stores and that needs to
> get translated into a java encoding in order to create the String
> properly.  Client is not currently equipped to make a translation, but
> in fact, that translation is always going to turn out to be "UTF-8" because
> the server always sends Fdoca data in UTF-8 encoding.  I could easily
> fix the bug by just hard coding "UTF-8"  in there,  but I  think that,
> as Dan pointed out, the client is being a bit deceptive about what it
> knows by passing the encoding and ccsid around the way it does and of
> course always in the end coming up with the "UTF-8" answer (or in this
> case coming up with no answer at all and having a bug).
> The big question I guess in deciding how to fix this bug is:  What
> direction do we want to go with client decoding the Fdoca data?  
> 1) We can have a client that fesses up that it knows the answer. In that
> case I'd say we add static variables to Configuration.java for the
> server encoding, reference it in this case and file a Jira to clean up a
> lot of unneeded, uncovered, and potentially buggy code in client.
> 2) We have a complete DRDA AR that knows how to do all the proper
> translations, which means we bring the CharacterEncodings class in from
> Network Server to fix this bug and start adding code into client to do
> all the translations properly.
> After looking at this for a while, I  think I would vote for 1, even
> though I fixed DERBY-877 going in the other  direction.  I think having
> a lot of code that can never be covered is not good.  Derby Client is
> for Derby and should be optimized for that and can be made smaller,
> cleaner, and less deceptive even if it supports a smaller subset of DRDA.

Hi Kathey, if I understand you correctly, your question is: "Can we
assume that the Derby client driver is talking to a Derby network
server?" If we answer yes to that question (which I think we should),
I would go for option 1 as long as the network server always sends
Fdoca as UTF-8.
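
To make option 1 concrete, here is a minimal sketch of what the fix could
look like, assuming the server really does always send UTF-8. The class
name FdocaDecodeSketch and the constant SERVER_ENCODING are hypothetical
(they stand in for whatever would be added to Configuration.java), and the
sketch uses a Charset object rather than an encoding name, so it is not the
actual Sqlca code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FdocaDecodeSketch {
    // Hypothetical stand-in for a static field in Configuration.java:
    // the one encoding the client assumes the server uses for Fdoca data.
    static final Charset SERVER_ENCODING = StandardCharsets.UTF_8;

    // Same shape as Sqlca.bytes2String, but the charset is explicit
    // instead of being the platform default.
    static String bytes2String(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, SERVER_ENCODING);
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, encoded the way the server would send it.
        byte[] wire = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String decoded = bytes2String(wire, 0, wire.length);
        System.out.println(decoded.equals("caf\u00e9"));
    }
}
```

With the charset spelled out, the result no longer depends on the default
encoding of the JVM the client happens to run in.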

In my opinion, there is one additional question we need to ask: Do we
always want the network server to send Fdoca as UTF-8, or should it
sometimes use another encoding? Issues with performance or
functionality might make us want to support other encodings. For
instance, UTF-16 is a more compact choice for Chinese text because
most Chinese characters need only two bytes in UTF-16 instead of the
three bytes they need in UTF-8. If we think it's likely that we are
going to support more encodings, option 2 sounds like a better choice.
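
The size difference is easy to check with a small sketch. The class and
method names here are only for illustration, and UTF-16BE is used so that
no byte-order mark is counted. Note that the trade-off goes the other way
for ASCII text.

```java
import java.nio.charset.StandardCharsets;

class EncodingSizeSketch {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies as UTF-16 (big-endian, no BOM).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two Chinese characters
        String ascii = "abc";
        System.out.println(utf8Length(chinese) + " vs " + utf16Length(chinese)); // 6 vs 4
        System.out.println(utf8Length(ascii) + " vs " + utf16Length(ascii));     // 3 vs 6
    }
}
```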

Knut Anders
