db-derby-dev mailing list archives

From Daniel John Debrunner <...@debrunners.com>
Subject Re: [jira] Commented: (DERBY-525) getAsciiStream should replace non-ASCII characters with 0x3f, '?' to match embedded
Date Thu, 22 Sep 2005 15:26:30 GMT
Bernt M. Johnsen wrote:

>>>>>>>>>>>>>Daniel John Debrunner (JIRA) wrote (2005-09-22
>>    [ http://issues.apache.org/jira/browse/DERBY-525?page=comments#action_12330193 ]
>>Daniel John Debrunner commented on DERBY-525:
>>See this link for the justifications on why getAsciiStream() uses 8 bits and not 7.
>>Basically, it's based upon definitions from the JDBC spec.
> Ok. But if you map Unicode characters in the range 0x0000-0x00ff to
> 1-byte values without some translation, you get ISO-8859-1 characters,
> not ASCII characters (which only covers the values 0x00-0x7f). I guess
> it's user-friendly, but then the userdoc should explicitly explain
> what is done in a way that is understandable to people who happen to
> know exactly what the different standards define (Europeans and Asians
> tend to be somewhat better educated in this than people from the
> US.... for obvious reasons).

Hey, don't blame me, first I'm not from the US and secondly, this
behaviour is defined by JDBC (and not clearly at that). :-)

To quote JDBC 3.0:

CHAR(code) Character with ASCII code value code, where code is between 0
and 255

So JDBC defines ASCII as codes 0-255, 8 bit, and since this is a JDBC
function we need to follow the JDBC spec.
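
A small Java sketch of what that 0-255 reading implies, for illustration
only. It uses just the standard java.nio.charset classes, not Derby or
JDBC code; the class and method names (CodeRangeSketch, encodeOne,
countExact) are made up for this example. It counts how many of the code
values 0-255 survive a one-byte encoding unchanged under ISO-8859-1 and
under strict US-ASCII:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CodeRangeSketch {
    // Encode the single character with the given code value and return
    // the first resulting byte as an unsigned int (0-255).
    static int encodeOne(int code, Charset charset) {
        byte[] bytes = String.valueOf((char) code).getBytes(charset);
        return bytes[0] & 0xff;
    }

    // Count the code values in 0-255 whose encoded byte equals the code itself.
    static int countExact(Charset charset) {
        int n = 0;
        for (int code = 0; code <= 255; code++) {
            if (encodeOne(code, charset) == code) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // ISO-8859-1 maps every code value 0-255 to the same byte value.
        System.out.println(countExact(StandardCharsets.ISO_8859_1)); // 256
        // US-ASCII only covers 0-127; the rest become the replacement byte.
        System.out.println(countExact(StandardCharsets.US_ASCII));   // 128
        // U+00E9 is unmappable in US-ASCII, so getBytes() substitutes '?'.
        System.out.printf("0x%02x%n", encodeOne(0xe9, StandardCharsets.US_ASCII)); // 0x3f
    }
}
```

So a "0 to 255" definition lines up with ISO-8859-1, while a strict 7-bit
encoder produces 0x3f ('?') for everything above 0x7f.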

Technically, getAsciiStream() does *not* convert to ASCII characters;
it converts to encoded bytes that can in turn be decoded as ASCII or
ISO-8859-1 using a character encoding. Ideally, I think Sun should have
deprecated this method when getCharacterStream() was added to JDBC; the
same (and clearer) functionality would then have been provided using
standard Java character encoding.
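
To make the "encoded bytes" point concrete, here is a rough sketch under
stated assumptions. There is no database involved: fakeAsciiStream() is a
hypothetical stand-in for the InputStream a driver might return from
ResultSet.getAsciiStream(), assuming one byte per character for values in
the range 0x00-0xff. The decode() helper is likewise invented for this
example. The same bytes are then decoded with two standard Java charsets:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AsciiStreamDecodeSketch {
    // Hypothetical stand-in for ResultSet.getAsciiStream(): one byte per
    // character, assuming every character is in the range 0x00-0xff.
    static InputStream fakeAsciiStream(String columnValue) {
        return new ByteArrayInputStream(
                columnValue.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Decode the whole stream with the given charset using a standard Reader.
    static String decode(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String value = "caf\u00e9"; // ends in U+00E9, outside 7-bit ASCII

        String latin1 = decode(fakeAsciiStream(value), StandardCharsets.ISO_8859_1);
        String ascii = decode(fakeAsciiStream(value), StandardCharsets.US_ASCII);

        System.out.println(latin1.equals(value)); // true: round-trips exactly
        System.out.println(ascii.equals(value));  // false: 0xE9 is not valid US-ASCII
    }
}
```

Decoding as ISO-8859-1 recovers the original string, whereas a strict
US-ASCII decoder replaces the 0xE9 byte with U+FFFD. With a character
stream, the caller gets characters directly and no decoding step is needed.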

Or maybe calling it getISO8859_1Stream() would have been a better name!

