db-derby-dev mailing list archives

From "Kristian Waagan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2346) Provide set methods for clob for embedded driver
Date Wed, 11 Apr 2007 21:23:32 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488197

Kristian Waagan commented on DERBY-2346:

Regarding the UTF-8 char -> byte -> char conversion using String methods, I don't think
it is a bug. Unmappable "chars" are represented by '?' (0x3f / 63).
In the snippet above, (char)56249 (0xdbb9) happens to be in a PUA area; more precisely, it is a
high surrogate from the private-use range 0xdb80-0xdbff. The Unicode standard does not define any
characters for these codepoints, and a surrogate that is not part of a pair cannot be encoded as UTF-8.
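
For reference, here is a minimal sketch of the round trip being discussed. It is not the snippet
from the earlier comment (which is not reproduced in this message); the class and method names are
made up for illustration, and it assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Derby.
final class LoneSurrogateRoundTrip {

    // char -> byte -> char conversion using only String methods.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The value from the discussion: 56249 == 0xdbb9, an unpaired surrogate.
        String original = String.valueOf((char) 56249);
        String copy = roundTrip(original);

        // String.getBytes substitutes the charset's replacement byte for
        // input it cannot encode, so the copy no longer equals the original.
        System.out.println("equal after round trip: " + original.equals(copy));
        System.out.println("decoded char value:     " + (int) copy.charAt(0));
    }
}
```

With a lone surrogate as input, the comparison should print false and the decoded value should be
63, i.e. '?'.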

You could use DataOutput/DataInput and writeUTF/readUTF, but I don't know how efficient this
would be. These methods write the strings in the modified UTF-8 format, and with them the equals
in the example above returns true. I think writing your own method would be acceptable, but it would
be interesting if anyone took the time to investigate the CPU/space differences (i.e. what
kind of stream can we use underneath? ByteArrayOutputStream? A subclass of it that returns a
reference to the byte array?)
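
A rough sketch of the writeUTF/readUTF alternative, using a plain ByteArrayOutputStream underneath.
The class and method names are hypothetical, and this says nothing about how Derby should actually
implement it. Two things to keep in mind: writeUTF prefixes the data with a two-byte length and
rejects strings whose encoded form exceeds 65535 bytes, and toByteArray() returns a copy of the
buffer, which is the overhead a subclass exposing the internal array would avoid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration class, not part of Derby.
final class ModifiedUtf8RoundTrip {

    // Writes the string in modified UTF-8 (two-byte length prefix + data).
    static byte[] encode(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // copies the internal array
    }

    // Reads back a string written by encode().
    static String decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = String.valueOf((char) 56249); // 0xdbb9
        byte[] encoded = encode(original);
        // Each char is encoded on its own, so the lone surrogate survives.
        System.out.println("encoded length: " + encoded.length);
        System.out.println("equal after round trip: " + original.equals(decode(encoded)));
    }
}
```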

Even though the example uses a "very special codepoint", the database should handle it. An
application could potentially use it for its own custom character (not quite sure how though).
Further, it seems the "UTF-8" encoding (as used in String.getBytes()) does not promise to
encode all unsigned 16-bit values, but only valid Unicode characters.
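
One way to check this for a given value is to ask the charset's encoder directly. The sketch below
uses CharsetEncoder.canEncode(char), which is documented to return false for a surrogate on its
own; the class name and the sample values other than 56249 are my own choices for illustration.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Derby.
final class Utf8Encodability {

    // True if the single char can be encoded as UTF-8 without a partner char.
    static boolean encodableAlone(char c) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        return encoder.canEncode(c);
    }

    public static void main(String[] args) {
        char[] samples = {
            'A',            // ordinary character
            (char) 0xE000,  // first codepoint of the BMP private use area
            (char) 56249    // 0xdbb9, the unpaired surrogate from the discussion
        };
        for (char c : samples) {
            System.out.println(Integer.toHexString(c) + " encodable alone: " + encodableAlone(c));
        }
    }
}
```

This separates the two cases: a private use codepoint such as 0xE000 is still encodable, while an
unpaired surrogate is not.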

I'm not very good with the Unicode terminology, so there might be errors in my comment and
maybe important additions. Feel free to correct me.

> Provide set methods for clob for embedded driver
> ------------------------------------------------
>                 Key: DERBY-2346
>                 URL: https://issues.apache.org/jira/browse/DERBY-2346
>             Project: Derby
>          Issue Type: Sub-task
>          Components: JDBC
>    Affects Versions:
>            Reporter: Anurag Shekhar
>         Assigned To: Anurag Shekhar
>         Attachments: derby-2346-only_for_review.diff, derby-2346.v1.diff

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
