db-derby-dev mailing list archives

From "Rick Hillegas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2694) org.apache.derby.impl.drda.DDMWriter uses wrong algorithm to avoid spliting varchar in the middle of a multibyte char.
Date Thu, 31 May 2007 23:41:15 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500527 ]

Rick Hillegas commented on DERBY-2694:
--------------------------------------

Hi, Anurag. I think that the patch does the right thing. However, it's a little tricky to
read, and I think that the following approach is easier to understand. What do you think? I'm
not an expert on UTF-8 encoding, but I found the following web page useful: http://www.unix.org.ua/orelly/java/fclass/appb_01.htm

private static final byte MULTI_BYTE_MASK = (byte) 0xC0;
private static final byte CONTINUATION_BYTE = (byte) 0x80;

if (writeLen != origLen) // if we're truncating the string
{
    while ( isContinuationChar( byteval[ writeLen ] ) ) { writeLen--; }

    //
    // Now byteval[ writeLen ] is either a standalone 1-byte char
    // or the first byte of a multi-byte character. That means that
    // byteval[ writeLen - 1 ] is the last (perhaps only) byte of the
    // previous character.
    //
}

private boolean isContinuationChar( byte b )
{    
    return ( (b & MULTI_BYTE_MASK) == CONTINUATION_BYTE );
}
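
For anyone who wants to try the idea outside of DDMWriter, here is a minimal standalone sketch of the same back-up-over-continuation-bytes approach. The class name Utf8TruncateSketch and the method safeLength are hypothetical and are not part of the Derby code or the patch. The sketch also assumes valid UTF-8 input and a non-negative maxLen, uses int constants rather than byte constants, and adds a lower-bound guard on the loop that the fragment above does not have.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch; not the DDMWriter code.
public class Utf8TruncateSketch {
    // The top two bits of a UTF-8 continuation byte are always 10.
    private static final int MULTI_BYTE_MASK = 0xC0;
    private static final int CONTINUATION_BYTE = 0x80;

    static boolean isContinuationByte(byte b) {
        return (b & MULTI_BYTE_MASK) == CONTINUATION_BYTE;
    }

    // Returns the largest length <= maxLen such that bytes[0..length)
    // does not end in the middle of a multi-byte character.
    // Assumes bytes holds valid UTF-8 and maxLen >= 0.
    static int safeLength(byte[] bytes, int maxLen) {
        if (maxLen >= bytes.length) {
            return bytes.length; // nothing is being truncated
        }
        int len = maxLen;
        // bytes[len] is the first byte that would be left out. If it is a
        // continuation byte, the cut falls inside a character, so back up
        // until bytes[len] starts a character.
        while (len > 0 && isContinuationByte(bytes[len])) {
            len--;
        }
        return len;
    }

    public static void main(String[] args) {
        // 'a' (1 byte), the euro sign U+20AC (3 bytes), 'b' (1 byte)
        byte[] bytes = "a\u20ACb".getBytes(StandardCharsets.UTF_8);
        for (int max = 0; max <= bytes.length; max++) {
            System.out.println(max + " -> " + safeLength(bytes, max));
        }
    }
}
```

With this input, a requested length of 2 or 3 would land inside the 3-byte euro sign, so the sketch backs up to 1 in both cases; a requested length of 4 falls on a character boundary and is left alone.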

> org.apache.derby.impl.drda.DDMWriter uses wrong algorithm to avoid spliting varchar in
the middle of a multibyte char.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2694
>                 URL: https://issues.apache.org/jira/browse/DERBY-2694
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>         Environment: all
>            Reporter: Anurag Shekhar
>            Assignee: Anurag Shekhar
>             Fix For: 10.3.0.0
>
>         Attachments: derby-2694-v2.diff, derby-2694.diff, TestProc.java, TestProc_TruncateRep.java
>
>
> org.apache.derby.impl.drda.DDMWriter uses the wrong algorithm to avoid splitting a varchar
in the middle of a multibyte char.
> When DDMWriter finds that it has to split a varchar while sending it to the client, it checks
whether the last byte is part of a multibyte char. If it is, DDMWriter tries to find the last
byte of the previous char and sends only up to that byte, leaving the rest for the next send.
> The code that does this has a bug: it fails when the last byte it is checking is the
third byte of a 3-byte char.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

