harmony-commits mailing list archives

From "Mikhail Markov (JIRA)" <j...@apache.org>
Subject [jira] Created: (HARMONY-4117) [classlib][nio_char] UTF-8 CharsetDecoder.decode() method incorrectly works in sequence of invocations
Date Sat, 09 Jun 2007 12:15:25 GMT
[classlib][nio_char] UTF-8 CharsetDecoder.decode() method incorrectly works in sequence of
invocations
------------------------------------------------------------------------------------------------------

                 Key: HARMONY-4117
                 URL: https://issues.apache.org/jira/browse/HARMONY-4117
             Project: Harmony
          Issue Type: Bug
          Components: Classlib
         Environment: WinXP
            Reporter: Mikhail Markov


The CharsetDecoder.decode() method for the UTF-8 charset works incorrectly across a sequence of invocations.
Here is a test demonstrating the problem:

----------- Test.java -----------
import java.nio.charset.*;
import java.nio.*;

public class Test {
    public static void main(String[] args) {
        byte[] buf = new byte[] { -30, -117, -109, -30, -117 };
        byte[] buf1 = new byte[] { -30, -117, -108, -30, -117 };
        ByteBuffer bbuf;
        CharsetDecoder dec = Charset.forName("UTF-8").newDecoder().onMalformedInput(
                CodingErrorAction.REPLACE).onUnmappableCharacter(
                CodingErrorAction.REPLACE);
        CharBuffer cbuf = CharBuffer.allocate(10);

        // decode the 1st sequence; the last 2 bytes should be left undecoded
        bbuf = ByteBuffer.wrap(buf);
        dec.decode(bbuf, cbuf, false);
        System.out.println("bbuf = " + bbuf);

        // dec.reset();

        // decode the 2nd sequence; the last 2 bytes should be left undecoded
        bbuf = ByteBuffer.wrap(buf1);
        dec.decode(bbuf, cbuf, false);
        System.out.println("bbuf = " + bbuf);
    }
}
---------------------------------------

Output on RI:
bbuf = java.nio.HeapByteBuffer[pos=3 lim=5 cap=5]
bbuf = java.nio.HeapByteBuffer[pos=3 lim=5 cap=5]

Output on Harmony:
bbuf = java.nio.ReadWriteHeapByteBuffer, status: capacity=5 position=3 limit=5
bbuf = java.nio.ReadWriteHeapByteBuffer, status: capacity=5 position=5 limit=5

So the second decode() call exhausted the input ByteBuffer, while 2 undecoded bytes should remain.
If we uncomment the // dec.reset(); line, the output on RI remains the same, and the output on Harmony
becomes:
bbuf = java.nio.ReadWriteHeapByteBuffer, status: capacity=5 position=3 limit=5
bbuf = java.nio.ReadWriteHeapByteBuffer, status: capacity=5 position=3 limit=5

It seems that after the first decode() invocation, the UTF-8 CharsetDecoder retains some internal
data that it uses for the next decode() call, and resetting the decoder discards this data. Note
that I could reproduce this only for the UTF-8 decoder; for other charsets, similar sequences work
fine.
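
For regression purposes, the same scenario can be written as a self-checking sketch. This is
only an illustration built on the test above, not part of the original report: the class name
Utf8SequentialDecodeCheck and the helper positionsAfterDecodes() are hypothetical. It relies on
the observation that each input holds one complete 3-byte sequence (0xE2 0x8B 0x93, then
0xE2 0x8B 0x94) followed by the first 2 bytes of another 3-byte sequence, so with
endOfInput=false the position reported on RI is 3 after both calls, with or without reset().

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical class name; not taken from the original report.
public class Utf8SequentialDecodeCheck {

    // Same inputs as Test.java: one complete 3-byte sequence followed by
    // the first 2 bytes (0xE2 0x8B) of another, incomplete 3-byte sequence.
    static final byte[] BUF  = { -30, -117, -109, -30, -117 };
    static final byte[] BUF1 = { -30, -117, -108, -30, -117 };

    // Runs the two decode() calls from Test.java on one decoder and returns
    // the input buffer position after each call.
    static int[] positionsAfterDecodes(boolean resetBetweenCalls) {
        CharsetDecoder dec = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer cbuf = CharBuffer.allocate(10);

        ByteBuffer first = ByteBuffer.wrap(BUF);
        dec.decode(first, cbuf, false);   // endOfInput=false: more input may follow

        if (resetBetweenCalls) {
            dec.reset();                  // corresponds to uncommenting dec.reset() in Test.java
        }

        ByteBuffer second = ByteBuffer.wrap(BUF1);
        dec.decode(second, cbuf, false);

        return new int[] { first.position(), second.position() };
    }

    public static void main(String[] args) {
        for (boolean reset : new boolean[] { false, true }) {
            int[] pos = positionsAfterDecodes(reset);
            // Position 3 after each call matches the RI output in this report.
            boolean matchesRi = pos[0] == 3 && pos[1] == 3;
            System.out.println("reset=" + reset + " positions=" + pos[0] + "," + pos[1]
                    + (matchesRi ? " (matches RI)" : " (differs from RI)"));
        }
    }
}
```

Returning the positions rather than printing the buffers avoids depending on the
implementation-specific ByteBuffer.toString() format, which differs between RI and Harmony.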

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

