harmony-dev mailing list archives

From "Mikhail Markov" <mikhail.a.mar...@gmail.com>
Subject Re: [nio_char][drlvm] Strange effect on DRLVM with charsets
Date Tue, 22 May 2007 07:07:58 GMT
I've investigated the problem further and found that it is reproducible on
the IBM VM as well, provided the "-Dfile.encoding=ISO-8859-1" option is
added. The lineSeparator field in java.io.PrintStream is indeed changed, as
the following test shows:

import java.nio.charset.Charset;
import java.lang.reflect.Field;
import java.io.PrintStream;

public class Test {
    public static void main(String[] args) throws Exception {
        // Reach into PrintStream's non-public lineSeparator field via reflection.
        Field f = PrintStream.class.getDeclaredField("lineSeparator");
        f.setAccessible(true);
        System.out.println("separator[0] before encoding: "
                + ((String) f.get(System.out)).getBytes()[0]);
        // Encode a character that ISO-8859-1 cannot represent.
        Charset charset = Charset.forName("ISO-8859-1");
        charset.encode("\u3400");
        System.out.println("separator[0] after encoding: "
                + ((String) f.get(System.out)).getBytes()[0]);
    }
}

Output on J9:
separator[0] before encoding: 13
separator[0] after encoding: 26⌂

Output on DRLVM:
separator[0] before encoding: 13
separator[0] after encoding: 26→
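
The same check can be sketched without reflection, which may be useful on VMs
whose PrintStream has no field named lineSeparator. The class and method names
below (SeparatorProbe, printlnBytes) are hypothetical and not part of the
original test. The sketch also assumes that the corrupted separator would be
visible through a freshly created PrintStream and not only through System.out,
which has not been verified on the affected VMs.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper, not part of the original report.
class SeparatorProbe {
    // Returns the bytes written by an empty println(), i.e. the line
    // separator as a new PrintStream emits it.
    static byte[] printlnBytes() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(sink, true);
        ps.println();
        ps.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] before = printlnBytes();
        // Same call as in the test above: encode an unmappable character.
        Charset.forName("ISO-8859-1").encode("\u3400");
        byte[] after = printlnBytes();
        System.out.println("before: " + Arrays.toString(before));
        System.out.println("after:  " + Arrays.toString(after));
        System.out.println("unchanged: " + Arrays.equals(before, after));
    }
}
```

On a VM that is not affected, the two arrays should be identical.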

The fact that '\u3400' is encoded differently by the RI and Harmony is
described in http://issues.apache.org/jira/browse/HARMONY-3307, but the
change to lineSeparator is a new and separate problem.
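
To compare the encoding itself across VMs, a minimal sketch like the one below
prints the raw bytes that Charset.encode() produces for the unmappable
character. EncodeProbe and its encode helper are hypothetical names introduced
here for illustration. Whether the byte it prints is related to the value 26
seen in the separator above is only a guess and would need to be checked on
each VM.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original report.
class EncodeProbe {
    // Encodes text with the named charset and copies the result out of the
    // returned ByteBuffer into a plain byte array.
    static byte[] encode(String charsetName, String text) {
        ByteBuffer buf = Charset.forName(charsetName).encode(text);
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        // '\u3400' has no ISO-8859-1 mapping, so Charset.encode() substitutes
        // the charset's replacement byte(s); print whatever this VM uses.
        byte[] bytes = encode("ISO-8859-1", "\u3400");
        for (byte b : bytes) {
            System.out.println("encoded byte: " + b);
        }
    }
}
```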

Thanks,
Mikhail

On 5/15/07, Mikhail Markov <mikhail.a.markov@gmail.com> wrote:
>
> Hi!
>
> While investigating H-3307 I've found a strange effect on DRLVM. The
> following code:
> import java.nio.charset.Charset;
>
> public class Test {
>     public static void main(String[] args) {
>         System.out.println("print something...");
>         Charset charset = Charset.forName("ISO-8859-1");
>         charset.encode("\u3400");
>         System.out.println("print something again...");
>         System.out.println("and again...");
>     }
> }
>
> prints additional symbols at the end of the println() messages that
> follow the charset.encode() line:
> print something...
> print something again...→
> and again...→
>
> If I remove the charset.encode() line, the output is OK:
> print something...
> print something again...
> and again...
>
> Another strange thing is that if I remove the first println line in the
> code above, the last two println calls work OK, i.e. without any
> additional symbols.
>
> This effect is only reproducible on DRLVM. I don't quite understand what
> happens here.
>
> Any thoughts?
>
> Thanks,
> Mikhail
>
>