harmony-dev mailing list archives

From Oliver Deakin <oliver.dea...@googlemail.com>
Subject Re: [jira] Resolved: (HARMONY-6290) BufferedReader.readLine() breaks at EBCDIC newline, violating the spec
Date Wed, 05 Aug 2009 16:23:21 GMT
Hi all,

I have been looking at the charset encoders/decoders for EBCDIC (IBM1047) 
as a result of HARMONY-6290, and I noticed that the character mappings 
appear to differ slightly from those originally generated by the 
TableGenerator tool contributed as part of HARMONY-3593.

When I run the tool on my local machine using the RI, I get byte 0x15 
(EBCDIC NEL) mapped to 0x0A (Unicode LF) and byte 0x25 (EBCDIC LF) mapped 
to 0x85 (Unicode NEL). However, the Harmony tables have these values the 
other way around, i.e. byte 0x15 mapped to 0x85 and byte 0x25 mapped to 
0x0A. So it appears that our character mapping currently differs from the 
RI's. I have opened [1] for this issue and attached a patch that alters 
our mapping to match the RI.
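
For anyone who wants to check the mapping on their own runtime without 
running TableGenerator, a minimal sketch along these lines should do. The 
class and method names are made up for illustration, and it assumes the 
runtime ships an IBM1047 charset (it reports and exits if not). On the RI 
I would expect it to show 0x15 as U+000A and 0x25 as U+0085, per the 
results above.

```java
import java.nio.charset.Charset;

// Hypothetical probe class, not part of Harmony or the TableGenerator tool.
class Ibm1047NewlineProbe {

    // Decode a single byte with the named charset and return the resulting
    // code point formatted as U+XXXX.
    static String decodeByte(String charsetName, int b) {
        Charset cs = Charset.forName(charsetName);
        String decoded = new String(new byte[] { (byte) b }, cs);
        return String.format("U+%04X", decoded.codePointAt(0));
    }

    public static void main(String[] args) {
        String name = "IBM1047";
        // The charset is optional, so do not assume it is installed.
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available on this runtime");
            return;
        }
        // The two EBCDIC bytes whose mappings are in question.
        for (int b : new int[] { 0x15, 0x25 }) {
            System.out.printf("0x%02X -> %s%n", b, decodeByte(name, b));
        }
    }
}
```

Running the same class on both the RI and Harmony and comparing the two 
output lines should show whether the tables agree.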

Before I make the commit, are there any objections/comments on this?


[1] https://issues.apache.org/jira/browse/HARMONY-6294

Oliver Deakin (JIRA) wrote:
>      [ https://issues.apache.org/jira/browse/HARMONY-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> Oliver Deakin resolved HARMONY-6290.
> ------------------------------------
>        Resolution: Fixed
>     Fix Version/s: 5.0M11
>          Assignee: Oliver Deakin
> Fix and test case applied with minor change at repo revision r801230 - please check
> that it was applied as expected.
>> BufferedReader.readLine() breaks at EBCDIC newline, violating the spec
>> ----------------------------------------------------------------------
>>                 Key: HARMONY-6290
>>                 URL: https://issues.apache.org/jira/browse/HARMONY-6290
>>             Project: Harmony
>>          Issue Type: Bug
>>          Components: Classlib
>>         Environment: SVN Revision: 800827
>>            Reporter: Jesse Wilson
>>            Assignee: Oliver Deakin
>>             Fix For: 5.0M11
>>         Attachments: readLine_no_EBCDIC.patch
>>   Original Estimate: 0.33h
>>  Remaining Estimate: 0.33h
>> The spec says that BufferedReader.readLine() considers only "\r", "\n" and "\r\n"
>> to be line separators. We must not permit additional separator characters. I admit
>> that the RI's behaviour is surprising, and incompatible with its own Pattern and
>> Scanner classes. But this is the specified behaviour; the doc explicitly calls out
>> which character sequences are used as newlines. It does not permit additional
>> characters to break lines.
>> For users reading EBCDIC-encoded files, a better practice is to read through the
>> files using a Scanner. That way, the application will behave the same when executed
>> on either Harmony or on the RI.
>> #Android
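
To illustrate the behaviour described in the quoted report, here is a 
small sketch comparing the two readers on a string containing U+0085 
(NEL). The class and method names are hypothetical. It assumes a 
spec-conforming BufferedReader.readLine(), which breaks only at "\r", 
"\n" and "\r\n", whereas Scanner.nextLine() also treats U+0085 as a line 
separator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical comparison class, for illustration only.
class LineSplitComparison {

    // Split text using BufferedReader.readLine().
    static List<String> withBufferedReader(String text) {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new StringReader(text));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return lines;
    }

    // Split text using Scanner.nextLine().
    static List<String> withScanner(String text) {
        List<String> lines = new ArrayList<String>();
        Scanner scanner = new Scanner(text);
        while (scanner.hasNextLine()) {
            lines.add(scanner.nextLine());
        }
        return lines;
    }

    public static void main(String[] args) {
        // "a", NEL, "b", LF, "c"
        String text = "a\u0085b\nc";
        System.out.println("readLine: " + withBufferedReader(text).size() + " lines");
        System.out.println("Scanner:  " + withScanner(text).size() + " lines");
    }
}
```

With a conforming readLine() the first count is 2 (the NEL stays inside 
the first line), while Scanner reports 3.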

Oliver Deakin
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
