poi-dev mailing list archives

From "Andrew C. Oliver" <acoli...@apache.org>
Subject Re: sheet names and string format read garbled on EBCDIC machine
Date Thu, 03 Apr 2003 18:30:07 GMT
Quite possibly.  Good point.  Perhaps you can work with some of the 
Japanese folks on the list in order to create appropriate patches/unit 
tests.

Remember, it's not only the right encoding that's at work, but also what 
Excel will accept.
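
A minimal sketch of the explicit-charset approach discussed in this thread. The names SheetNameDecoder and decode are hypothetical and not part of POI. It assumes that the name length counts characters, that a set low bit in the option flag means 16-bit little-endian characters, and that the 8-bit form maps to ISO-8859-1 as suggested in the quoted reply below. What Excel actually accepts would still have to be confirmed with unit tests.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not POI's BoundSheetRecord. */
final class SheetNameDecoder {

    /**
     * Decodes nameLength characters starting at data[start].
     * Assumption: bit 0 of optionFlags set means two bytes per character
     * (little-endian); bit 0 clear means one byte per character.
     */
    static String decode(byte[] data, int start, int nameLength, int optionFlags) {
        if ((optionFlags & 0x01) == 1) {
            // 16-bit form: each character occupies two bytes.
            return new String(data, start, nameLength * 2, StandardCharsets.UTF_16LE);
        }
        // 8-bit form: name the charset explicitly so the result does not
        // depend on the platform default (EBCDIC on some mainframes).
        return new String(data, start, nameLength, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] eightBit = {0x53, 0x68, 0x65, 0x65, 0x74, 0x31}; // ASCII bytes for "Sheet1"
        System.out.println(decode(eightBit, 0, 6, 0));

        // A two-character Japanese name stored in the 16-bit form.
        byte[] sixteenBit = "\u65E5\u672C".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(decode(sixteenBit, 0, 2, 1).length());
    }
}
```

Using a Charset constant instead of a charset name also avoids the checked UnsupportedEncodingException that the String(byte[], int, int, String) constructor declares.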

Elvira Gurevich wrote:

>Sure, I'll do that.
>But wouldn't it be a problem if the original Excel file was created on
>a Japanese version of Windows, with a Japanese worksheet name?
>
>-----Original Message-----
>From: Andrew C. Oliver [mailto:acoliver@apache.org] 
>Sent: Wednesday, April 02, 2003 5:21 PM
>To: POI Users List; POI Developers List
>Subject: Re: sheet names and string format read garbled on EBCDIC
>machine
>
>No, it should universally be ISO-8859-1 (Latin1).
>
>Submit a patch and make the unit tests pass (as well as any relevant 
>unit tests).
>
>-Andy
>
>Elvira Gurevich wrote:
>
>>In org.apache.poi.hssf.record.BoundSheetRecord, around line 143,
>>the original code was:
>>
>>       if ( ( field_4_compressed_unicode_flag & 0x01 ) == 1 )
>>       {
>>           field_5_sheetname = StringUtil.getFromUnicodeHigh( data, 8 + offset, nameLength );
>>       }
>>       else
>>       {
>>           field_5_sheetname = new String( data, 8 + offset, nameLength );
>>       }
>>
>>As you can see, if the flag is not on, the String is constructed using
>>the native machine encoding. I am not familiar with the record structure
>>or with which versions of Excel will have this flag on. In any case, in
>>my scenario, the Excel file was created on an ASCII machine and POI was
>>run on an EBCDIC machine, so the string was decoded as native EBCDIC,
>>which of course resulted in garbage. Giving the String constructor an
>>encoding fixed this particular scenario, as follows:
>>
>>
>>       if ( ( field_4_compressed_unicode_flag & 0x01 ) == 1 )
>>       {
>>           field_5_sheetname = StringUtil.getFromUnicodeHigh( data, 8 + offset, nameLength );
>>       }
>>       else
>>       {
>>           try
>>           {
>>               field_5_sheetname = new String( data, 8 + offset, nameLength, "UTF-8" );
>>           }
>>           catch ( java.io.UnsupportedEncodingException e )
>>           {
>>               throw new RecordFormatException( "Unsupported Encoding UTF-8" );
>>           }
>>       }
>>
>>If you tell me that UTF-8 is not the right encoding to use, I agree. Is
>>there a universal encoding to plug into this constructor for this case?
>>Probably not. The solution to me would be setting an encoding into a
>>workbook through a new HSSFWorkbook(String encoding) constructor (or a
>>setEncoding() method) which would make this field available to a lower
>>level class, like org.apache.poi.hssf.record.BoundSheetRecord.
>>
>>For that matter, org.apache.poi.hssf.record.FormatRecord has the same
>>problem around line 133.
>>
>>
>>Elvira.
>>
>>
>>
>>-----Original Message-----
>>From: Elvira Gurevich [mailto:Elvira_Gurevich@ibi.com] 
>>Sent: Monday, March 24, 2003 10:12 AM
>>To: 'POI Users List'
>>Subject: RE: sheet names and string format read garbled on EBCDIC
>>machine
>>
>>Apparently Excel 2000 uses Unicode internally for all strings. All the
>>cell content strings are read correctly. I converted the sheet name
>>string that came from the wb.getSheetName(sheet) call to bytes (as in
>>sheetName.getBytes()) and traced that. The byte codes correspond to
>>ASCII characters, which tells me that whoever reads the string reads it
>>in the default machine encoding, but the string is already in Unicode.
>>
>>-----Original Message-----
>>From: Joshua Davis [mailto:joshua.davis@kiodex.com] 
>>Sent: Thursday, March 20, 2003 6:40 AM
>>To: 'POI Users List'
>>Subject: RE: sheet names and string format read garbled on EBCDIC
>>machine
>>
>>Elvira,
>>
>>Wulf is right, as this is an odd use case. I'm guessing you are using
>>Java on a mainframe, and hence your need for EBCDIC support.  Maybe you
>>could write an EBCDIC->ASCII stream filter and contribute it?
>>
>>BTW, I used to work for IBI... My group had to write an EBCDIC->ASCII
>>filter so that we could make use of some third party libraries.
>>Mainframes are a giant PITA.
>>
>>-----Original Message-----
>>From: Wulf Wechsung [mailto:ww@contexo.de] 
>>Sent: Thursday, March 20, 2003 6:29 AM
>>To: POI Users List
>>Subject: RE: sheet names and string format read garbled on EBCDIC
>>machine
>>
>>
>>
>>What he is trying to say, I think, is this: support is what you are
>>*not* paying for, hence it comes down to how little or much people
>>around here are willing and able to provide. C'mon, you have Java
>>developers (or even are one), I am sure; just fix it yourself. POI
>>should have saved you enough time to do it.
>>
>>-----Original Message-----
>>From: Elvira Gurevich [mailto:Elvira_Gurevich@ibi.com]
>>Sent: Wednesday, March 19, 2003 11:25 PM
>>To: 'POI Users List'
>>Subject: RE: sheet names and string format read garbled on EBCDIC
>>machine
>>
>>
>>This is not a murmur. This is a specific problem and I tried to provide
>>as much info on it as I could. If you need more info, I would be happy
>>to get it to you. It just seems that nobody even looked at the problem
>>so far. It could be a simple fix for someone familiar with the code...
>>
>>If you really need a test setup and will use it, please provide me with
>>more details. I am not a decision maker, but will have to present this
>>to people.
>>
>>Thanks,
>>Elvira.
>>
>>
>>-----Original Message-----
>>From: Andrew C. Oliver [mailto:acoliver@apache.org] 
>>Sent: Wednesday, March 19, 2003 1:06 PM
>>To: POI Users List
>>Subject: Re: sheet names and string format read garbled on EBCDIC
>>machine
>>
>>Elvira_Gurevich@iwaysoftware.com wrote:
>>
>>>Hello,
>>>
>>>On 3/7/03, I submitted bug #17791.
>>>To that bug report, I attached the relevant Excel files and BiffViewer
>>>traces.
>>>
>>>We re-ran the test with jakarta-poi-1.11.0-dev-20030317.jar, with no
>>>changes in results.
>>>
>>>We can provide an EBCDIC system for your testing, if that's the reason
>>>for the problem to have been neglected so far.
>>>
>>That would help if you're willing to set up an automated process to run
>>our unit tests and send back info to our mailing list.  I've asked for 
>>this for a while and I've only heard murmurs of volunteers and no 
>>takers.  We could really use automated testing on systems other than 
>>Windows and Linux.  I know a number of people who use POI on Solaris 
>>without trouble, but I've heard murmurs about problems on various 
>>mainframes and minis.
>>
>>>I really need this resolved.
>>>
>>The best way to get things done urgently and for "free" is to provide 
>>patches which resolve the issue.   Since the project is run on a 
>>volunteer basis, things get fixed "when we have time". 
>>
>>Currently, when not trekking around the country, I myself have been 
>>working on a number of client-funded POI projects and haven't had time 
>>to work on much else. 
>>
>>Thanks,
>>
>>-Andy
>>
>>>Thank you.
>>>Elvira Gurevich
>>>iWay Software
>>>
>>>
>>>
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
>>>For additional commands, e-mail: poi-user-help@jakarta.apache.org
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: poi-dev-help@jakarta.apache.org