Message-ID: <477E84FEC58EC44BB5AA084A18BB11B202B27A01@snowball.asc.com.au>
From: "Height, Jason"
To: 'POI Developers List'
Date: Tue, 14 Oct 2003 16:59:41 +0930
Subject: RE: cvs commit: jakarta-poi/src/java/org/apache/poi/hssf/record SSTDeserializer.java SSTRecord.java
Mailing-List: contact poi-dev-help@jakarta.apache.org; run by ezmlm
List-Id: "POI Developers List"
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="--=_NextPart_ST_16_59_42_Tuesday_October_14_2003_17078"

----=_NextPart_ST_16_59_42_Tuesday_October_14_2003_17078
Content-Type: text/plain

This patch is not quite there yet. It will be shortly.
I accidentally posted before running the test cases, and
TestSSTRecord.testHugeStrings is failing.

Jason

-----Original Message-----
From: jheight@apache.org [mailto:jheight@apache.org]
Sent: Tuesday, 14 October 2003 4:48 PM
To: jakarta-poi-cvs@apache.org
Subject: cvs commit: jakarta-poi/src/java/org/apache/poi/hssf/record SSTDeserializer.java SSTRecord.java

jheight     2003/10/14 00:18:17

Modified:    src/java/org/apache/poi/hssf/record Tag: REL_2_BRANCH
             SSTDeserializer.java SSTRecord.java
Log:
Patch to fix bugs 15556 and 22742. Double byte handling of SSTDeserializer now works.

Revision  Changes    Path
No                   revision
No                   revision
1.5.2.1   +48 -33    jakarta-poi/src/java/org/apache/poi/hssf/record/SSTDeserializer.java

Index: SSTDeserializer.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hssf/record/SSTDeserializer.java,v
retrieving revision 1.5
retrieving revision 1.5.2.1
diff -u -r1.5 -r1.5.2.1
--- SSTDeserializer.java 30 Apr 2003 04:38:48 -0000 1.5
+++ SSTDeserializer.java 14 Oct 2003 07:18:17 -0000 1.5.2.1
@@ -62,13 +62,14 @@
   * Handles the task of deserializing a SST string. The two main entry points are
   *
   * @author Glen Stampoultzis (glens at apache.org)
+  * @author Jason Height (jheight at apache.org)
   */
 class SSTDeserializer
 {

     private BinaryTree strings;
-    /** this is the number of characters we expect in the first sub-record in a subsequent continuation record */
-    private int continuationExpectedChars;
+    /** this is the number of characters that have been read prior to the continuation */
+    private int continuationReadChars;
     /** this is the string we were working on before hitting the end of the current record. This string is NOT finished. */
     private String unfinishedString;
     /** this is true if the string uses wide characters */
@@ -82,6 +83,7 @@
     /** Number of characters in current string */
     private int charCount;
     private int extensionLength;
+    private int continueSkipBytes = 0;


     public SSTDeserializer( BinaryTree strings )
@@ -93,13 +95,14 @@
     private void initVars()
     {
         runCount = 0;
-        continuationExpectedChars = 0;
+        continuationReadChars = 0;
         unfinishedString = "";
 //        bytesInCurrentSegment = 0;
 //        stringDataOffset = 0;
         wideChar = false;
         richText = false;
         extendedText = false;
+        continueSkipBytes = 0;
     }

     /**
@@ -107,14 +110,15 @@
      * strings may span across multiple continuations. Read the SST record
      * carefully before beginning to hack.
      */
-    public void manufactureStrings( final byte[] data, final int initialOffset, short dataSize )
+    public void manufactureStrings( final byte[] data, final int initialOffset)
     {
         initVars();

         int offset = initialOffset;
-        while ( ( offset - initialOffset ) < dataSize )
+        final int dataSize = data.length;
+        while ( offset < dataSize )
         {
-            int remaining = dataSize - offset + initialOffset;
+            int remaining = dataSize - offset;

             if ( ( remaining > 0 ) && ( remaining < LittleEndianConsts.SHORT_SIZE ) )
             {
@@ -122,26 +126,31 @@
             }
             if ( remaining == LittleEndianConsts.SHORT_SIZE )
             {
-                setContinuationExpectedChars( LittleEndian.getUShort( data, offset ) );
+                //JMH Dont know about this
+                setContinuationCharsRead( 0 );//LittleEndian.getUShort( data, offset ) );
                 unfinishedString = "";
                 break;
             }
             charCount = LittleEndian.getUShort( data, offset );
+            int charsRead = charCount;
             readStringHeader( data, offset );
             boolean stringContinuesOverContinuation = remaining < totalStringSize();
             if ( stringContinuesOverContinuation )
             {
-                int remainingBytes = ( initialOffset + dataSize ) - offset - stringHeaderOverhead();
-                setContinuationExpectedChars( charCount - calculateCharCount( remainingBytes ) );
-                charCount -= getContinuationExpectedChars();
+                int remainingBytes = dataSize - offset - stringHeaderOverhead();
+                //Only read the size of the string or whatever is left before the
+                //continuation
+                charsRead = Math.min(charsRead, calculateCharCount( remainingBytes ));
+                setContinuationCharsRead( charsRead );
+                if (charsRead == charCount) {
+                    //Since all of the characters will have been read, but the entire string (including formatting runs etc)
+                    //hasnt, Compute the number of bytes to skip when the continue record starts
+                    continueSkipBytes = offsetForContinuedRecord(0) - (remainingBytes - calculateByteCount(charsRead));
+                }
             }
-            else
-            {
-                setContinuationExpectedChars( 0 );
-            }
-            processString( data, offset, charCount );
+            processString( data, offset, charsRead );
             offset += totalStringSize();
-            if ( getContinuationExpectedChars() != 0 )
+            if ( stringContinuesOverContinuation )
             {
                 break;
             }
@@ -222,6 +231,7 @@
         UnicodeString string = new UnicodeString( UnicodeString.sid,
                 (short) unicodeStringBuffer.length,
                 unicodeStringBuffer );
+        setContinuationCharsRead( calculateCharCount(bytesRead));

         if ( isStringFinished() )
         {
@@ -238,7 +248,7 @@

     private boolean isStringFinished()
     {
-        return getContinuationExpectedChars() == 0;
+        return getContinuationCharsRead() == charCount;
     }

     /**
@@ -301,8 +311,9 @@
     {
         if ( isStringFinished() )
         {
+            final int offset = continueSkipBytes;
             initVars();
-            manufactureStrings( record, 0, (short) record.length );
+            manufactureStrings( record, offset);
         }
         else
         {
@@ -330,13 +341,12 @@
      */
     private void readStringRemainder( final byte[] record )
     {
-        int stringRemainderSizeInBytes = calculateByteCount( getContinuationExpectedChars() );
-//        stringDataOffset = LittleEndianConsts.BYTE_SIZE;
+        int stringRemainderSizeInBytes = calculateByteCount( charCount-getContinuationCharsRead() );
         byte[] unicodeStringData = new byte[SSTRecord.STRING_MINIMAL_OVERHEAD
-                + calculateByteCount( getContinuationExpectedChars() )];
+                + stringRemainderSizeInBytes];

         // write the string length
-        LittleEndian.putShort( unicodeStringData, 0, (short) getContinuationExpectedChars() );
+        LittleEndian.putShort( unicodeStringData, 0, (short) (charCount-getContinuationCharsRead()) );

         // write the options flag
         unicodeStringData[LittleEndianConsts.SHORT_SIZE] = createOptionByte( wideChar, richText, extendedText );
@@ -345,7 +355,7 @@
         // past all the overhead of the str_data array
         arraycopy( record, LittleEndianConsts.BYTE_SIZE, unicodeStringData,
                 SSTRecord.STRING_MINIMAL_OVERHEAD,
-                unicodeStringData.length - SSTRecord.STRING_MINIMAL_OVERHEAD );
+                stringRemainderSizeInBytes );

         // use special constructor to create the final string
         UnicodeString string = new UnicodeString( UnicodeString.sid,
@@ -356,7 +366,7 @@
         addToStringTable( strings, integer, string );

         int newOffset = offsetForContinuedRecord( stringRemainderSizeInBytes );
-        manufactureStrings( record, newOffset, (short) ( record.length - newOffset ) );
+        manufactureStrings( record, newOffset);
     }

     /**
@@ -388,8 +398,12 @@

     private int offsetForContinuedRecord( int stringRemainderSizeInBytes )
     {
-        return stringRemainderSizeInBytes + LittleEndianConsts.BYTE_SIZE
-                + runCount * LittleEndianConsts.INT_SIZE + extensionLength;
+        int offset = stringRemainderSizeInBytes + runCount * LittleEndianConsts.INT_SIZE + extensionLength;
+        if (stringRemainderSizeInBytes != 0)
+          //If a portion of the string remains then the wideChar options byte is repeated,
+          //so need to skip this.
+          offset += + LittleEndianConsts.BYTE_SIZE;
+        return offset;
     }

     private byte createOptionByte( boolean wideChar, boolean richText, boolean farEast )
@@ -409,17 +423,18 @@
         int dataLengthInBytes = record.length - LittleEndianConsts.BYTE_SIZE;
         byte[] unicodeStringData = new byte[record.length + LittleEndianConsts.SHORT_SIZE];

-        LittleEndian.putShort( unicodeStringData, (byte) 0, (short) calculateCharCount( dataLengthInBytes ) );
+        int charsRead = calculateCharCount( dataLengthInBytes );
+        LittleEndian.putShort( unicodeStringData, (byte) 0, (short) charsRead );
         arraycopy( record, 0, unicodeStringData, LittleEndianConsts.SHORT_SIZE, record.length );
         UnicodeString ucs = new UnicodeString( UnicodeString.sid, (short) unicodeStringData.length, unicodeStringData );

         unfinishedString = unfinishedString + ucs.getString();
-        setContinuationExpectedChars( getContinuationExpectedChars() - calculateCharCount( dataLengthInBytes ) );
+        setContinuationCharsRead( charsRead );
     }

     private boolean stringSpansContinuation( int continuationSizeInBytes )
     {
-        return calculateByteCount( getContinuationExpectedChars() ) > continuationSizeInBytes;
+        return calculateByteCount( charCount - getContinuationCharsRead() ) > continuationSizeInBytes;
     }

     /**
@@ -427,14 +442,14 @@
      * sub-record in a subsequent continuation record
      */

-    int getContinuationExpectedChars()
+    int getContinuationCharsRead()
     {
-        return continuationExpectedChars;
+        return continuationReadChars;
     }

-    private void setContinuationExpectedChars( final int count )
+    private void setContinuationCharsRead( final int count )
     {
-        continuationExpectedChars = count;
+        continuationReadChars = count;
     }

     private int calculateByteCount( final int character_count )

1.7.2.4   +1 -1      jakarta-poi/src/java/org/apache/poi/hssf/record/SSTRecord.java

Index: SSTRecord.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hssf/record/SSTRecord.java,v
retrieving revision 1.7.2.3
retrieving revision 1.7.2.4
diff -u -r1.7.2.3 -r1.7.2.4
--- SSTRecord.java 25 Sep 2003 08:08:05 -0000 1.7.2.3
+++ SSTRecord.java 14 Oct 2003 07:18:17 -0000 1.7.2.4
@@ -482,7 +482,7 @@
         field_2_num_unique_strings = LittleEndian.getInt( data, 4 + offset );
         field_3_strings = new BinaryTree();
         deserializer = new SSTDeserializer(field_3_strings);
-        deserializer.manufactureStrings( data, 8 + offset, (short)(size - 8) );
+        deserializer.manufactureStrings( data, 8 + offset);
     }



----=_NextPart_ST_16_59_42_Tuesday_October_14_2003_17078--