From: "Dmitriy Smirnov (JIRA)"
To: issues@commons.apache.org
Reply-To: issues@commons.apache.org
Date: Sun, 24 Jul 2011 00:01:09 +0000 (UTC)
Message-ID: <2068508175.1551.1311465669833.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Created] (COMPRESS-146) BZip2CompressorInputStream always treats 0x177245385090 as EOF, but should treat this as EOS

BZip2CompressorInputStream always treats 0x177245385090 as EOF, but should treat this as EOS
--------------------------------------------------------------------------------------------

                 Key: COMPRESS-146
                 URL: https://issues.apache.org/jira/browse/COMPRESS-146
             Project: Commons Compress
          Issue Type: Bug
          Components: Compressors
         Environment: all
            Reporter: Dmitriy Smirnov
            Priority: Critical


BZip2CompressorInputStream always treats 0x177245385090 as EOF (end of file), but it should treat it as EOS (end of stream).

The error shows up mostly with large files, as a sudden EOF somewhere in the middle of the file.

An example of data from an archived file:

$ cat fastq.ax.bz2 | od -t x1 | grep -A 1 '17 72 45'
22711660 d0 ff b6 01 20 10 ff ff 17 72 45 38 50 90 2e ff
22711700 b2 d3 42 5a 68 39 31 41 59 26 53 59 84 3c 41 75
--
24637020 c5 49 ff 19 80 49 20 7f ff 17 72 45 38 50 90 a4
24637040 a8 ac bd 42 5a 68 39 31 41 59 26 53 59 0d 9a b4
--
40302720 ff b1 24 80 10 ff ff 17 72 45 38 50 90 24 cb c5
40302740 90 42 5a 68 39 31 41 59 26 53 59 42 05 ae 5e 05
.....

In each match the marker 17 72 45 38 50 90 is followed by four CRC bytes and then by a new stream header (42 5a 68 39, "BZh9") and a block header (31 41 59 26 53 59, "1AY&SY"), so the data continues after the marker.

Suggested solution:

private void initBlock() throws IOException {
    char magic0 = bsGetUByte();
    char magic1 = bsGetUByte();
    char magic2 = bsGetUByte();
    char magic3 = bsGetUByte();
    char magic4 = bsGetUByte();
    char magic5 = bsGetUByte();

    // End-of-stream marker 0x177245385090
    if (magic0 == 0x17 && magic1 == 0x72 && magic2 == 0x45
        && magic3 == 0x38 && magic4 == 0x50 && magic5 == 0x90) {
        if (complete()) { // end of file
            return;
        } else {
            // Another stream follows: read its first block header
            magic0 = bsGetUByte();
            magic1 = bsGetUByte();
            magic2 = bsGetUByte();
            magic3 = bsGetUByte();
            magic4 = bsGetUByte();
            magic5 = bsGetUByte();
        }
    }

    if (magic0 != 0x31 || // '1'
        magic1 != 0x41 || // 'A'
        magic2 != 0x59 || // 'Y'
        magic3 != 0x26 || // '&'
        magic4 != 0x53 || // 'S'
        magic5 != 0x59    // 'Y'
        ) {
        this.currentState = EOF;
        throw new IOException("bad block header");
    } else {
        this.storedBlockCRC = bsGetInt();
        this.blockRandomised = bsR(1) == 1;

        /**
         * Allocate data here instead in constructor, so we do not allocate
         * it if the input file is empty.
         */
        if (this.data == null) {
            this.data = new Data(this.blockSize100k);
        }

        // currBlockNo++;
        getAndMoveToFrontDecode();

        this.crc.initialiseCRC();
        this.currentState = START_BLOCK_STATE;
    }
}

// Returns true at the real end of the file, false if another stream follows.
private boolean complete() throws IOException {
    boolean result = false;
    this.storedCombinedCRC = bsGetInt();

    try {
        if (in.available() == 0) {
            throw new IOException("EOF");
        }
        // Try to read the header of the next stream
        checkMagicChar('B', "first");
        checkMagicChar('Z', "second");
        checkMagicChar('h', "third");

        int blockSize = this.in.read();
        if ((blockSize < '1') || (blockSize > '9')) {
            throw new IOException("Stream is not BZip2 formatted: illegal "
                                  + "blocksize " + (char) blockSize);
        }

        this.blockSize100k = blockSize - '0';
        this.bsLive = 0;
        this.bsBuff = 0;
    } catch (IOException e) {
        // No valid stream header follows: treat this as end of file
        this.currentState = EOF;
        result = true;
    }

    this.data = null;

    if (this.storedCombinedCRC != this.computedCombinedCRC) {
        throw new IOException("BZip2 CRC error");
    }
    this.computedCombinedCRC = 0;

    return result;
}

-- 
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
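
To check whether a given file is affected, the grep above can be approximated in Java. This is only a rough sketch and not part of Commons Compress: the class name Bzip2BoundaryScan and its method are hypothetical. It assumes the marker is byte-aligned, as it is in the three matches shown in the dump; bzip2 data is bit-oriented, so a marker at any other bit offset will be missed. It reports every position where the marker is followed by four CRC bytes and a new "BZh1".."BZh9" header.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper; not part of Commons Compress. */
final class Bzip2BoundaryScan {
    // End-of-stream marker from the report: 0x177245385090.
    private static final int[] EOS_MAGIC = {0x17, 0x72, 0x45, 0x38, 0x50, 0x90};

    /**
     * Returns offsets of byte-aligned end-of-stream markers that are followed
     * by a 4-byte combined CRC and a new stream header ("BZh" plus '1'..'9').
     */
    static List<Integer> findStreamBoundaries(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        // Need marker (6 bytes) + CRC (4 bytes) + "BZh" and level digit (4 bytes).
        for (int i = 0; i + 14 <= data.length; i++) {
            if (!matchesMagic(data, i)) {
                continue;
            }
            int h = i + 10; // first byte after marker and CRC
            int level = data[h + 3] & 0xff;
            if (data[h] == 'B' && data[h + 1] == 'Z' && data[h + 2] == 'h'
                    && level >= '1' && level <= '9') {
                offsets.add(i);
            }
        }
        return offsets;
    }

    private static boolean matchesMagic(byte[] data, int pos) {
        for (int k = 0; k < EOS_MAGIC.length; k++) {
            if ((data[pos + k] & 0xff) != EOS_MAGIC[k]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The first two rows of the dump in the report (od offsets 22711660 and 22711700).
        int[] rows = {
            0xd0, 0xff, 0xb6, 0x01, 0x20, 0x10, 0xff, 0xff,
            0x17, 0x72, 0x45, 0x38, 0x50, 0x90, 0x2e, 0xff,
            0xb2, 0xd3, 0x42, 0x5a, 0x68, 0x39, 0x31, 0x41,
            0x59, 0x26, 0x53, 0x59, 0x84, 0x3c, 0x41, 0x75
        };
        byte[] data = new byte[rows.length];
        for (int i = 0; i < rows.length; i++) {
            data[i] = (byte) rows[i];
        }
        System.out.println(findStreamBoundaries(data)); // prints [8]
    }
}
```

A non-empty result means the file holds more than one bzip2 stream, which is the case where reading stops early. An empty result proves nothing, since the scan only sees byte-aligned markers.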