From: Stefan Bodewig
To: dev@commons.apache.org
Subject: [compress] Deflater#getBytesRead and friends
Date: Tue, 02 Aug 2011 15:14:19 +0200
Message-ID: <87fwlk9cr8.fsf@v35516.1blu.de>

Hi,

one of the main drivers for switching to Java 5 was that the methods you use to determine the compressed and uncompressed sizes of data for ZIP archives returned ints in Java 1.4, and new methods have been added that return longs.

I've committed an ignored unit test inside the ZIP package (DeflaterInflaterTest) that shows that those methods really only return unsigned ints and not longs for any JDK < 7 that I have tested so far. The tests compress 4 GByte + 4 KByte of data and the methods return 4 KByte when asked how many bytes they have seen.

ZipArchiveOutputStream has already been changed to count the bytes itself and not rely on Deflater, but only for the uncompressed size. I'm afraid the same "unsigned int" behavior applies to the methods returning the compressed sizes as well, but I don't have the patience to wait for Deflater to eat up enough random data so that the compressed result finally exceeds 4 GByte (the existing test case already takes four minutes on my personal notebook and less than two at work; time to invest, maybe).

ZipArchiveOutputStream can intercept the stream and simply count how many bytes have been written to determine the compressed size, but ZipArchiveInputStream is a different beast.

It may be a useful heuristic to assume that the result is correct modulo 2^32. ZipArchiveInputStream knows how many bytes it has read, but it might have read more than it needed to and has to push back the excess bytes when decompressing a file. It knows the compressed size must lie between the number of bytes read before the last read operation and the total number of bytes read. Since a single read covers far less than 4 GByte, at most one value in that range matches the reported remainder, so the missing offset (a multiple of 4 GByte) could be determined.
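To make the modulo 2^32 idea concrete, here is a minimal sketch of the arithmetic. It is not code from Commons Compress: the class name SizeRecovery, the method recover and its parameters are made up for illustration, and it assumes both bounds are inclusive and less than 4 GByte apart.

```java
// Hypothetical helper, not part of Commons Compress or the JDK.
final class SizeRecovery {
    private static final long TWO_POW_32 = 1L << 32;

    private SizeRecovery() { }

    /**
     * Finds the smallest value >= lowerBound that is congruent to
     * reported modulo 2^32 and checks that it does not exceed upperBound.
     *
     * reported   - the count as returned by Inflater/Deflater, assumed to be
     *              correct modulo 2^32
     * lowerBound - bytes read before the last read operation (assumed inclusive)
     * upperBound - total bytes read so far (assumed inclusive)
     */
    static long recover(long reported, long lowerBound, long upperBound) {
        // Keep only the low 32 bits; that is all we assume to be trustworthy.
        long low32 = reported & 0xFFFFFFFFL;
        // Number of 4 GByte wraps needed to reach lowerBound (ceiling division).
        long wraps = lowerBound <= low32
            ? 0
            : (lowerBound - low32 + TWO_POW_32 - 1) / TWO_POW_32;
        long candidate = low32 + wraps * TWO_POW_32;
        if (candidate > upperBound) {
            // The assumption "correct modulo 2^32" does not hold for this input.
            throw new IllegalStateException("no size in [" + lowerBound + ", "
                + upperBound + "] matches " + low32 + " modulo 2^32");
        }
        return candidate;
    }

    public static void main(String[] args) {
        // The counter reports 4 KByte, but between 4294971000 and 4294971512
        // bytes have been read, so the real size is 4 GByte + 4 KByte.
        System.out.println(recover(4096L, 4294971000L, 4294971512L)); // prints 4294971392
    }
}
```

The sketch fails loudly when no value in the range fits, which is the case a JVM with a differently broken counter would hit.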

For this heuristic to work we'd need to be sure the value returned by Inflater is either correct or correct modulo 2^32 and I'd ask anybody with a more exotic Java impl than I have used (OpenJDK on Linux, Sun/Oracle versions of Java5/6/7 on Win7) to remove the @Ignore from the test case and run it. It should either pass or return something like

Failed tests:
  deflaterBytesRead(org.apache.commons.compress.archivers.zip.DeflaterInflaterTest): expected:<4294971392> but was:<4096>
  inflaterBytesWritten(org.apache.commons.compress.archivers.zip.DeflaterInflaterTest): expected:<4294971392> but was:<4096>

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org