commons-issues mailing list archives

From Julien Aymé (JIRA) <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-248) Naive OOM when deal with a corrupt .gz file
Date Tue, 10 Dec 2013 07:29:07 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844036#comment-13844036 ]

Julien Aymé commented on COMPRESS-248:
--------------------------------------

Hello Jing, thanks for the report.

May I ask what is thrown when you try to extract the third file using java.util.zip.GZIPInputStream?
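{code}
// Requires java.io.File, java.io.FileInputStream, java.io.InputStream and java.util.zip.GZIPInputStream
File corruptFile = ...; // The corrupt gz file
long total = 0;
// try-with-resources (Java 7+) closes the stream even if read() throws
try (InputStream in = new GZIPInputStream(new FileInputStream(corruptFile))) {
    byte[] buf = new byte[4096];
    int read;
    while ((read = in.read(buf)) != -1) {
        total += read; // count the decompressed bytes read before any failure
    }
}
{code}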

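If the same OOME is thrown, the problem lies in the JDK's java.util.zip.Inflater (the native call at the top of your stack trace) rather than in Commons Compress, so I suggest opening a bug at Oracle (http://bugs.sun.com/bugdatabase/).
If the corrupt file is not sensitive and not too big, could you attach it to the issue?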
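
For a comparison that does not depend on the original file, here is a minimal sketch of the same read loop wrapped in a small probe. GzipProbe, probe and gzip are hypothetical names made up for this example, not part of the JDK or Commons Compress. The truncated input is only a stand-in for "some corrupt gz data"; it is not expected to reproduce the OOME, which may depend on the exact bytes of the reported file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical diagnostic helper, for illustration only.
final class GzipProbe {

    // Drains the stream through GZIPInputStream and describes the outcome as text.
    static String probe(InputStream raw) {
        long total = 0;
        try (InputStream src = raw; InputStream in = new GZIPInputStream(src)) {
            byte[] buf = new byte[4096];
            int read;
            while ((read = in.read(buf)) != -1) {
                total += read;
            }
            return "OK: " + total + " bytes";
        } catch (IOException e) {
            // The usual result for corrupt or truncated input.
            return "IOException: " + e.getClass().getName();
        } catch (OutOfMemoryError e) {
            // Caught only so the probe can report it; not a general recovery strategy.
            return "OutOfMemoryError after " + total + " bytes";
        }
    }

    // Compresses data in memory so the example needs no files on disk.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory stream
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = new byte[10000];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) (i * 31 % 251);
        }
        byte[] good = gzip(payload);
        // Simulated corruption: drop the second half of the compressed bytes.
        byte[] truncated = Arrays.copyOf(good, good.length / 2);

        System.out.println(probe(new ByteArrayInputStream(good)));
        System.out.println(probe(new ByteArrayInputStream(truncated)));
    }
}
```

To run it against the real file, pass a FileInputStream for the corrupt .gz to probe and include the resulting line in the issue.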

Thanks in advance,
regards,
Julien

> Naive OOM when deal with a corrupt .gz file
> -------------------------------------------
>
>                 Key: COMPRESS-248
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-248
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.6
>         Environment: Fedora 19 x86_64, 8G RAM, Java version "1.7.0_45"
> OpenJDK Runtime Environment (fedora-2.4.3.0.fc19-x86_64 u45-b15)
> OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
>            Reporter: Jing Li
>
> I tried to extract three gz files, all of which are corrupt. The first two throw IOExceptions:
> Caused by: java.io.IOException: Gzip-compressed data is corrupt
> 	at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:253)
> 	at java.io.InputStream.read(InputStream.java:82)
> ...
> But the third one throws an OutOfMemoryError, as shown below:
> java.lang.OutOfMemoryError
> 	at java.util.zip.Inflater.inflateBytes(Native Method)
> 	at java.util.zip.Inflater.inflate(Inflater.java:238)
> 	at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:251)
> 	at java.io.InputStream.read(InputStream.java:82)
> The third file is corrupt, but Linux recognizes it as a compressed gz file.
> More info:
> [jing@localhost logs]$ file stdout.log.txt.gz 
> stdout.log.txt.gz: gzip compressed data, was "stdout.log_backup", from Unix, last modified: Tue Nov 19 22:53:19 2013
> [jing@localhost logs]$ tar -xvzf stdout.log.txt.gz
> gzip: stdin: invalid compressed data--format violated
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
