Date: Fri, 15 Jan 2016 07:35:39 +0000 (UTC)
From: "Dawid Weiss (JIRA)"
To: issues@commons.apache.org
Subject: [jira] [Commented] (COMPRESS-291) decompress .7z archive very very slow

[ https://issues.apache.org/jira/browse/COMPRESS-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101375#comment-15101375 ]

Dawid Weiss commented on COMPRESS-291:
--------------------------------------

No problem at all, Stefan. I dug into the code, it's actually a lot better at explaining what's going on in the format than the "official" specification is ({{7zFormat.txt}})...

bq.
then "almost random access" to single entries should be possible

Yes, you'd basically have to decode "a bit more" if the required encoded file is nested somewhere inside a compressed block. This is not an uncommon thing -- "solid" archives in RAR have this property too. The gain is for lots of small (or very similar) files when the compression dictionary of the encoder is reused for multiple files.

Like I said, I'll try to fix it for our own purposes -- I'll provide a patch if I get it working.

> decompress .7z archive very very slow
> -------------------------------------
>
>                 Key: COMPRESS-291
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-291
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Compressors
>    Affects Versions: 1.9
>         Environment: Windows 7 x64, jdk1.7.0_21 x64
>            Reporter: Robert Jansen
>            Priority: Minor
>
> I have 7z archives with one large image and many small files. The following code decompresses to a directory and returns the largest file. It is glacially slow and not usable for GB size files:
>
> public File unSevenZipToDir(File sevenZipFile, File outputDir) {
>     File imgFile = null;
>     // Make sure output dir exists
>     outputDir.mkdirs();
>     if (outputDir.exists()) {
>         //FileInputStream stream;
>         try {
>             FileOutputStream output = null;
>             SevenZFile f7z = new SevenZFile(sevenZipFile);
>             SevenZArchiveEntry entry;
>             long maxSize = 0;
>             while ((entry = f7z.getNextEntry()) != null) {
>                 if (entry != null) {
>                     String s = entry.getName();
>                     if (s != null) {
>                         long sz = entry.getSize();
>                         if (sz > 0) {
>                             int count;
>                             byte data[] = new byte[4096];
>                             String outFileName = outputDir.getPath() + "/"
>                                     + new File(entry.getName()).getName();
>                             File outFile = new File(outFileName);
>                             // Extract only if it does not already exist
>                             if (outFile.exists() == false) {
>                                 System.out.println("Extracting " + s + " => size = " + sz);
>                                 FileOutputStream fos = new FileOutputStream(
>                                         outFile);
>                                 BufferedOutputStream dest = new BufferedOutputStream(
>                                         fos);
>                                 while ((count = f7z.read(data)) != -1) {
>                                     dest.write(data, 0, count);
>                                 }
>                                 dest.flush();
>                                 dest.close();
>                             } else {
>                                 System.out.println("Using already Extracted " + s + " => size = " + sz);
>                             }
>                             if (s.endsWith(".h5") || s.endsWith(".tif") ||
>                                     s.endsWith(".cos") || s.endsWith(".nitf")
>                                     || s.endsWith(".ntf")
>                                     || s.endsWith(".jpg") && sz > maxSize) {
>                                 maxSize = sz;
>                                 imgFile = new File(outFileName);
>                             }
>                         } // end sz > 0
>                     } // end s != null
>                 } // end if entry
>             } // end while
>             f7z.close();
>         } catch (FileNotFoundException e) {
>             // TODO Auto-generated catch block
>             e.printStackTrace();
>         } catch (IOException e) {
>             // TODO Auto-generated catch block
>             e.printStackTrace();
>         }
>     }
>     return imgFile;
> }
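
To make the "decode a bit more" remark in the comment concrete, here is a toy model of a solid block. It is only a sketch under loud assumptions: it uses {{java.util.zip}} Deflater/Inflater rather than LZMA, it does not touch the real 7z container or any Commons Compress API, and the names {{SolidBlockSketch}}, {{pack}} and {{extract}} are hypothetical. Several entries are compressed as one stream, so reaching entry N means decoding (and discarding) everything stored before it in the same block.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Toy model of a "solid" block: several entries compressed as ONE stream (hypothetical, not 7z). */
class SolidBlockSketch {

    /** Compressed bytes plus the uncompressed size of each entry, in storage order. */
    static final class SolidBlock {
        final byte[] compressed;
        final int[] sizes;

        SolidBlock(byte[] compressed, int[] sizes) {
            this.compressed = compressed;
            this.sizes = sizes;
        }
    }

    /** Concatenates all entries and compresses them together, so the dictionary is shared. */
    static SolidBlock pack(byte[][] entries) {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        int[] sizes = new int[entries.length];
        for (int i = 0; i < entries.length; i++) {
            raw.write(entries[i], 0, entries[i].length);
            sizes[i] = entries[i].length;
        }
        Deflater deflater = new Deflater();
        deflater.setInput(raw.toByteArray());
        deflater.finish();
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            packed.write(buf, 0, n);
        }
        deflater.end();
        return new SolidBlock(packed.toByteArray(), sizes);
    }

    /** Extracts one entry: decodes from the start of the block and drops the bytes before it. */
    static byte[] extract(SolidBlock block, int index) {
        int skip = 0; // uncompressed bytes belonging to earlier entries
        for (int i = 0; i < index; i++) {
            skip += block.sizes[i];
        }
        int end = skip + block.sizes[index];
        Inflater inflater = new Inflater();
        inflater.setInput(block.compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int position = 0; // uncompressed bytes decoded so far
        try {
            // Stop as soon as the wanted entry is complete; later entries are never decoded.
            while (position < end && !inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unexpected input
                }
                // Keep only the overlap of [position, position + n) with [skip, end).
                int from = Math.max(skip, position) - position;
                int to = Math.min(end, position + n) - position;
                if (to > from) {
                    out.write(buf, from, to - from);
                }
                position += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[][] entries = {
            "small-1".getBytes(StandardCharsets.UTF_8),
            "large image bytes".getBytes(StandardCharsets.UTF_8),
            "small-2".getBytes(StandardCharsets.UTF_8),
        };
        SolidBlock block = pack(entries);
        // Reaching the last entry still decodes the two entries stored before it.
        System.out.println(new String(extract(block, 2), StandardCharsets.UTF_8)); // prints "small-2"
    }
}
```

In this model the cost of reaching an entry grows with the amount of data stored before it in the block. If a reader restarted decoding from the beginning of the block for each entry it visits, a pattern like "one large image plus many small files" would get expensive quickly; whether that is what happens in the reported case is not established by this sketch.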