To: Lucene Developers List
Subject: Re: Re: optimized disk usage when creating a compound index
Date: Tue, 10 Aug 2004 08:42:01 +0200

Dmitry Serebrennikov wrote on 09.08.2004, 19:15:20:
> Well, I think this could work, but I'm not sure how this will behave if
> an IndexReader is created on the new segment while it is still
> uncompound. Then when you try to delete the individual files, you'd have
> to implement something like "deletable" file for segments (to work with
> Windows file locking).

That's right. I would use the deletable mechanism in IndexWriter to delete the non-compound files of the index after the compound file has been created. That's step 4 of my last mail. It would be done within a commit lock.

> Anyway, what do you think of the original way proposed by Bernard? I
> think that method was ok. If I understand correctly, in that method the
> merge process does not end until compound file is created (as before),
> but the files are deleted as they are merged in. I suppose there is a
> chance that the compound file creation process fails and we would not
> have any new segment since the files that were useable would have been
> half deleted. Is that what's bothering you in this solution? To me this
> seems acceptable because it shouldn't happen frequently. What do you
> think? Is there anything I'm missing about Bernard's solution?

In Bernhard's solution the old segments that have been merged into the new segment are still there while the compound file is being built. Disk space is saved by deleting the non-compound files of the new segment earlier than in the original implementation: each file is deleted immediately after it has been copied into the compound file. However, usually there are 1 to 3 big files in a segment.
So the advantage in disk space is not as big as it could be with my solution. Individual files still exist in 3 copies for a short period of time (while they are copied). Furthermore, deleting files in CompoundFileWriter.close is not what I would expect from a CompoundFileWriter. But since it is only used in SegmentMerger, it's ok.

I am not insisting on my solution. I was just about to commit Bernhard's solution on Sunday, but then I thought it could be done better... Now I am not sure what to do. I am still slightly in favour of my idea, but not strongly.

> (By the way, Thanks for helping to maintain and improve this code!)
> Dmitry.

I think we are all doing this because it's fun and because there is such a great community immediately looking at, testing and reviewing our work.
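
To put rough numbers on the disk-space argument about deleting files right after they are copied, here is a small sketch. It is not Lucene code: the class name, the method names and the file sizes are made up for illustration, and it only counts the new segment's non-compound files plus the growing compound file. The old segments are still on disk in both cases and come on top of these figures.

```java
public class CompoundPeakSketch {

    // Original behaviour as described above: the non-compound files are
    // deleted only after the whole compound file has been written, so at
    // the peak every file exists twice.
    static long peakDeleteAtEnd(long[] fileSizes) {
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        return 2 * total;
    }

    // Each non-compound file is deleted immediately after it has been
    // copied into the compound file. While one file is being copied, both
    // the file and its copy exist, so the peak is total + largest file.
    static long peakDeleteAfterEachCopy(long[] fileSizes) {
        long onDisk = 0;
        for (long size : fileSizes) {
            onDisk += size;
        }
        long peak = onDisk;
        for (long size : fileSizes) {
            onDisk += size;                 // copy now exists in the compound file
            peak = Math.max(peak, onDisk);
            onDisk -= size;                 // original file is deleted
        }
        return peak;
    }

    public static void main(String[] args) {
        // Hypothetical segment: one big file and a few small ones.
        long[] sizes = {700, 200, 50, 30, 20};
        System.out.println("delete at end:          " + peakDeleteAtEnd(sizes));
        System.out.println("delete after each copy: " + peakDeleteAfterEachCopy(sizes));
    }
}
```

With these made-up sizes the peak drops from 2000 to 1700 units. The saving is bounded by the largest file, which is why a segment dominated by 1 to 3 big files gains comparatively little.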