Message-ID: <3F709B63.1000507@lucene.com>
Date: Tue, 23 Sep 2003 12:13:39 -0700
From: Doug Cutting
To: Lucene Developers List
Subject: Re: idea for reducing file handle use
In-Reply-To: <3F6A2FB3.60105@earthlink.net>

Dmitry Serebrennikov wrote:
>> It would be cleaner if this could be done entirely as a Directory
>> implementation. I know some folks who've implemented a
>> filesystem-within-a-file solution for this problem that they're very
>> happy with. It is a Directory, and requires no changes to Lucene.
>> I'll ask them if they're willing to contribute it, so that others can
>> use it.
>
> That would be great! I thought about doing this at the directory level,
> but I wasn't sure how to handle re-writing of multi-file segments into
> single-file segments once the IndexWriter is closed. I'd like to avoid
> dealing with file fragmentation, so I like having multiple files during
> IndexWriter operation. But if there is a ready-made solution that does
> deal with fragmentation, I guess that would work just fine.

Unfortunately their solution does not deal very well with fragmentation.
It could be improved to do so, but it does not at present. When
allocating blocks to a file it could look for nearby blocks, and it could
also be made to support an explicit defragmenting operation.

Doug
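
A minimal sketch of the two improvements mentioned above (preferring nearby blocks at allocation time, and an explicit defragmenting pass), assuming a fixed-size block table kept in memory. The class `BlockTable` and its methods are hypothetical names chosen for illustration. They are not part of Lucene and do not describe the contributed Directory implementation, whose internals are not given in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical block table for a filesystem-within-a-file; not a Lucene API. */
class BlockTable {
    private final boolean[] used;  // used[i] is true when block i is allocated
    private final Map<String, List<Integer>> files = new LinkedHashMap<>();

    BlockTable(int blockCount) {
        used = new boolean[blockCount];
    }

    /** Appends one block to the named file, preferring the free block nearest its last block. */
    int allocate(String name) {
        List<Integer> blocks = files.computeIfAbsent(name, k -> new ArrayList<>());
        int hint = blocks.isEmpty() ? 0 : blocks.get(blocks.size() - 1) + 1;
        // Search outward from the hint: hint, hint+1, hint-1, hint+2, hint-2, ...
        for (int distance = 0; distance <= used.length; distance++) {
            int after = hint + distance;
            if (after < used.length && !used[after]) return take(blocks, after);
            int before = hint - distance;
            if (before >= 0 && before < used.length && !used[before]) return take(blocks, before);
        }
        throw new IllegalStateException("no free blocks");
    }

    private int take(List<Integer> blocks, int block) {
        used[block] = true;
        blocks.add(block);
        return block;
    }

    /**
     * Explicit defragmentation: reassigns every file to a contiguous run of blocks.
     * Only the table is updated here; copying the block data is omitted from this sketch.
     */
    void defragment() {
        Arrays.fill(used, false);
        int next = 0;
        for (List<Integer> blocks : files.values()) {
            for (int i = 0; i < blocks.size(); i++) {
                blocks.set(i, next);
                used[next++] = true;
            }
        }
    }

    List<Integer> blocksOf(String name) {
        return files.get(name);
    }
}
```

Interleaved writes to two files still fragment them under this scheme, since the nearest free block may already belong to the other file's run; the nearby-block search only limits how far apart a file's blocks end up, which is why a separate defragmenting pass remains useful.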