Message-ID: <20060427174954.48830.qmail@web50007.mail.yahoo.com>
Date: Thu, 27 Apr 2006 10:49:54 -0700 (PDT)
From: jason rutherglen
Subject: Re: GData, updateable IndexSearcher
To: java-dev@lucene.apache.org
In-Reply-To: <444FDB98.3050503@apache.org>
Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm

> I think the 'public static IndexReader.reopen(IndexReader old)' method
> I proposed can easily compare the current list of segments for the
> directory of old to those that old already has open, and determine
> which can be reused and which new segments must be opened.

This makes sense. Could you describe how the new segments would be known, or where in the code they can be loaded? Where in the design would there need to be synchronization blocks?

----- Original Message ----
From: Doug Cutting
To: java-dev@lucene.apache.org
Sent: Wednesday, April 26, 2006 1:44:08 PM
Subject: Re: GData, updateable IndexSearcher

jason rutherglen wrote:
> I was thinking you implied that you knew of someone who had customized
> their own, but it was a closed source solution. And if so then you
> would know how that project fared.

I don't recall the details, but I know folks have discussed this previously, and probably even posted patches, but I don't think any of the patches was ready to commit.

> Wouldn't there also need to be a hack on the IndexWriter to keep track
> of new segments?

I think the 'public static IndexReader.reopen(IndexReader old)' method I proposed can easily compare the current list of segments for the directory of old to those that old already has open, and determine which can be reused and which new segments must be opened.

Deletions would be a little tricky to track. If a segment has had deletions, then a new SegmentReader could be cloned from the old, sharing everything but the deletions, which could be re-read from disk. This would invalidate cached filters for segments that had deletions.

You could even try to figure out what documents have been deleted, then update filters incrementally. That would be fastest, but more complicated.

Doug

> ----- Original Message ----
> From: Doug Cutting
> To: solr-dev@lucene.apache.org
> Sent: Wednesday, April 26, 2006 11:27:44 AM
> Subject: Re: GData, updateable IndexSearcher
>
> jason rutherglen wrote:
>
> >>Interesting, does this mean there is a plan for incrementally
> >>updateable IndexSearchers to become part of Lucene?
>
> In general, there is no plan for Lucene.  If someone implements a
> generally useful, efficient feature in a back-compatible, easy-to-use
> manner, and submits it as a patch, then it becomes a part of Lucene.
> That's the way Lucene changes.  Since we don't pay anyone, we can't make
> plans and assign tasks.  So if you're particularly interested in this
> feature, you might search the archives to find past efforts, or simply
> try to implement it yourself.
>
> I think a good approach would be to create a new IndexSearcher instance
> based on an existing one, that shares IndexReaders.  Similarly, one
> should be able to create a new IndexReader based on an existing one.
> This would be a MultiReader that shares many of the same SegmentReaders.
>
> Things get a little tricky after this.
>
> Lucene caches filters based on the IndexReader.  So filters would need
> to be re-created.  Ideally these could be incrementally re-created, but
> that might be difficult.  What might be simpler would be to use a
> MultiSearcher constructed with an IndexSearcher per SegmentReader,
> avoiding the use of MultiReader.  Then the caches would still work.
> This would require making a few things public that are not at present.
> Perhaps adding a 'MultiReader.getSubReaders()' method, combined with a
> 'static IndexReader.reopen(IndexReader)' method.  The latter would
> return a new MultiReader that shared SegmentReaders with the old
> version.  Then one could use getSubReaders() on the new multi reader to
> extract the current set to use when constructing a MultiSearcher.
>
> Another tricky bit is figuring out when to close readers.
>
> Does this make sense?  This discussion should probably move to the
> lucene-dev list.
>
> >>Are there any negatives to updateable IndexSearchers?
>
> Not if implemented well!
>
> Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
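
The segment-reuse idea behind the proposed 'IndexReader.reopen(IndexReader old)' can be sketched roughly as below. This is only an illustration under stated assumptions, not Lucene code and not the method as it was proposed: `ReopenSketch`, `SegReader`, the `shared` field, and the `delGen` deletion counter are all hypothetical names invented for this example, and the "current list of segments" is passed in as a plain map rather than read from a directory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these types are Lucene classes. */
public class ReopenSketch {

    /** Hypothetical stand-in for a per-segment reader. */
    public static final class SegReader {
        public final String segment; // segment name as listed in the directory
        public final long delGen;    // assumed counter that changes when the segment's deletions change
        public final Object shared;  // stands in for data that stays fixed once a segment is written

        public SegReader(String segment, long delGen, Object shared) {
            this.segment = segment;
            this.delGen = delGen;
            this.shared = shared;
        }
    }

    /**
     * Builds the reader list for the current segments, reusing readers from
     * {@code old} where possible. {@code current} maps each segment name now
     * present to its (assumed) deletion counter.
     */
    public static List<SegReader> reopen(List<SegReader> old, Map<String, Long> current) {
        Map<String, SegReader> byName = new HashMap<>();
        for (SegReader r : old) {
            byName.put(r.segment, r);
        }
        List<SegReader> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            SegReader prev = byName.get(e.getKey());
            if (prev == null) {
                // New segment: nothing to share, so open it from scratch.
                result.add(new SegReader(e.getKey(), e.getValue(), new Object()));
            } else if (prev.delGen == e.getValue()) {
                // Unchanged segment: reuse the same instance, so caches keyed on it stay valid.
                result.add(prev);
            } else {
                // Deletions changed: new reader that shares everything but the deletions.
                result.add(new SegReader(prev.segment, e.getValue(), prev.shared));
            }
        }
        return result;
    }
}
```

Readers in `old` whose segments no longer appear in `current` are simply left out of the result. Deciding when they can safely be closed is the "tricky bit" mentioned in the thread, and this sketch does not attempt it, nor does it address where synchronization would be needed.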