From: Antony Bowesman
Organization: Teamware Group
Date: Tue, 15 Apr 2008 15:23:37 +1000
To: java-user@lucene.apache.org
Subject: Re: Using Lucene partly as DB and 'joining' search results.
Message-ID: <48043BD9.3080009@teamware.com>
References: <47FF3B35.4000705@teamware.com> <48042808.1010600@teamware.com>

Chris Hostetter wrote:
> you can't ... that's why i said you'd need to rebuild the smaller index
> completely on a periodic basis (going in the same order as the docs in the

Mmm, at the moment the annotations would be stored only in the index, so there would be nothing to rebuild from. They could be stored elsewhere as well, so I can investigate that, in which case the rebuild would be possible.

> i can also imagine a situation where you break both indexes up into lots
> of pieces (shards) and use a MultiReader over lots of ParallelReaders ...
> that way you have much smaller "small" indexes to rebuild when someone
> annotates an email -- and if the shards are organized by date, you're less
> likely to ever need to rebuild many of them since people will tend to

Data will be 'sharded' anyway, by date of some granularity. Looking at the source for MultiReader/MultiSearcher, they are single threaded. Is there a performance trade-off between single-thread/many small indexes and single-thread/some large indexes? Can a MultiReader work with one..n readers per thread, something like a thread pool of IndexReaders? I would expect it to be faster to run the searches in parallel.

> Disclaimer: all of this is purely brainstorming, i've never actually tried
> anything like this, it may be more trouble than it's worth. :)

Thanks for the sounding board - it's always useful to get new ideas!
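
To make the "thread pool over shards" question above concrete, here is a rough sketch of the idea using only java.util.concurrent. It is not Lucene code: ShardSearcher, Hit and ParallelShardSearch are hypothetical stand-ins for "one searcher per date shard", and merging by raw score assumes that scores from different shards are directly comparable, which may not hold for independently built indexes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelShardSearch {

    /** Hypothetical hit: a document id plus its score. */
    public static final class Hit {
        public final String docId;
        public final float score;

        public Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for a searcher over a single date shard. */
    public interface ShardSearcher {
        List<Hit> search(String query, int topN) throws Exception;
    }

    /** Runs the query on every shard concurrently, then merges the top N hits by score. */
    public static List<Hit> search(List<ShardSearcher> shards, String query,
                                   int topN, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per shard; each asks its shard for that shard's own top N.
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (ShardSearcher shard : shards) {
                tasks.add(() -> shard.search(query, topN));
            }

            // invokeAll blocks until every shard task has completed.
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> result : pool.invokeAll(tasks)) {
                merged.addAll(result.get()); // rethrows a shard failure, if any
            }

            // Highest score first, then keep only the global top N.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this beats a single thread over the same shards would depend on how many cores and how much disk contention there is, so it would need measuring rather than assuming.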
Antony