Subject: Re: a thought on cache
From: karl wettin
To: solr-user@lucene.apache.org
Date: Fri, 04 Aug 2006 09:34:04 +0200

On Thu, 2006-08-03 at 23:53 -0700,
Chris Hostetter wrote:
> 1) as new docs come in, add them to a purely in memory index
> 2) when it becomes time to "commit" the new documents, test all queries
> in the cache against this in memory index.
> 3) any query in the cache which has a hit on this in memory index should
> be invalidated, any query which does not have a hit is still valid.

You got it.

> ...this could probably work if the index was purely additive
> check if one of the cached queries matched on the deleted document

Hmm, didn't see that one coming. Quick and dirty would be to rebuild the
document from the original source. I'll have to think of a better solution
than that, though.

> the next segment merge could collapse doc ids above deleted docs which
> were totally unrelated to any docs that were added or deleted -- so
> you would think they are still valid even though the doc ids in the
> cache don't correspond to the same documents anymore.

This is not the first time I have thought about low-level hooks in the
index. If an optimization could report its changes, this would not be a
problem, would it?

> while the "old" IndexSearcher is still being used by external requests
> (and still using its cache) a new "on deck" IndexSearcher is opened,
> and an internal thread is running queries against it (the results of

I do something similar to that. But all those queries (in some cases tens
of thousands, against a frequently updated index) hog more CPU than I
think they have to. I'm low on CPU (it is spent on real-time collaborative
filtering etc.) but have a more or less unlimited amount of RAM.
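
The three quoted steps can be sketched roughly as follows. This is a toy model, not Solr or Lucene code: documents are plain sets of terms, a query is a set of terms that must all be present, and the names `matches` and `invalidate_on_commit` are made up for illustration. It only covers the purely additive case; deletes and doc id renumbering after a segment merge, as raised in the quoted reply, are not handled.

```python
def matches(query_terms, doc_terms):
    """A query hits a document when every query term occurs in it (toy AND semantics)."""
    return query_terms <= doc_terms


def invalidate_on_commit(cache, pending_docs):
    """Remove cached queries that hit any pending document; return the removed keys."""
    pending = [frozenset(doc) for doc in pending_docs]
    stale = [q for q in cache if any(matches(q, doc) for doc in pending)]
    for q in stale:
        del cache[q]
    return stale


# Hypothetical cache: query (frozenset of terms) -> cached doc ids.
cache = {
    frozenset({"solr"}): [1, 2],
    frozenset({"lucene", "cache"}): [3],
}
# Step 1: new documents are buffered in memory until commit.
pending_docs = [{"solr", "faceting"}]
# Steps 2 and 3: at commit time, test every cached query against the buffer
# and drop the ones that hit; everything else stays cached.
stale = invalidate_on_commit(cache, pending_docs)
```

The cost is one pass of the cached queries over the small in-memory buffer rather than over the whole index, which is the CPU-for-RAM trade the message is after.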