Date: Tue, 9 Sep 2008 11:42:54 -0400
From: "Ning Li"
To: java-dev@lucene.apache.org
Subject: Re: Realtime Search for Social Networks Collaboration

On Mon, Sep 8, 2008 at 4:23 PM, Yonik Seeley wrote:
>> I thought an index reader which supports real-time search no longer
>> maintains a static view of an index?
>
> It seems advantageous to just make it really cheap to get a new view
> of the index (if you do it for every search, it amounts to the same
> thing, right?)

Sounds like these light-weight views of the index are backed by
something dynamic, right?

> Quite a bit of code in Lucene assumes a static view of
> the Index I think (even IndexSearcher), and it's nice to have a stable
> index view for the duration of a single request.

Agreed.

On Tue, Sep 9, 2008 at 10:02 AM, Yonik Seeley wrote:
> Yeah, I think the underlying RandomAccessFile might do the right
> thing, but IndexInput isn't required to see any changes on the fly
> (and current implementations don't) so at a minimum it would be a
> change of IndexInput semantics. Maybe there would need to be a
> refresh() function added, or we would need to require a specific
> Directory impl?
>
> OR, if all writes are append-only, perhaps we don't ever need to
> invalidate the read buffer and would just need to remove the current
> logic that caches the file length and then let the underlying
> RandomAccessFile do the EOF checking.

We cannot assume it's always RandomAccessFile, can we? So we may have to
flush after writing each document. Even so, this may not be sufficient
for some file systems such as HDFS... Is it reasonable in this case to
keep everything in memory, including stored fields and term vectors?

Cheers,
Ning
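
To make the append-only idea quoted above concrete, here is a minimal sketch in plain Java. It is not Lucene's IndexInput and uses no Lucene API; the names AppendOnlyReadSketch, TailingReader, append and readAvailable are hypothetical. It assumes a local file system where an appended and closed write is visible to a RandomAccessFile opened on the same file, which is exactly the assumption questioned above for file systems such as HDFS. The reader keeps a read buffer but never caches the file length, so the end-of-file check is left to the underlying file on every refill.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class AppendOnlyReadSketch {

    /** Hypothetical reader: buffers reads but never caches the file length. */
    static final class TailingReader implements AutoCloseable {
        private final RandomAccessFile file;
        private final byte[] buffer = new byte[16];
        private long bufferStart = 0;   // file offset of buffer[0]
        private int bufferLength = 0;   // number of valid bytes in buffer
        private int bufferPosition = 0; // index of the next byte to return

        TailingReader(File f) throws IOException {
            this.file = new RandomAccessFile(f, "r");
        }

        /** Returns the next byte, or -1 if the file currently ends here. */
        int read() throws IOException {
            if (bufferPosition == bufferLength) {
                // Refill. Bytes already buffered stay valid because writes
                // only append; EOF is decided by the file itself.
                long next = bufferStart + bufferLength;
                file.seek(next);
                int n = file.read(buffer, 0, buffer.length);
                if (n <= 0) {
                    return -1; // nothing new yet; a later call may succeed
                }
                bufferStart = next;
                bufferLength = n;
                bufferPosition = 0;
            }
            return buffer[bufferPosition++] & 0xFF;
        }

        @Override
        public void close() throws IOException {
            file.close();
        }
    }

    /** Appends ASCII text to the file; closing the stream hands the bytes to the OS. */
    static void append(File f, String text) throws IOException {
        try (FileOutputStream out = new FileOutputStream(f, true)) {
            out.write(text.getBytes(StandardCharsets.US_ASCII));
        }
    }

    /** Reads every byte that is visible right now and returns it as a string. */
    static String readAvailable(TailingReader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = reader.read()) != -1) {
            sb.append((char) b);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("append-only", ".bin");
        f.deleteOnExit();
        append(f, "doc1");
        try (TailingReader reader = new TailingReader(f)) {
            System.out.println(readAvailable(reader));
            append(f, "doc2"); // written after the reader was opened
            System.out.println(readAvailable(reader));
        }
    }
}
```

The second readAvailable call needs no refresh() because the reader asks the file for more bytes each time its buffer runs out. Whether a Directory implementation on another file system could offer the same behaviour is the open question in this thread.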