MIME-Version: 1.0
In-Reply-To: <25852756.post@talk.nabble.com>
References: <25852756.post@talk.nabble.com>
Date: Mon, 12 Oct 2009 10:28:27 -0700
Message-ID: <4b124c310910121028o50f89352x3d371a2aa330da70@mail.gmail.com>
Subject: Re: Realtime search best practices
From: Jake Mannix
To: java-user@lucene.apache.org
Content-Type: multipart/alternative; boundary=00163630fe43f538e10475c04129

--00163630fe43f538e10475c04129
Content-Type: text/plain; charset=ISO-8859-1

Hi Cedric,

I don't know of anyone with a substantial-throughput production system
who is doing realtime search with the 2.9 improvements yet. In fact, no
serious performance analysis has been done on these even "in the lab",
so to speak (follow https://issues.apache.org/jira/browse/LUCENE-1577 to
track work on this), so some experimentation will be necessary to know
how well it fits your environment.

Your approach has the basic components of 2.9 NRT search, but it doesn't
say when you make your commit() calls. Your choices here depend on some
tradeoffs. Lucene provides ACID-like transactional semantics: if you
decide to commit() after every add(), then yes, getReader() will be up
to date with the most recent commit(), but at a cost in indexing
throughput (and much more frequent segment merges) compared with calling
commit() at a slower rate. Calling commit() less frequently means, of
course, that your readers are only as fresh as your most recent commit.

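On the read side, the same freshness-versus-cost tradeoff shows up in the
idea from the original question of caching a searcher for a few seconds
instead of opening one per request. Below is a minimal sketch of that idea
in plain Java, with no Lucene calls at all. TimedCache, maxAgeMillis, and
the injected clock are hypothetical names used only for illustration, and
the sketch deliberately does not close the previous value, which a real
reader/searcher cache would have to do once in-flight searches finish.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Lucene): hands out a cached value and
 * rebuilds it only once it is at least maxAgeMillis old.
 */
final class TimedCache<T> {
    private final Supplier<T> opener;   // assumption: builds a fresh searcher/reader
    private final long maxAgeMillis;    // staleness you are willing to accept
    private final LongSupplier clock;   // injectable time source, eases testing
    private T current;                  // value currently handed out
    private long openedAt;              // clock reading when current was built

    TimedCache(Supplier<T> opener, long maxAgeMillis, LongSupplier clock) {
        this.opener = opener;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reopening it first if it has expired. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (current == null || now - openedAt >= maxAgeMillis) {
            // NOTE: the old value is simply dropped here; real code must
            // release it safely after all searches using it have finished.
            current = opener.get();
            openedAt = now;
        }
        return current;
    }
}
```

With a real index, the opener would be whatever builds your searcher (for
example from writer.getReader(), as in step 4 of the original question,
wrapped to handle IOException), the clock would be
System::currentTimeMillis, and maxAgeMillis would be the staleness you can
tolerate.
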
Also, you have to be aware that 2.9 NRT makes no guarantees as far as
realtimeliness is concerned. If an addIndexes() is in progress in another
thread on your IndexWriter, this is another case where your getReader()
call won't block, but also won't necessarily see all of the new segments
if the addIndexes() hasn't completed yet.

Please post any results you find with this here. This is a very new
feature, and seeing how it works in the wild would be very helpful to
everyone else who is interested.

  -jake

On Mon, Oct 12, 2009 at 2:24 AM, melix wrote:

>
> Hi,
>
> I'm going to replace an old reader/writer synchronization mechanism we had
> implemented with the new near realtime search facilities in Lucene 2.9.
> However, it's still a bit unclear how to do it efficiently.
>
> Is the following implementation a good way to achieve it? The context
> is concurrent reads/writes on an index:
>
> 1. create a Directory instance
> 2. create a writer on this directory
> 3. on each write request, add document to the writer
> 4. on each read request,
>    a. use writer.getReader() to obtain an up-to-date reader
>    b. create an IndexSearcher with that reader
>    c. perform Query
>    d. close IndexSearcher
> 5. on application close
>    a. close writer
>    b. close directory
>
> While this seems to be OK, I'm really wondering about the performance of
> opening a searcher for each request. I could introduce some kind of delay
> and cache a searcher for some seconds, but I'm not sure it's the best thing
> to do.
>
> Thanks,
>
> Cedric
>
>
> --
> View this message in context:
> http://www.nabble.com/Realtime-search-best-practices-tp25852756p25852756.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.

--00163630fe43f538e10475c04129--