Message-ID: <359a92830704020632w6a8d4178w259ccdbb00070bf1@mail.gmail.com>
Date: Mon, 2 Apr 2007 09:32:25 -0400
From: "Erick Erickson"
To: java-user@lucene.apache.org
Subject: Re: Searches fail while indexwriter is open
In-Reply-To: <9789072.post@talk.nabble.com>

Yes, you can search while index writes are taking place, but...

When you open an IndexReader, it essentially takes a snapshot of the
index, and further modifications to the index are not visible to that
reader (or to any searcher built on it) for as long as it stays open.
You must close and re-open the reader (and associated searchers) to see
your changes. How this interacts with IndexWriter flushing is... er...
well, I don't exactly know, but it could well be an issue. I wonder if
this is what you're seeing.

In broad terms, the basic question is how quickly you need changes to
the index to be reflected in your search results.

I suspect that the 10-file thing is a red herring. What do you have your
IndexWriter parameters set to? Especially maxBufferedDocs (which has a
default value, perhaps not coincidentally, of 10).

Lucene 2.1 has an IndexWriter.flush() method that could help.

Erick

On 4/2/07, baronDodd wrote:
>
> I am currently writing a Lucene application and having a huge headache
> with concurrency.
>
> My requirements are that each time a file is indexed, a search on its
> path is performed to see if an update (delete then re-index) is
> required. If a document with the same path exists, then an IndexReader
> deletes the doc and a writer re-indexes the file. Sadly, due to
> requirements, the deletes and indexes cannot be performed in batches,
> and I am constantly opening and closing the IndexReader and IndexWriter
> between multiple threads. Everything has been working fine and seems
> thread safe apart from this:
>
> If I index a test batch of 10 files and then repeat the operation on
> the same files a few minutes later, all 10 are updated OK. However,
> when I perform the same test with more than about 10 files, my searches
> fail to find about 25% of the already existing files and I end up with
> duplicate entries in the index. I have managed to fix this by closing
> the IndexWriter every time an update search is performed, but this has
> taken performance to almost embarrassing levels! My understanding was
> that you could search a Lucene index with an IndexSearcher while any
> write operations are taking place? Is it possible that the search skips
> segments which are currently being written to?
> --
> View this message in context:
> http://www.nabble.com/Searches-fail-while-indexwriter-is-open-tf3505182.html#a9789072
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
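
For what it's worth, here is a toy model of the snapshot behaviour
described in the reply. It is plain Java and deliberately NOT Lucene's
API: ToyIndex, ToyReader and SnapshotDemo are hypothetical names made up
for illustration. The rule that buffered documents stay invisible until
a flush is an assumption of the model; as said above, exactly how
writer flushing interacts with open readers is not something this
thread settles.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an index. This is NOT Lucene's API.
final class ToyIndex {
    private final List<String> committed = new ArrayList<>();
    private final List<String> buffered = new ArrayList<>();

    // Models adding a document through a writer: held in a buffer first.
    synchronized void add(String path) {
        buffered.add(path);
    }

    // Models a flush: buffered documents become part of the index.
    // (Assumption of this model, not a statement about Lucene internals.)
    synchronized void flush() {
        committed.addAll(buffered);
        buffered.clear();
    }

    // Models opening a reader: copies the committed state as of right now.
    synchronized ToyReader openReader() {
        return new ToyReader(new ArrayList<>(committed));
    }
}

// Hypothetical reader holding a fixed, point-in-time view of the index.
final class ToyReader {
    private final List<String> snapshot;

    ToyReader(List<String> snapshot) {
        this.snapshot = Collections.unmodifiableList(snapshot);
    }

    boolean contains(String path) {
        return snapshot.contains(path);
    }
}

final class SnapshotDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("/docs/a.txt");
        index.flush();

        // Reader opened before the second document is written.
        ToyReader oldReader = index.openReader();

        index.add("/docs/b.txt");
        index.flush();

        // The old reader still sees only its snapshot.
        System.out.println("old reader sees b.txt: " + oldReader.contains("/docs/b.txt"));

        // A freshly opened reader sees the change.
        ToyReader newReader = index.openReader();
        System.out.println("new reader sees b.txt: " + newReader.contains("/docs/b.txt"));
    }
}
```

In this model, a "does this path already exist?" check made through a
reader opened before the latest writes will miss them, and the file
gets indexed a second time. That is one way the duplicates described in
the original question could arise, though the model cannot confirm it
is the actual cause.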