lucene-java-user mailing list archives

From baronDodd <campba...@hotmail.com>
Subject Re: Searches fail while indexwriter is open
Date Mon, 02 Apr 2007 14:57:12 GMT

Many thanks for your response. It raised some good points which I had not
thought of, but unfortunately the problem remains.

To clarify, my indexing sequence in pseudo-code is this:

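if( fileExists( filePath ) ){
       // path is already in the index: delete the old document first
       createIndexReader();   // closes the IndexWriter if one is open
       deleteDoc( docNumber );
}
createIndexWriter();          // closes the IndexReader if one is open
indexDoc();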

I am indexing 30 files, and then repeating the same index job minutes later,
which should result in 30 updates. My createIndexReader and createIndexWriter
methods each close the opposite modifier ok. My isExisting() method also
creates a new IndexSearcher instance. The question of how quickly I need to
see index modifications is not quite relevant here: the documents which are
not being found were indexed minutes or hours ago during the first index job,
and they are missed simply because I have an IndexWriter open at the time.

Sample from the first job:

Index Writer Open
Failed to find: C:\CBIS_test\updatetest\7.txt
7.txt written to index 
Failed to find: C:\CBIS_test\updatetest\5.txt
5.txt written to index 
Totals : Indexed 30 Updated: 0 (content analyzed: 30) files in 859 milliseconds
flushing
Index Writer Open
optimizing index

And then the 2nd job:

8.txt deleted 
Index Reader closed
Index Writer Open
8.txt written to index 
Failed to find: C:\CBIS_test\updatetest\7.txt
7.txt written to index 
Index Writer closed
Totals : Indexed 10 Updated: 20 (content analyzed: 30) files in 3606 milliseconds
flushing
Index Writer Open
optimizing index

I may be missing something obvious. Changing the maxbuffereddocs value made
no difference.
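
For anyone following the thread, here is a rough way to picture the two effects
raised in the reply quoted below: a reader that sees only a snapshot taken when
it was opened, and a writer that buffers documents before flushing them. This is
a toy model in plain Java, not Lucene code. ToyIndex, ToyWriter and ToyReader
are hypothetical names, and it is only an assumption that Lucene's real flushing
behaves this way, so treat it as an illustration rather than a diagnosis.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these classes are Lucene classes.
class ToyIndexDemo {

    /** Holds the paths that have been flushed to the "on-disk" index. */
    static class ToyIndex {
        final Set<String> committed = new HashSet<>();
    }

    /** Buffers added paths in memory and flushes them in batches. */
    static class ToyWriter {
        private final ToyIndex index;
        private final int maxBufferedDocs;
        private final List<String> buffer = new ArrayList<>();

        ToyWriter(ToyIndex index, int maxBufferedDocs) {
            this.index = index;
            this.maxBufferedDocs = maxBufferedDocs;
        }

        void add(String path) {
            buffer.add(path);
            // Flush automatically once the buffer reaches the threshold.
            if (buffer.size() >= maxBufferedDocs) {
                flush();
            }
        }

        void flush() {
            index.committed.addAll(buffer);
            buffer.clear();
        }

        void close() {
            flush();
        }
    }

    /** Copies the committed paths when opened and never refreshes. */
    static class ToyReader {
        private final Set<String> snapshot;

        ToyReader(ToyIndex index) {
            this.snapshot = new HashSet<>(index.committed);
        }

        boolean contains(String path) {
            return snapshot.contains(path);
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToyWriter writer = new ToyWriter(index, 10);
        for (int i = 1; i <= 13; i++) {
            writer.add(i + ".txt");
        }

        // Reader opened while the writer still holds 11.txt to 13.txt in memory.
        ToyReader whileOpen = new ToyReader(index);
        System.out.println(whileOpen.contains("5.txt"));   // true: part of the flushed batch
        System.out.println(whileOpen.contains("12.txt"));  // false: still buffered

        writer.close();
        System.out.println(whileOpen.contains("12.txt"));  // false: old snapshot
        System.out.println(new ToyReader(index).contains("12.txt")); // true: reopened
    }
}
```

In this model the documents that go missing are the ones still sitting in the
writer's buffer, and only a reader opened after the writer is closed sees all
of them, which resembles the workaround of closing the writer before each
update search.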



Erick Erickson wrote:
> 
> Yes, you can search while index writes are taking place, but....
> 
> When you open an index reader, it essentially takes a snapshot
> of the index and further modifications of the index are not visible to
> that searcher as long as it's open. You must close and re-open the
> reader (and associated searchers) to see your changes. How this
> interacts with index writer flushing is...er...well I don't exactly know,
> but this could well be an issue...
> 
> I wonder if this is what you're seeing. In broad terms, the base
> question is how quickly you need to see changes in the index reflected
> in your search results.
> 
> I suspect that the 10 file thing is a red herring, what do you have your
> indexwriter parameters set at? Especially maxbuffereddocs (which has
> a default value, perhaps not coincidentally, of 10).......
> 
> Lucene 2.1 has an IndexWriter.flush() method that could help......
> 
> Erick
> 
> On 4/2/07, baronDodd <campbaron@hotmail.com> wrote:
>>
>>
>> I am currently writing a Lucene application and having a huge headache with
>> concurrency.
>>
>> My requirements are that each time a file is indexed a search on its path is
>> performed to see if an update (delete then re-index) is required. If a
>> document with the same path exists then an IndexReader deletes the doc and
>> then a writer reindexes the file. Sadly, due to requirements, the deletes and
>> indexes cannot be performed in batches and I am constantly opening and closing
>> the IndexReader and IndexWriter between multiple threads. Everything has
>> been working fine and seems thread safe apart from this:
>>
>> If I index a test batch of 10 files and then once again a few minutes later
>> repeat the operation on the same files then all 10 are updated ok. However,
>> when I perform the same test with more than about 10 files then my searches
>> fail to find about 25% of the already existing files and I end up with
>> duplicate entries in the index. I have managed to fix this by closing the
>> indexWriter every time an update search is performed but this has taken
>> performance to almost embarrassing levels! My understanding was that you
>> could search a Lucene index with an IndexSearcher while any write operations
>> are taking place? Is it possible that the search skips segments which are
>> currently being written to?
>> --
>> View this message in context:
>> http://www.nabble.com/Searches-fail-while-indexwriter-is-open-tf3505182.html#a9789072
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Searches-fail-while-indexwriter-is-open-tf3505182.html#a9792429
Sent from the Lucene - Java Users mailing list archive at Nabble.com.



