lucene-java-user mailing list archives

From Harald Kirsch <Harald.Kir...@raytion.com>
Subject Problem with near realtime search
Date Fri, 03 Aug 2012 13:41:56 GMT
I am trying to (mis)use Lucene a bit like a NoSQL database or, rather, a 
persistent map. I am adding 38000 documents to the index at a rate of 
1000/s. Because each add may actually be an update, I perform a 
read/change/write sequence for each document.

All goes well until, just after writing the last item, I run a query 
that retrieves about 16000 documents. All docIds are collected in a 
Collector and, yes, I make sure to rebase the docIds. Then I iterate 
over all docIds found and retrieve the documents, basically like this:

   for(int docId : docIds) {
     Document d = getSearcher().doc(docId);
     ..
   }

where getSearcher() uses IndexReader.openIfChanged() to always get the 
most current searcher and makes sure to eventually close the old searcher.


At document 15940 I get an exception like this:

Exception in thread "main" java.lang.IllegalArgumentException: docID must be >= 0 and < maxDoc=1 (got docID=1)
	at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:490)
	at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:568)
	at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:264)

I can get rid of the exception in one of two ways, neither of which I like:

1) Put a Thread.sleep(1000) just before running the query+document 
retrieval part.

2) Use the same IndexSearcher to retrieve all documents instead of 
calling getSearcher for each document retrieval.

This is just a single-threaded test program. Besides the main thread, I 
only see Lucene merge threads in jvisualvm. A breakpoint on the 
exception shows that org.apache.lucene.index.DirectoryReader.document 
does seem to have the wrong segments, which triggers the exception.
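
To make the "wrong segments" observation concrete, here is a toy model in plain Java of what rebasing a docId means and why a rebased docId is only meaningful against the segment layout it was collected from. It does not use Lucene at all: the names DocIdSnapshotDemo, Snapshot, rebase, document, beforeMerge and afterMerge are made up for illustration, and the assumption that a merge changed the segment layout between the query and the retrieval is a guess, not something confirmed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DocIdSnapshotDemo {

    /** Hypothetical stand-in for one point-in-time reader: an ordered list of segments. */
    static final class Snapshot {
        private final List<List<String>> segments = new ArrayList<>();

        Snapshot(List<List<String>> segments) {
            this.segments.addAll(segments);
        }

        /** Rebasing: global docId = sum of the sizes of all earlier segments + local id. */
        int rebase(int segmentIndex, int localId) {
            int docBase = 0;
            for (int i = 0; i < segmentIndex; i++) {
                docBase += segments.get(i).size();
            }
            return docBase + localId;
        }

        /** Resolves a global docId against THIS snapshot's segment layout. */
        String document(int docId) {
            int remaining = docId;
            for (List<String> segment : segments) {
                if (remaining < segment.size()) {
                    return segment.get(remaining);
                }
                remaining -= segment.size();
            }
            throw new IllegalArgumentException("docID out of range: " + docId);
        }
    }

    /** Assumed layout at query time: two segments holding {a, b} and {c}. */
    static Snapshot beforeMerge() {
        return new Snapshot(Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c")));
    }

    /** Assumed layout after a merge that dropped "b": one segment holding {a, c}. */
    static Snapshot afterMerge() {
        return new Snapshot(Arrays.asList(Arrays.asList("a", "c")));
    }

    public static void main(String[] args) {
        // Collect the global docId of "c" against the query-time snapshot.
        int docId = beforeMerge().rebase(1, 0);
        System.out.println("same snapshot: " + beforeMerge().document(docId));
        try {
            // Resolving the same docId against a newer snapshot is not valid.
            afterMerge().document(docId);
        } catch (IllegalArgumentException e) {
            System.out.println("newer snapshot: " + e.getMessage());
        }
    }
}
```

In this model, the docId resolves correctly only when it is looked up in the same snapshot that produced it, which corresponds to workaround 2 above.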

Since Lucene 3.6.1 has been in production use for some time, I doubt it 
is a bug in Lucene, but I don't see what I am doing wrong. It might be 
connected to trying to get the freshest IndexReader for retrieving 
documents.

Any better ideas or explanations?

Harald.

-- 
Harald Kirsch

