lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1385) IndexReader.isIndexCurrent()==false -> IndexReader.reopen() -> still index not current
Date Mon, 15 Sep 2008 13:43:44 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631023#action_12631023 ]

Uwe Schindler commented on LUCENE-1385:
---------------------------------------

bq. So it sounds like you get a situation, on many computers, whereby when you call IndexReader.isCurrent
on an instance of IndexReader that you have open, it returns false. Yet when you call reopen,
it returns back the same reader? Is that right?

That's exactly what happens.

bq. Are you certain you are not making any changes with the reader (deletion or setNorm or
undeleteAll)? I can see one case where if you did make changes with the reader instance, and
you also forcefully unlock the index (using IndexReader.unlock or IndexWriter.unlock (on trunk))
that you could get yourself into this exact situation. But if you're not making any changes
with the reader I still can't explain it.

The IndexReader is not used for writing. It is used only with an IndexSearcher that does a
search with TopDocs and HitCollector without updating anything. Only stored fields are read.

bq. How are you sharing your index (what shared filesystem/OS)? (you said "in another virtual
machine a parallel job updates the index")

I have two JVM processes: a web server running the web application that does the searches
and hits the bug, and another JVM that does the index updates. The filesystem is UFS, EXT3
or NTFS (depending on platform). Everything is local.

The interesting thing about this bug is:
The three machines use different indexes. Windows is my testing machine, Solaris is one
production server and Linux is another, all using local filesystems for their indexes. The
bug does not show up for a very long time, but then it suddenly shows up on all three
machines (which are independent). This is why I said: maybe it is the timestamp in the index
version that, for example, wraps around 2^31 or something like that. The three machines are
not related to each other, but the bug happens at the same time.
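
A quick arithmetic check of the wrap-around idea, as a sketch only. It assumes the index version is derived from a Unix timestamp, which is the speculation above and not something confirmed in this thread. The class and method names are made up for illustration. It compares the time of this comment, in seconds and in milliseconds since the epoch, against the largest signed 32-bit value:

```java
import java.time.Instant;

final class VersionWrapCheck {
    // Timestamp of this comment, taken from the Date header of the message.
    static final Instant COMMENT_TIME = Instant.parse("2008-09-15T13:43:44Z");

    static long epochSeconds() {
        return COMMENT_TIME.getEpochSecond();
    }

    static long epochMillis() {
        return COMMENT_TIME.toEpochMilli();
    }

    /** True if the value no longer fits in a signed 32-bit int (2^31 - 1). */
    static boolean exceedsInt(long value) {
        return value > Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("seconds since epoch: " + epochSeconds()
                + ", exceeds 2^31-1: " + exceedsInt(epochSeconds()));
        System.out.println("millis since epoch:  " + epochMillis()
                + ", exceeds 2^31-1: " + exceedsInt(epochMillis()));
    }
}
```

Under that assumption, a seconds-based value is still below 2^31 in September 2008, and a milliseconds-based value passed 2^31 long before. So a plain 2^31 wrap on its own would not obviously explain a bug that starts at this point in time; some other time-dependent effect would have to be involved.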

bq. This really confuses me: if reopen() had returned the same reader, how can it then also
show all the new/updated documents? Do you know whether the index has 1 or more than 1 segments
when this problem is happening?

Sorry, this was wrong. When reopen returns the same reader it is unchanged, you are right!
:-)

bq. How do you tie in a cron-job into getting the IndexReader in your search server to call
reopen?

It's not a "real cronjob"; it's just a task in my web application that is executed every
half hour: check whether the IndexReader is current and, if not, reopen it. This is what the
"cron job" does (with my workaround for the problem):

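if (!indexReader.isCurrent()) {
	// reopen() is expected to return a new instance when the index has changed
	IndexReader n=indexReader.reopen();
	if (n!=indexReader) {
		try {
			// reader was really reopened
			indexReader.close();
		} finally {
			indexReader=n;
		}
	} else {
		// same instance came back although isCurrent() was false: this is the bug
		log.warn("Index was reopened but is still not up-to-date (maybe a bug in Lucene, we try to investigate this). Doing a hard reopen.");
		// workaround: open a completely new reader and swap it in
		n=IndexReader.open(....);
		try {
			indexReader.close();
		} finally {
			indexReader=n;
		}
	}
}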
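
The same pattern can be written more compactly, since both branches end with the same close-and-swap step. The following is only a sketch: Reader is a hypothetical stand-in interface for the three IndexReader calls used above (it leaves out the checked exceptions the real methods declare), and ReaderRefresher and hardOpen are made-up names, with hardOpen taking the place of IndexReader.open(....).

```java
import java.util.function.Supplier;

final class ReaderRefresher {
    /** Hypothetical stand-in for the IndexReader calls used in the snippet above. */
    interface Reader {
        boolean isCurrent();
        Reader reopen();   // may return this instance
        void close();
    }

    private Reader current;
    private final Supplier<Reader> hardOpen; // stand-in for IndexReader.open(....)

    ReaderRefresher(Reader initial, Supplier<Reader> hardOpen) {
        this.current = initial;
        this.hardOpen = hardOpen;
    }

    synchronized Reader get() {
        return current;
    }

    /** Intended to be called by the periodic task. */
    synchronized void refresh() {
        if (current.isCurrent()) {
            return; // nothing to do
        }
        Reader old = current;
        Reader next = old.reopen();
        if (next == old) {
            // reopen() handed back the same instance although isCurrent()
            // was false: fall back to opening a fresh reader.
            next = hardOpen.get();
        }
        try {
            old.close();
        } finally {
            current = next; // swap even if close() throws
        }
    }
}
```

The synchronized methods only protect the swap itself. Searches that are still running on the old reader when it is closed are not handled here, just as in the snippet above.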


> IndexReader.isIndexCurrent()==false -> IndexReader.reopen() -> still index not current
> --------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1385
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1385
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.3.2
>         Environment: Linux, Solaris, Windows XP
>            Reporter: Uwe Schindler
>
> I found a strange error occurring with IndexReader.reopen. It is not always reproducible; it only happens sometimes, but strangely on all my computers with different platforms at the same time. Maybe it has something to do with the timestamp used in index versions.
> I have a search server using an IndexReader that is opened at webapp startup and should stay open. Every half hour this web application checks whether the index is still current using IndexReader.isCurrent(). When a parallel job (in another virtual machine) indexes documents and modifies the indexes, isCurrent() returns false. The half-hourly cron job then uses IndexReader.reopen() to reopen the index. But sometimes, directly after reopen(), the index is still not current (and no updates occur). Calling reopen again does not change that either. Searching the index shows all new/updated documents, but isCurrent() still returns false. The problem with this is that the index is now reopened all the time, because the detection of a current index no longer works.
> I now have a workaround in my code to handle this: after calling IndexReader.reopen(), I test IndexReader.isCurrent(), and if it returns false, I close the reader hard and open a new instance.
> Most of the time IndexReader.reopen works correctly, but sometimes this error occurs. Looking into the code of reopen(), I noticed that there is an extra check for whether the index has modifications; if so, the reopen call returns the original reader (this may be the problem I have). But the IndexReader is only used for searching; no updates occur.
> My questions: Why is there this check for modifications in reopen()? Why does this happen only at certain times on all my servers with different platforms?
> I want to use reopen because in the future, when the new FieldCache is reopen-aware and does not rebuild the full cache every time, it will be very important to have this fixed. At the moment, I have no problem with reopen occasionally failing and having to do a hard reopen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

