lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1385) IndexReader.isIndexCurrent()==false -> IndexReader.reopen() -> still index not current
Date Sun, 21 Sep 2008 19:09:44 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633089#action_12633089 ]

Michael McCandless commented on LUCENE-1385:
--------------------------------------------


OK I think I found the bug.

From those prints above I can see your current IndexReader was opened
when the index had a single segment (so, it's a SegmentReader).  And,
the changed index also has a single segment by the same name... so we
call SegmentReader.reopenSegment to do the reopening, which has logic
to return itself if it detects no changes (to norms or deletions).
You are somehow hitting that logic.

The bug seems to boil down to, somehow, IndexWriter is writing a new
segments_N file for a single-segment index yet no actual changes were
made to the segment.

The bug is rather harmless: the reopen call does no real work (just
returns your current IndexReader instance), and, it's doing that
because there were in fact no actual changes to the index, just
somehow a new segments_N file was written.
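
As a rough illustration of the mechanism described above, here is a toy
model in Java. ToyIndex and ToySegmentReader are hypothetical stand-ins,
not Lucene's actual code: segmentsGeneration plays the role of the N in
segments_N, deletionsGeneration stands for "the segment really changed"
(norms are left out for brevity), and isCurrent() is assumed to compare
only the segments file generation.

```java
// Toy model only: hypothetical classes, not Lucene's real implementation.
final class ToyIndex {
    long segmentsGeneration = 1;   // stands in for the N in segments_N
    long deletionsGeneration = 0;  // bumps only when a delete removes a document

    /** A commit that changed nothing but still writes a new segments_N file. */
    void commitNoOp() {
        segmentsGeneration++;
    }

    /** A commit whose deletes actually matched documents. */
    void commitWithDeletes() {
        segmentsGeneration++;
        deletionsGeneration++;
    }
}

final class ToySegmentReader {
    private final ToyIndex index;
    private final long openedSegmentsGeneration;
    private final long openedDeletionsGeneration;

    ToySegmentReader(ToyIndex index) {
        this.index = index;
        this.openedSegmentsGeneration = index.segmentsGeneration;
        this.openedDeletionsGeneration = index.deletionsGeneration;
    }

    /** Assumed here: "current" means no newer segments file exists. */
    boolean isCurrent() {
        return openedSegmentsGeneration == index.segmentsGeneration;
    }

    /** Returns this reader if the segment content is unchanged, else a fresh reader. */
    ToySegmentReader reopen() {
        if (openedDeletionsGeneration == index.deletionsGeneration) {
            return this; // no real change detected, so keep the old instance
        }
        return new ToySegmentReader(index);
    }
}

final class ReopenNoOpDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToySegmentReader reader = new ToySegmentReader(index);

        index.commitNoOp(); // e.g. a delete that matched no documents
        ToySegmentReader reopened = reader.reopen();

        System.out.println("same instance after reopen: " + (reopened == reader));
        System.out.println("current after reopen: " + reopened.isCurrent());
    }
}
```

In this model the reader stays "not current" after every reopen() until a
commit that really changes the segment comes along, which matches the
symptom in the report.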

I found one case where IndexWriter can do this, which is if you open
the writer, call deleteDocuments but no docs actually match the Term,
then close the writer.

Is it possible that your indexing job wakes up and only makes calls
to deleteDocuments, yet no documents match the deleted terms?  If
not... can you capture the details of exactly what your indexing job
did just before you hit the reopen failure?  It could be another
"no-op" action in IndexWriter that then writes a segments_N file.


> IndexReader.isIndexCurrent()==false -> IndexReader.reopen() -> still index not current
> --------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1385
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1385
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.3.2
>         Environment: Linux, Solaris, Windows XP
>            Reporter: Uwe Schindler
>         Attachments: LUCENE-1385.patch
>
>
> I found a strange error occurring with IndexReader.reopen. It is not always reproducible; it only happens sometimes, but strangely on all my computers with different platforms at the same time. Maybe it has something to do with the timestamp used in index versions.
> I have a search server using an IndexReader that is opened at webapp startup and should stay open. Every half an hour this web application checks whether the index is still current using IndexReader.isCurrent(). When a parallel job (in another virtual machine) indexes documents and modifies the index, isCurrent() returns false. The half-hourly cron job then uses IndexReader.reopen() to reopen the index. But sometimes, directly after reopen(), the index is still not current (and no updates occur). Calling reopen() again does not change this either. Searching on the index shows all new/updated documents, but isCurrent() still returns false. The problem with this is that the index is now reopened all the time, because the detection of a current index no longer works.
> I now have a workaround in my code to handle this: after calling IndexReader.reopen(), I test IndexReader.isCurrent(), and if it returns false, I close the reader hard and open a new instance.
> Most of the time IndexReader.reopen() works correctly, but sometimes this error occurs. Looking into the code of reopen(), I realized that there is an extra check whether the index has modifications, and if so the reopen call returns the original reader (this may be the problem I have). But the IndexReader is only used for searching; no updates occur.
> My questions: Why is there this check for modifications in reopen()? Why does this happen only at certain times on all my servers with different platforms?
> I want to use reopen because in the future, when the new FieldCache is reopen-aware and no longer rebuilds the full cache every time, it will be very important to have this fixed. At the moment, I have no problem with the case where reopen may fail and I have to do a rough reopen.
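
The workaround described in the report above (reopen, check isCurrent()
again, then fall back to a hard close and a fresh open) could be sketched
roughly as follows. RefreshableReader is a hypothetical minimal interface
that only mirrors the three calls named in the report; it is not Lucene's
IndexReader, and the openFresh supplier is likewise an assumption standing
in for however the application opens a new reader.

```java
import java.util.function.Supplier;

// Hypothetical stand-in mirroring isCurrent(), reopen() and close() from the report.
interface RefreshableReader {
    boolean isCurrent();
    RefreshableReader reopen();
    void close();
}

final class ReaderRefresher {
    private ReaderRefresher() {}

    /**
     * Returns a reader that reports itself as current, preferring the cheap
     * reopen() path and falling back to a full close-and-open.
     */
    static RefreshableReader refresh(RefreshableReader current,
                                     Supplier<RefreshableReader> openFresh) {
        if (current.isCurrent()) {
            return current; // nothing to do
        }

        RefreshableReader reopened = current.reopen();
        if (reopened != current) {
            current.close(); // reopen() handed back a new instance; release the old one
            current = reopened;
        }
        if (current.isCurrent()) {
            return current;
        }

        // reopen() did not help: close hard and open a new instance.
        current.close();
        return openFresh.get();
    }
}
```

The half-hourly job would call refresh(...) and swap in whatever reader it
returns. The fallback gives up the benefit of reopen() for that cycle, so
it is a stopgap rather than a fix.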

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

