lucene-dev mailing list archives

From "l0co (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-3422) IndexWriter.optimize() throws FileNotFoundException and IOException
Date Sun, 12 May 2013 22:09:15 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655651#comment-13655651 ]

l0co edited comment on LUCENE-3422 at 5/12/13 10:09 PM:
--------------------------------------------------------

Sure, this can happen for other reasons (might it be a bug in Hibernate Search?). My setup
works like this:

 1. I have an IndexWriter writing to the index in Hibernate Search exclusive mode (it creates
a Workspace with a single IndexWriter that is not closed and re-created on each use but kept
open), with a RAM flush threshold of 2 MB.
 2. The IndexWriter uses the concurrent merge scheduler by default.
 3. I write to the index through the application UI and watch the index directory in another
window.
 4. After each write (entity save) a new batch of XXX.* files is created (_13.*, _14.*, etc.).
 5. After some time these files disappear from the directory. For example, I have the
13,14,15,16,17,18,19 files, and after the merge (?) process only 18,19 remain.
 6. The effect in step 5 happens while the IndexWriter is in use, when I save an entity to the
database again.
 7. Sometimes in this scenario I get a FileNotFoundException (FNFE).
 8. I caught the error with a breakpoint and saw that at the time of the FNFE the IndexWriter's
segmentInfos referred to files that had already disappeared from the index directory during the
current use of the writer (i.e. the directory contains only the 18,19 files but segmentInfos
lists all of 13,14,15,16,17,18,19).
 9. So I suppose that by the time the writer is invoked the merge thread has removed these
files, but the other, concurrent write thread still sees them.
 10. This has not happened (so far) since I switched to the serial merge scheduler.
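
The mechanism suspected in steps 8 and 9 can be sketched in plain Java. This is only an illustration of a stale segment list pointing at deleted files; it does not use Lucene and does not claim to show what IndexWriter does internally. The class name, the method names, and the use of ".cfs" files in a temporary directory are assumptions made for the example.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class StaleSegmentListDemo {

    /** Tries to open each listed segment file; returns the names that fail with FNFE. */
    public static List<String> unreadable(Path dir, List<String> segmentNames) {
        List<String> failed = new ArrayList<>();
        for (String name : segmentNames) {
            File file = dir.resolve(name + ".cfs").toFile();
            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                // The file exists and could be opened; nothing else to do.
            } catch (FileNotFoundException e) {
                failed.add(name);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return failed;
    }

    /** Replays the observed sequence: segments 13..19 exist, a "merge" removes 13..17. */
    public static List<String> simulate() {
        try {
            Path dir = Files.createTempDirectory("stale-segments");
            List<String> live = new ArrayList<>();
            for (int i = 13; i <= 19; i++) {
                String name = "_" + i;
                Files.createFile(dir.resolve(name + ".cfs"));
                live.add(name);
            }

            // Hypothetical writer-side view, captured before the merge finishes.
            List<String> staleView = new ArrayList<>(live);

            // Hypothetical merge: the old segment files are removed from disk.
            for (int i = 13; i <= 17; i++) {
                String name = "_" + i;
                Files.delete(dir.resolve(name + ".cfs"));
                live.remove(name);
            }

            // The stale view still lists the removed segments.
            List<String> failed = unreadable(dir, staleView);

            // Clean up the temporary files that are still present.
            for (String name : live) {
                Files.delete(dir.resolve(name + ".cfs"));
            }
            Files.delete(dir);
            return failed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Listed but missing on disk: " + simulate());
    }
}
```

Only the 18 and 19 files survive the simulated merge, so every older name in the stale view fails to open, which matches what I saw under the breakpoint.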
                
> IndexWriter.optimize() throws FileNotFoundException and IOException
> -------------------------------------------------------------------
>
>                 Key: LUCENE-3422
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3422
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Elizabeth Nisha
>
> I am using the Lucene 3.0.2 search APIs in my application.
> The indexed data is about 350 MB and indexing takes 25 hours. Search indexing and
> optimization run in two different threads. Optimization runs every hour; it does not
> run while indexing is in progress, and vice versa. While optimization is running via
> IndexWriter.optimize(), FileNotFoundException and IOException appear in my log and the
> index gets corrupted. The log says:
> 1. java.io.IOException: No sub-file with id _5r8.fdt found
> [The file name in this message changes over time (_5r8.fdt, _6fa.fdt, _6uh.fdt, ..., _emv.fdt)]
> 2. java.io.FileNotFoundException: /local/groups/necim/index_5.3/index/_bdx.cfs (No such file or directory)
> 3. java.io.FileNotFoundException: /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
> 	Stack trace: java.io.IOException: background merge hit exception: _hkp:c100->_hkp _hkq:c100->_hkp _hkr:c100->_hkr _hks:c100->_hkr _hxb:c5500 _hx5:c1000 _hxc:c198
> 84 into _hxd [optimize] [mergeDocStores]
>        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2359)
>        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2298)
>        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2268)
>        at com.telelogic.cs.search.SearchIndex.doOptimize(SearchIndex.java:130)
>        at com.telelogic.cs.search.SearchIndexerThread$1.run(SearchIndexerThread.java:337)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.FileNotFoundException: /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:76)
>        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:97)
>        at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:87)
>        at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
>        at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:67)
>        at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:114)
>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
>        at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:616)
>        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4309)
>        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3965)
>        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:231)
>        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:288)
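
In the trace above, the IOException thrown from optimize() wraps the FileNotFoundException raised in the merge thread. A minimal sketch of how calling code might pull the missing-file error out of such a wrapper is shown here, using only java.io. The class and method names are hypothetical, and the exception in main is built by hand to mirror the trace rather than produced by Lucene.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class MergeFailureCause {

    /** Walks the cause chain and returns the first FileNotFoundException, or null if none. */
    public static FileNotFoundException findMissingFile(Throwable error) {
        for (Throwable cur = error; cur != null; cur = cur.getCause()) {
            if (cur instanceof FileNotFoundException) {
                return (FileNotFoundException) cur;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the wrapped exception seen in the stack trace.
        IOException wrapped = new IOException(
                "background merge hit exception",
                new FileNotFoundException(
                        "/local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)"));

        FileNotFoundException missing = findMissingFile(wrapped);
        System.out.println(missing == null ? "no missing file in cause chain" : missing.getMessage());
    }
}
```

Logging the inner message this way would make it easier to see which segment file went missing on each failed optimize() call.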



