lucene-dev mailing list archives

From "Andy Hind (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-415) Merge error during add to index (IndexOutOfBoundsException)
Date Thu, 17 Nov 2005 12:40:41 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-415?page=comments#action_12357882 ] 

Andy Hind commented on LUCENE-415:
----------------------------------

And I can reproduce it on 1.4.3.

When FSDirectory.createFile creates an FSOutputStream, the random access file may already exist
and contain data. The existing content is not cleaned out.

So if a merge is writing a new segment to this file and the machine crashes or the app is
terminated, you can end up with a partial or full segment file that the segment infos know
nothing about. If you restart, any merge will try to reuse the same file name, along with
the content the file already contains.
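
The effect described above can be shown without Lucene. This is a minimal sketch using plain
java.io, assuming the output stream sits on a RandomAccessFile opened in "rw" mode as described;
the class and method names (StaleSegmentDemo, writeFromStart, lengthAfterReuse) are made up for
illustration and are not Lucene code. It uses modern Java syntax rather than what 1.4.3 was
built with.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Illustrative only: not part of Lucene.
public class StaleSegmentDemo {

    // Writes data from offset 0. Opening an existing file in "rw" mode
    // keeps its current content; nothing is truncated.
    static void writeFromStart(File file, byte[] data) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates an interrupted write of staleBytes followed by a later
    // write of newBytes to the same file name; returns the final length.
    static long lengthAfterReuse(int staleBytes, int newBytes) {
        try {
            File file = File.createTempFile("segment", ".tmp");
            try {
                writeFromStart(file, new byte[staleBytes]); // left behind by the "crash"
                writeFromStart(file, new byte[newBytes]);   // the file name is reused
                return file.length();
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The stale tail survives, so the file is longer than the new data.
        System.out.println(lengthAfterReuse(1024, 100)); // prints 1024
    }
}
```

Anything past the newly written bytes is leftover data from the earlier write, which a later
reader of the segment would treat as valid.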

To reproduce the issue I created the next segment file by copying one that already exists,
and bang, it fails on the next merge.

I suggest that FSOutputStream set the file length to 0 on initialisation (as well as opening
the channel to the file, which can also produce some nasty deferred IO errors, on Windows XP
at least).

I am not sure of any side effect of this but will test it.
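
A rough sketch of the suggested change, under the same assumption that the stream wraps a
RandomAccessFile. TruncatingOutput is a hypothetical stand-in, not the real FSOutputStream,
and the helper lengthAfterReuse exists only to exercise it. It is written in modern Java.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical stand-in for FSOutputStream; not the Lucene class.
public class TruncatingOutput implements AutoCloseable {
    private final RandomAccessFile file;

    public TruncatingOutput(File path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        try {
            // Discard anything left over from an earlier, interrupted write.
            file.setLength(0);
        } catch (IOException e) {
            file.close();
            throw e;
        }
    }

    public void writeBytes(byte[] b) throws IOException {
        file.write(b);
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    // Leaves staleBytes in a temp file, then writes newBytes through
    // TruncatingOutput; returns the final file length.
    static long lengthAfterReuse(int staleBytes, int newBytes) {
        try {
            File path = File.createTempFile("segment", ".tmp");
            try {
                try (RandomAccessFile stale = new RandomAccessFile(path, "rw")) {
                    stale.write(new byte[staleBytes]);
                }
                try (TruncatingOutput out = new TruncatingOutput(path)) {
                    out.writeBytes(new byte[newBytes]);
                }
                return path.length();
            } finally {
                path.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Only the newly written bytes remain.
        System.out.println(lengthAfterReuse(1024, 100)); // prints 100
    }
}
```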

We are seeing this 2-3 times a day, either under heavy load or when running a single thread
and killing the app at random, which may be in the process of a segment write.


> Merge error during add to index (IndexOutOfBoundsException)
> -----------------------------------------------------------
>
>          Key: LUCENE-415
>          URL: http://issues.apache.org/jira/browse/LUCENE-415
>      Project: Lucene - Java
>         Type: Bug
>   Components: Index
>     Versions: 1.4
>  Environment: Operating System: Linux
> Platform: Other
>     Reporter: Daniel Quaroni
>     Assignee: Lucene Developers

>
> I've been batch-building indexes, and I've built a couple hundred indexes with 
> a total of around 150 million records.  This only happened once, so it's 
> probably impossible to reproduce, but anyway... I was building an index with 
> around 9.6 million records, and towards the end I got this:
> java.lang.IndexOutOfBoundsException: Index: 54, Size: 24
>         at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>         at java.util.ArrayList.get(ArrayList.java:322)
>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:155)
>         at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:151)
>         at org.apache.lucene.index.SegmentTermEnum.readTerm(SegmentTermEnum.java:149)
>         at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:115)
>         at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:52)
>         at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:294)
>         at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:254)
>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:93)
>         at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:487)
>         at org.apache.lucene.index.IndexWriter.maybeMergeSegments(IndexWriter.java:458)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:310)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:294)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

