Date: Thu, 17 Nov 2005 13:40:41 +0100 (CET)
From: "Andy Hind (JIRA)"
To: java-dev@lucene.apache.org
Reply-To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-415) Merge error during add to index (IndexOutOfBoundsException)
Message-ID: <1718481938.1132231241673.JavaMail.jira@ajax.apache.org>

[ http://issues.apache.org/jira/browse/LUCENE-415?page=comments#action_12357882 ]

Andy Hind commented on LUCENE-415:
----------------------------------

And I can reproduce it on 1.4.3.

When FSDirectory.createFile creates an FSOutputStream, the random access
file may already exist and contain data. The content is not cleaned out. So if
a merge to a new segment has written data to this file and the machine crashes
or the app is terminated, you can end up with a partial or full segment file
that the segment infos knows nothing about. If you restart, any merge will try
to reuse the same file name, and the content it contains.

To reproduce the issue I created the next segment file by copying one that
already exists, and bang... it fails on the next merge.

I suggest that FSOutputStream sets the file length to 0 on initialisation (as
well as opening the channel to the file, which can also produce some nasty
deferred IO errors, in Windows XP at least). I am not sure of any side effects
of this but will test it.

We are seeing this 2-3 times a day under heavy load, or with a single thread
when killing the app at random, which may be in the process of a segment
write...

> Merge error during add to index (IndexOutOfBoundsException)
> -----------------------------------------------------------
>
>          Key: LUCENE-415
>          URL: http://issues.apache.org/jira/browse/LUCENE-415
>      Project: Lucene - Java
>         Type: Bug
>   Components: Index
>     Versions: 1.4
>  Environment: Operating System: Linux
> Platform: Other
>     Reporter: Daniel Quaroni
>     Assignee: Lucene Developers
>
> I've been batch-building indexes, and I've built a couple hundred indexes with
> a total of around 150 million records. This only happened once, so it's
> probably impossible to reproduce, but anyway...
> I was building an index with around 9.6 million records, and towards the end
> I got this:
> java.lang.IndexOutOfBoundsException: Index: 54, Size: 24
>     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>     at java.util.ArrayList.get(ArrayList.java:322)
>     at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:155)
>     at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:151)
>     at org.apache.lucene.index.SegmentTermEnum.readTerm(SegmentTermEnum.java:149)
>     at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:115)
>     at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:52)
>     at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:294)
>     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:254)
>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:93)
>     at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:487)
>     at org.apache.lucene.index.IndexWriter.maybeMergeSegments(IndexWriter.java:458)
>     at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:310)
>     at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:294)

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
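
Illustration of the suggestion in the comment above (set the file length to 0
when the output stream is opened). This is only a minimal sketch using plain
java.io.RandomAccessFile, not Lucene's actual FSOutputStream code. The class
and method names (TruncateOnOpenSketch, openTruncated, openWithoutTruncation)
are hypothetical, and it uses try-with-resources, which is newer Java syntax
than Lucene 1.4.3 targeted. It shows that reopening a leftover file without
truncation keeps the stale tail, while calling setLength(0) first discards it.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch, not Lucene code: compares opening a leftover file
// with and without truncating it first.
class TruncateOnOpenSketch {

    /** Opens the file for writing and discards any bytes left by an earlier run. */
    public static RandomAccessFile openTruncated(File file) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            raf.setLength(0); // drop stale content, e.g. from an interrupted merge
        } catch (IOException e) {
            raf.close();      // do not leak the handle if truncation fails
            throw e;
        }
        return raf;
    }

    /** Opens the file as-is, which is the behaviour the comment describes. */
    public static RandomAccessFile openWithoutTruncation(File file) throws IOException {
        return new RandomAccessFile(file, "rw");
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".tmp");
        f.deleteOnExit();

        // Simulate a merge that was killed after writing 100 bytes.
        try (RandomAccessFile stale = new RandomAccessFile(f, "rw")) {
            stale.write(new byte[100]);
        }

        // A new writer reuses the name and writes only 10 bytes.
        try (RandomAccessFile raf = openWithoutTruncation(f)) {
            raf.write(new byte[10]);
            System.out.println("without truncation: " + raf.length() + " bytes");
        }

        // Same again, but truncating on open.
        try (RandomAccessFile raf = openTruncated(f)) {
            raf.write(new byte[10]);
            System.out.println("with truncation: " + raf.length() + " bytes");
        }
    }
}
```

Without truncation the file is still 100 bytes long after the 10-byte write, so
a reader would see 90 bytes of old data after the new content. Whether
truncating on open has side effects inside Lucene is the open question raised
in the comment, and this sketch does not answer it.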