From: "Steven Bethard (JIRA)"
To: dev@lucene.apache.org
Date: Fri, 30 Apr 2010 03:25:56 -0400 (EDT)
Subject: [jira] Commented: (LUCENE-2420) "fdx size mismatch" overflow causes RuntimeException

[ https://issues.apache.org/jira/browse/LUCENE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862571#action_12862571 ]

Steven Bethard commented on LUCENE-2420:
----------------------------------------

Thanks, yeah,
something more explicit and easier to find with Google would be good.

Yeah, I mean that code should throw the exception. Actually, anywhere that checks the value of numDocsInStore could/should throw the exception, e.g.:

    synchronized public void flush(SegmentWriteState state) throws IOException {
      if (state.numDocsInStore > 0) {

    synchronized public void closeDocStore(SegmentWriteState state) throws IOException {
      final int inc = state.numDocsInStore - lastDocID;
      if (inc > 0) {

I don't know if SegmentWriteState.numDocsInStore is used anywhere else (I haven't loaded it into an IDE to look at files other than StoredFieldsWriter), but in at least these two cases it would be easy to throw an exception explaining that the maximum number of documents has been exceeded.

Alternatively, you could try to fix the code to work correctly after integer overflow (to support the transaction use case you described above), though it's less obvious to me how to do that correctly everywhere. It probably involves changing some "> 0"s into "!= 0"s and being careful in a few other ways.
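For illustration, here is a minimal sketch of what such a check might look like. This is hypothetical code, not actual Lucene source: the class and method names are invented, and it assumes only that the doc counter is a Java int that has already wrapped negative by the time it is inspected.

```java
// Hypothetical sketch, not actual Lucene code: an explicit limit check at
// the places that read numDocsInStore, turning a confusing "fdx size
// mismatch" into a descriptive error. Names here are invented.
public class DocLimitCheck {

    // Lucene counts documents in an int, so the hard ceiling is
    // Integer.MAX_VALUE = 2^31 - 1 = 2147483647 documents.
    public static void checkDocCount(int numDocsInStore) {
        // Once the counter overflows it goes negative, so a simple sign
        // check detects that the limit was exceeded at some earlier point.
        if (numDocsInStore < 0) {
            throw new IllegalStateException(
                "this index has exceeded the maximum of " + Integer.MAX_VALUE
                + " documents (doc counter overflowed to " + numDocsInStore + ")");
        }
    }

    public static void main(String[] args) {
        checkDocCount(42);  // a sane counter passes silently
        try {
            // In Java, Integer.MAX_VALUE + 1 silently wraps to Integer.MIN_VALUE
            checkDocCount(Integer.MAX_VALUE + 1);
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

A sign check like this only reports the problem after the fact; catching it at the moment the limit is first exceeded (per the suggestion in the issue) would mean checking before each increment in addDocument instead.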
> "fdx size mismatch" overflow causes RuntimeException
> ----------------------------------------------------
>
>                 Key: LUCENE-2420
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2420
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 3.0.1
>        Environment: CentOS 5.4
>           Reporter: Steven Bethard
>
> I just saw the following error:
>
> java.lang.RuntimeException: after flush: fdx size mismatch: -512764976 docs vs 30257618564 length in bytes of _0.fdx file exists?=true
>         at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:97)
>         at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:51)
>         at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:371)
>         at org.apache.lucene.index.IndexWriter.flushDocStores(IndexWriter.java:1724)
>         at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3565)
>         at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3491)
>         at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3482)
>         at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1658)
>         at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1621)
>         at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1585)
>
> Note the negative SegmentWriteState.numDocsInStore. I assume this is because Lucene has a limit of 2^31 - 1 = 2147483647 (sizeof(int)) documents per index, though I couldn't find this documented clearly anywhere. It would have been nice to get this error earlier, back when I exceeded the limit, rather than now, after a bunch of indexing that was apparently doomed to fail.
>
> Hence, two suggestions:
> * State clearly somewhere that the maximum number of documents in a Lucene index is sizeof(int).
> * Throw an exception when an IndexWriter first exceeds this number rather than only on close.

--
This message is automatically generated by JIRA.
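The wrapped counter in the report above can be reconstructed from the reported file length. As an illustration (the only assumption is that the .fdx stored-fields index works out to a 4-byte header plus 8 bytes per document, which the reported length fits exactly):

```java
// Back-of-the-envelope reconstruction of the numbers in the report above.
// Assumption: the .fdx length works out to 4 + docCount * 8 bytes.
public class FdxOverflowDemo {
    public static void main(String[] args) {
        long fdxLength = 30257618564L;            // length reported for _0.fdx
        long trueDocCount = (fdxLength - 4) / 8;  // 3782202320 docs on disk
        // The in-memory counter is a Java int, so it wrapped past 2^31 - 1:
        int wrappedCounter = (int) trueDocCount;
        System.out.println("docs on disk:    " + trueDocCount);   // 3782202320
        System.out.println("wrapped counter: " + wrappedCounter); // -512764976
    }
}
```

The cast reproduces the -512764976 from the exception message exactly, which supports the overflow explanation: the index had really accumulated about 3.78 billion documents, well past the 2147483647 limit.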
- You can reply to this email to add a comment to the issue online.