From "Steven Bethard (JIRA)" <>
Subject [jira] Created: (LUCENE-2420) "fdx size mismatch" overflow causes RuntimeException
Date Thu, 29 Apr 2010 01:36:52 GMT
"fdx size mismatch" overflow causes RuntimeException

                 Key: LUCENE-2420
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 3.0.1
         Environment: CentOS 5.4
            Reporter: Steven Bethard

I just saw the following error:

java.lang.RuntimeException: after flush: fdx size mismatch: -512764976 docs vs 30257618564 length in bytes of _0.fdx file exists?=true
        at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(
        at org.apache.lucene.index.DocFieldProcessor.closeDocStore(
        at org.apache.lucene.index.DocumentsWriter.closeDocStore(
        at org.apache.lucene.index.IndexWriter.flushDocStores(
        at org.apache.lucene.index.IndexWriter.doFlushInternal(
        at org.apache.lucene.index.IndexWriter.doFlush(
        at org.apache.lucene.index.IndexWriter.flush(
        at org.apache.lucene.index.IndexWriter.closeInternal(
        at org.apache.lucene.index.IndexWriter.close(
        at org.apache.lucene.index.IndexWriter.close(

Note the negative SegmentWriteState.numDocsInStore. I assume this is because Lucene has a
limit of 2^31 - 1 = 2147483647 (Integer.MAX_VALUE) documents per index, though I couldn't find
this documented clearly anywhere. It would have been nice to get this error earlier, back
when I exceeded the limit, rather than now, after a bunch of indexing that was apparently
doomed to fail.
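
The numbers in the error message appear consistent with a 32-bit overflow. The sketch below is a rough check, not Lucene code: it assumes the .fdx file consists of a 4-byte header followed by one 8-byte pointer per document (an assumption about the 3.0.x file layout), and the class and method names are hypothetical. Under that assumption the file length implies about 3.78 billion documents, and narrowing that count to an int gives exactly the negative value reported above.

```java
// Hypothetical helper, not part of Lucene. Checks whether the reported
// negative doc count is what a 32-bit counter would show after wrapping.
class FdxOverflowCheck {
    // Assumption: .fdx = 4-byte header + one 8-byte pointer per document.
    static final long HEADER_BYTES = 4L;
    static final long BYTES_PER_DOC = 8L;

    // Number of documents implied by the length of the .fdx file.
    static long docsFromFdxLength(long fdxLengthBytes) {
        return (fdxLengthBytes - HEADER_BYTES) / BYTES_PER_DOC;
    }

    // Narrowing to int keeps only the low 32 bits, so counts above
    // Integer.MAX_VALUE wrap around and can become negative.
    static int asIntCounter(long docCount) {
        return (int) docCount;
    }

    public static void main(String[] args) {
        long fdxLength = 30257618564L; // from the error message
        long impliedDocs = docsFromFdxLength(fdxLength);
        int wrapped = asIntCounter(impliedDocs);

        System.out.println("docs implied by file length: " + impliedDocs); // 3782202320
        System.out.println("same count stored in an int: " + wrapped);     // -512764976
        System.out.println("Integer.MAX_VALUE:           " + Integer.MAX_VALUE);
    }
}
```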

Hence, two suggestions:
* State clearly somewhere that the maximum number of documents in a Lucene index is Integer.MAX_VALUE (2147483647).
* Throw an exception when an IndexWriter first exceeds this number rather than only on close.
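
The second suggestion could look something like the following. This is only an illustration of the fail-fast idea, not a patch against IndexWriter: GuardedDocCounter and its methods are hypothetical names, and where such a check would belong inside Lucene is left open.

```java
// Hypothetical sketch of a document counter that fails as soon as the
// limit is reached, instead of silently wrapping to a negative value.
class GuardedDocCounter {
    private int numDocs;

    GuardedDocCounter(int initialDocs) {
        if (initialDocs < 0) {
            throw new IllegalArgumentException("initialDocs must be >= 0");
        }
        this.numDocs = initialDocs;
    }

    // Called once per added document. Throws before the counter can
    // overflow, so the failure surfaces at the offending add, not at close.
    void addDocument() {
        if (numDocs == Integer.MAX_VALUE) {
            throw new IllegalStateException(
                "too many documents: an index holds at most "
                + Integer.MAX_VALUE + " documents");
        }
        numDocs++;
    }

    int numDocs() {
        return numDocs;
    }

    public static void main(String[] args) {
        // Start just below the limit to avoid looping two billion times.
        GuardedDocCounter counter = new GuardedDocCounter(Integer.MAX_VALUE - 1);
        counter.addDocument(); // reaches Integer.MAX_VALUE, still valid
        try {
            counter.addDocument(); // would overflow, so it throws
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, the indexing run in this report would have failed at the first document past the limit rather than at close.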
