lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2420) "fdx size mismatch" overflow causes RuntimeException
Date Thu, 29 Apr 2010 18:19:53 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862312#action_12862312 ]

Shai Erera commented on LUCENE-2420:
------------------------------------

I remember that the Integer.MAX_VALUE limit is documented somewhere; I can try to look it up later.
But many places in the API use int as the doc ID (IndexReader, ScoreDoc, even IndexWriter.maxDoc()/numDocs()),
so I think there's a strong hint about that limitation.
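For illustration, a minimal sketch against the 3.x API (the index path is a placeholder):

    import java.io.File;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.FSDirectory;

    public class DocIdLimit {
        public static void main(String[] args) throws Exception {
            // "/path/to/index" stands in for an existing index directory.
            IndexReader reader = IndexReader.open(FSDirectory.open(new File("/path/to/index")));
            int maxDoc = reader.maxDoc();        // int: doc IDs top out at Integer.MAX_VALUE
            int numDocs = reader.numDocs();      // int as well
            Document first = reader.document(0); // lookup is by int doc ID
            // ScoreDoc, too, declares the hit's doc ID as an int:
            //   public class ScoreDoc { public int doc; public float score; }
            System.out.println(maxDoc + " / " + numDocs);
            reader.close();
        }
    }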

As for throwing the exception sooner, I don't think that would be correct. IndexWriter implements
transaction semantics: until you call commit() or close(), whatever operations you've made
are not *officially* in the index yet, and if your JVM dies before that, they are lost. Therefore
throwing the exception earlier would be wrong. Also, suppose you intend to index 1000 docs
and delete 100,000. Would you want to get the exception while adding the docs, knowing that
you are about to delete many more soon?
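A minimal sketch of those semantics, using the 3.0-style constructor (the Term and field name are made up):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;

    public class CommitSemantics {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(new RAMDirectory(),
                new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);
            writer.addDocument(new Document());          // buffered; not yet *officially* in the index
            writer.deleteDocuments(new Term("id", "1")); // also just pending
            // A reader opened here still sees the old index state; if the JVM
            // dies at this point, the buffered add and delete are simply lost.
            writer.commit();                             // only now do the changes become durable
            writer.close();
        }
    }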



> "fdx size mismatch" overflow causes RuntimeException
> ----------------------------------------------------
>
>                 Key: LUCENE-2420
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2420
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 3.0.1
>         Environment: CentOS 5.4
>            Reporter: Steven Bethard
>
> I just saw the following error:
> java.lang.RuntimeException: after flush: fdx size mismatch: -512764976 docs vs 30257618564 length in bytes of _0.fdx file exists?=true
>         at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:97)
>         at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:51)
>         at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:371)
>         at org.apache.lucene.index.IndexWriter.flushDocStores(IndexWriter.java:1724)
>         at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3565)
>         at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3491)
>         at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3482)
>         at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1658)
>         at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1621)
>         at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1585)
> Note the negative SegmentWriteState.numDocsInStore. I assume this is because Lucene has
> a limit of 2^31 - 1 = 2147483647 (Integer.MAX_VALUE) documents per index, though I couldn't
> find this documented clearly anywhere. It would have been nice to get this error earlier,
> back when I exceeded the limit, rather than now, after a bunch of indexing that was
> apparently doomed to fail.
> Hence, two suggestions:
> * State clearly somewhere that the maximum number of documents in a Lucene index is Integer.MAX_VALUE.
> * Throw an exception when an IndexWriter first exceeds this number, rather than only on close.
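For what it's worth, the numbers in the report are consistent with a plain int wraparound. Assuming the 3.x stored-fields index layout (a 4-byte header followed by one 8-byte pointer per document), the reported file length implies about 3.78 billion documents written, which truncates to exactly the negative count in the error message:

    public class FdxOverflowCheck {
        public static void main(String[] args) {
            long fdxLength = 30257618564L;          // reported length of _0.fdx in bytes
            long actualDocs = (fdxLength - 4) / 8;  // 4-byte header + one 8-byte pointer per doc
            System.out.println(actualDocs);         // 3782202320, well past Integer.MAX_VALUE
            System.out.println((int) actualDocs);   // -512764976, the doc count in the error
        }
    }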


