NullPointerException during indexing in DocumentsWriter$ThreadState$FieldData.addPosition
-----------------------------------------------------------------------------------------
Key: LUCENE-1072
URL: https://issues.apache.org/jira/browse/LUCENE-1072
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: 2.3
Environment: Linux CentOS 5 x86_64 running on 2-core Pentium D, Java HotSpot(TM)
64-Bit Server VM (build 1.6.0_01-b06, mixed mode), using lucene-core-2007-11-29_02-49-31
Reporter: Alexei Dets
In my case, documents with unusually large "words" (in fact text-encoded images) sometimes
show up during indexing.
Attempting to add a document that contains a field with such a token throws java.lang.IllegalArgumentException:
java.lang.IllegalArgumentException: term length 37944 exceeds max term length 16383
at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.addPosition(DocumentsWriter.java:1492)
at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.invertField(DocumentsWriter.java:1321)
at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1247)
at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:972)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2202)
at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2186)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1432)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1411)
This is expected; the exception is caught and ignored. The problem is that afterwards the IndexWriter
appears to be left in a corrupted state, and subsequent attempts to add documents to the index
fail as well, this time with an NPE:
java.lang.NullPointerException
at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.addPosition(DocumentsWriter.java:1497)
at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.invertField(DocumentsWriter.java:1321)
at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1247)
at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:972)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2202)
at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2186)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1432)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1411)
This is 100% reproducible.
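
A possible workaround until this is fixed is to strip oversized tokens from the text before it reaches IndexWriter.addDocument. The following is only a minimal sketch in plain Java, not part of Lucene: the class name LongTokenGuard and the method dropLongTokens are hypothetical, tokens are assumed to be whitespace-delimited (a real analyzer may split text differently), and the limit is assumed to be counted in Java chars. The value 16383 is taken from the exception message above.

```java
import java.util.Arrays;
import java.util.StringJoiner;

// Hypothetical helper, not a Lucene class: removes whitespace-delimited
// tokens that are longer than the maximum term length.
class LongTokenGuard {
    // Limit reported in the IllegalArgumentException message above.
    static final int MAX_TERM_LENGTH = 16383;

    // Returns the text with every token longer than maxLength removed.
    // Remaining tokens are joined with single spaces.
    static String dropLongTokens(String text, int maxLength) {
        StringJoiner kept = new StringJoiner(" ");
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && token.length() <= maxLength) {
                kept.add(token);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // Build a 37944-char "word", the size seen in the report.
        char[] big = new char[37944];
        Arrays.fill(big, 'x');
        String text = "small " + new String(big) + " words";

        // The oversized token is dropped before the text would be indexed.
        System.out.println(dropLongTokens(text, MAX_TERM_LENGTH));
    }
}
```

The field value would be passed through dropLongTokens before the document is built, so the oversized term never reaches addPosition. This only avoids the trigger; it does not address the state left behind in IndexWriter after the first exception.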