lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <>
Subject Re: Reuse single document and fields
Date Fri, 01 Feb 2008 19:38:07 GMT

As far as I know, Lucene should accept a field with an empty string  
value -- how did you hit the IllegalArgumentException?


Jay wrote:

> Thanks, Michael, for your quick reply and explanation.
> One related question: is it true that Lucene indexer will reject a  
> field  that has the empty string value? (I saw an  
> IllegalArgumentException).
> Will be nice if lucene just skip such a field silently, esp, for  
> the new 2.3 api.
> Jay
> Michael McCandless wrote:
>> yu wrote:
>>> Hi,
>>> I am trying to use the latest 2.3 API on  Field to improve the  
>>> indexing performance by reusing Documents and  Fields. After  
>>> reading lucene-java wiki and the java doc on Field, I have a  
>>> couple of questions about the comment in Field.setValue(), namely,
>>> "Note that you should only use this method after the Field has  
>>> been consumed (ie, the Document  containing this Field has been  
>>> added to the index)":
>>> 1. Does "consumed" here  include IndexWriter.deleteDocument(...)  
>>> and IndexWriter.updateDocument(...)?
>> Yes.  Actually, just add/updateDocument.
>>> 2.  Why does it require that the  Field has been consumed first  
>>> before it can be modified? For example, I may pre-allocate in a  
>>> constructor  a Document and related fields with some default  
>>> values but do not (or cannot) add the document. Before I add my  
>>> first document, I need to set the fields with valid values.
>>> It looks like the new Lucene core would null some of the fields  
>>> if I do so.  I do not understand the logic behind the requirement.
>> That use case is fine.  You can absolutely change its value before  
>> it's consumed when the un-consumed value doesn't matter to you.   
>> That pre-allocation is a normal & fine use case, and fields should  
>> not be null'd by Lucene.
>> The warning should instead state that "if the field holds a real  
>> value, then make sure it's consumed first before you change it".
>> So, for example, if you add docs to a queue, and then use a thread  
>> pool to pull docs from that queue and call addDocument on them,  
>> you can't re-use the fields in the Document until a thread has  
>> added it to the index.
>> Mike
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message