lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Field Question
Date Mon, 25 Aug 2008 09:22:46 GMT

I think you meant Field.Index.NO and Field.Index.TOKENIZED, for those  
two docs.

The answer is yes -- Lucene considers the field "indexed" if ever any  
doc, even a single doc, had set Index.TOKENIZED or Index.UN_TOKENIZED  
for that field.

However, your document A still will not have been indexed.

Lucene has no global schema; it allows individual docs to have  
whatever fields, with whatever options, they want while indexing.   
However, when documents are indexed together into a segment, some  
properties, eg isIndexed, omitNorms, storeTermVectors, etc, are stored  
globally in the FieldInfos (*.fnm files) and thus must be "merged".   
EG, with omitNorms, only if every doc in the segment called omitNorms  
for field "foo" will norms actually not be stored.

Mike

DimitriD wrote:

>
> John,
>
> Thanks for answering.
> So if I have a document "A "with field "foo" and the field has  
> attribute
> Field.NO and
> I have document "B" with field "foo" and the field is Field.TOKENIZED.
> Will IndexReader.getFieldNames(IndexReader.FieldOption.INDEXED)  
> return field
> "foo"?
>
> Thanks,
>
> Dimitri
>
>
> John Griffin-3 wrote:
>>
>> Dimitri,
>>
>> 	Field.TOKENIZED and Field.NO_NORMs send their field's contents
>> through a tokenizer and make their contents indexed and therefore
>> searchable. FIELD.UN_TOKENIZED does not send its field's contents  
>> through
>> a
>> tokenizer but it still indexes its contents. Only Field.NO does not  
>> index
>> its field's contents.
>> 	
>> 	So, regardless of whether or not a particular document has a field
>> tokenized while another document does not has nothing to do with  
>> whether
>> or
>> not the contents are indexed. As long as it's not Field.NO, they are.
>>
>> John G.
>>
>> -----Original Message-----
>> From: DimitriD [mailto:ddogadin@outstart.com]
>> Sent: Friday, August 22, 2008 9:14 AM
>> To: java-user@lucene.apache.org
>> Subject: Field Question
>>
>>
>> I am new to lucene. Here is my question. The document has fields.  
>> When I
>> add
>> a field to the document I can specify that field is Indexed,  
>> Tokenized,
>> etc.. So the same field can be Tokenized in one document and be
>> not-tokenized in another document. However the is a method
>> IndexReader.getFieldNames(IndexReader.FieldOption.INDEXED) that  
>> returns
>> all
>> index fields in the index. It seems like it assumes that FIELD  
>> should have
>> the same attributes across all documents.
>> Can anyoen explain it?
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Field-Question-tp19108787p19108787.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>
> -- 
> View this message in context: http://www.nabble.com/RE%3A-Field-Question-tp19116394p19136508.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message