lucene-java-user mailing list archives

From Michael Sokolov <soko...@ifactory.com>
Subject Re: storing pre-analyzed fields
Date Wed, 11 Jul 2012 10:38:15 GMT
Uwe  - thank you very much for the thorough explanation!

-Mike

On 7/11/2012 1:14 AM, Uwe Schindler wrote:
> Hi Mike,
>
> The order does not matter in any version of Lucene. You also don't
> need to subclass AbstractField (though you can look at e.g. NumericField as
> an example); it is enough to use new Field(name, TokenStream). If you also
> want to store this field, simply add a stored-only field with the *same* name
> (in addition to the TokenStream one).
>
> In Lucene 4.0 we are moving toward separating the "Document" objects used
> for indexing from those returned by IndexReader/Searcher, because they are
> two different things and the latter only return stored fields. But this
> does not affect anything here.
>
> In all Lucene versions, stored field values and indexed values are
> completely decoupled and do not relate to each other at all. Adding a Field
> as stored+indexed is just a convenience; you can also add it twice (once as
> stored, once as indexed - I prefer to always do this) in any order. The
> resulting index will be identical (don't compare files; there will be
> differences in headers!).
>
> Order matters in one respect: fields with the same name and the same type
> rely on order. Two stored fields with the same name are returned in the same
> order by IndexReader/-Searcher, and two indexed fields with the same name
> produce the same order for e.g. PhraseQuery or SpanQuery only if the Field
> order is predefined. But you can interleave the Field instances of each type
> as you like.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
>> -----Original Message-----
>> From: Michael Sokolov [mailto:sokolov@ifactory.com]
>> Sent: Wednesday, July 11, 2012 2:54 AM
>> To: java-user@lucene.apache.org
>> Subject: storing pre-analyzed fields
>>
>> I have a question about the API for storing and indexing Lucene documents
>> (in 3.x).
>>
>> If I want to index a document by providing a TokenStream, I can do that by
>> calling document.add(field), where field is something I write deriving from
>> AbstractField that returns the TokenStream for tokenStreamValue(), and
>> nothing for stringValue() or readerValue().
>>
>> Now if I also want to store a value for that field, do I just add a different
>> field with different options (e.g. stored=true, and the field a normal Field)?
>>
>> Do these two things conflict in any way?  Do I have to be careful about the
>> order in which I do them?  Or is it just a mildly weird API with no lurking
>> ill effects? :)
>>
>> Also: I have been seeing various e-mails about changes to this API so I
>> assume it's all different in 4.0; if you want to take this opportunity to
>> explain that, please go ahead, but for now I am working with the 3.x API.
>>
>> Thanks
>>
>> -Mike Sokolov
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>
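
The ordering rule in the reply above can be illustrated with a small sketch in plain Java. This is a toy model and not Lucene's API: ToyField, storedValues and indexedTokens are hypothetical names invented here, and a space-separated string stands in for a pre-analyzed TokenStream. It only aims to show that interleaving stored-only and indexed-only instances of the same field name changes nothing, while reordering instances of the same type does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are Lucene classes.
final class FieldOrderSketch {

    // One field instance, either stored-only or indexed-only.
    static final class ToyField {
        final String name;
        final String value;   // stored text, or space-separated pre-analyzed tokens
        final boolean stored; // true = stored-only, false = indexed-only

        ToyField(String name, String value, boolean stored) {
            this.name = name;
            this.value = value;
            this.stored = stored;
        }
    }

    // Stored values of a field name, in the order the instances were added.
    static List<String> storedValues(List<ToyField> doc, String name) {
        List<String> out = new ArrayList<>();
        for (ToyField f : doc) {
            if (f.stored && f.name.equals(name)) {
                out.add(f.value);
            }
        }
        return out;
    }

    // Indexed tokens of a field name; the list index plays the role of token position.
    static List<String> indexedTokens(List<ToyField> doc, String name) {
        List<String> out = new ArrayList<>();
        for (ToyField f : doc) {
            if (!f.stored && f.name.equals(name)) {
                out.addAll(Arrays.asList(f.value.split(" ")));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyField idx1 = new ToyField("body", "quick brown", false);
        ToyField idx2 = new ToyField("body", "fox jumps", false);
        ToyField st1 = new ToyField("body", "Quick brown", true);
        ToyField st2 = new ToyField("body", "fox jumps.", true);

        List<ToyField> grouped = Arrays.asList(idx1, idx2, st1, st2);
        List<ToyField> interleaved = Arrays.asList(st1, idx1, st2, idx2);
        List<ToyField> swapped = Arrays.asList(idx2, idx1, st1, st2);

        // Interleaving the two types leaves both views unchanged.
        System.out.println(indexedTokens(grouped, "body")
                .equals(indexedTokens(interleaved, "body"))); // true
        System.out.println(storedValues(grouped, "body")
                .equals(storedValues(interleaved, "body")));  // true

        // Swapping two indexed instances changes token positions.
        System.out.println(indexedTokens(grouped, "body")
                .equals(indexedTokens(swapped, "body")));     // false
    }
}
```

In Lucene 3.x terms, the indexed-only instances correspond to fields created with new Field(name, TokenStream) as described in the reply, and the stored-only instances to a stored-only field added under the same name.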