lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Lee" <davidt...@gmail.com>
Subject Re: Lucene Index Structure
Date Fri, 22 Aug 2008 00:27:29 GMT
I see, ok. Thanks to both of you!

On Thu, Aug 21, 2008 at 4:51 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> Also, the inverted index *will* store positional information (in the *.prx
> files) even if term vectors are not stored.
>
> Mike
>
>
> Yonik Seeley wrote:
>
>  On Thu, Aug 21, 2008 at 7:20 PM, David Lee <davidtlee@gmail.com> wrote:
>>
>>> Clarification question:
>>>
>>> If I don't store term vectors, then I:
>>> -- won't have information on the position of matching terms
>>> -- I don't have the term frequency vector
>>>
>>> -- but I should still have the frequency of terms per document in the
>>> .frq
>>> file, right?
>>>
>>> So what's the difference between the term frequency vector and the
>>> information saved in the .frq file?
>>>
>>
>> It's how the data can be efficiently accessed... by term or by document.
>> Lucene is naturally an inverted index, and thus makes it easy to ask
>> "what documents contain this term".
>> Term vectors store the term information indexed by document and make
>> it easy to ask "what terms does this specific document have".
>>
>> -Yonik
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
David T. Lee
www.davidtlee.net

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message