lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Omitting TermVector info and index size
Date Wed, 14 Feb 2007 14:41:34 GMT
As Erick said, term positions are kept regardless of whether you store
term vectors. The positional information is needed for phrase queries,
span queries, etc., so you certainly don't lose the ability to use
phrase queries if you do not store term vectors. If you check out the
Posting class in DocumentWriter, you can trace around and see that
positions are stored at the same basic level as the term itself.
Positions are first class citizens <g>.
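
To make the point concrete, here is a toy sketch in plain Java. It is
not Lucene's code or file format; the class ToyPositionalIndex and its
methods are made up for illustration. It only models the idea that each
posting carries the term's positions, so a phrase match can be answered
from the postings alone, with no per-document term vector anywhere.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical names): positions live in the postings.
class ToyPositionalIndex {
    // term -> (docId -> positions of that term within the document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Naive whitespace tokenizer; each token's position is its array index.
    public void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new HashMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Exact phrase match: term i must occur at (start + i) for some start.
    public boolean matchesPhrase(int docId, String... terms) {
        if (terms.length == 0) {
            return false;
        }
        for (int start : positions(terms[0], docId)) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                ok = positions(terms[i], docId).contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    private List<Integer> positions(String term, int docId) {
        return postings
                .getOrDefault(term, Collections.<Integer, List<Integer>>emptyMap())
                .getOrDefault(docId, Collections.<Integer>emptyList());
    }

    public static void main(String[] args) {
        ToyPositionalIndex index = new ToyPositionalIndex();
        index.addDocument(1, "positions are first class citizens");
        System.out.println(index.matchesPhrase(1, "first", "class")); // true
        System.out.println(index.matchesPhrase(1, "class", "first")); // false
    }
}
```

The phrase check never consults anything keyed by document first; it
walks term -> document -> positions, which is the access pattern a
phrase or span query needs.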

- Mark

Erick Erickson wrote:
> Erik Hatcher sez no.
>
> Erick
>
> On 2/14/07, karl wettin <karl.wettin@gmail.com> wrote:
>>
>>
>> On 14 Feb 2007, at 15:03, Erick Erickson wrote:
>>
>> > My reasoning was that I do need position information since I need to
>> > do Span queries, but character information (WITH_OFFSETS) isn't
>> > necessary here/now. So I thought I'd make a small test to see if this
>> > was worth pursuing. If omitting offsets had only saved me 10%, for
>> > instance, I wouldn't pursue it very much. But 75+% is a savings well
>> > worth pursuing.
>>
>> Spans don't rely on the TermPositionVector, do they?
>>
>> I always thought that the data from IndexReader#termPositions came from
>> some other source. Or so.
>>
>> -- 
>> karl
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
