lucene-java-user mailing list archives

From Carsten Schnober <>
Subject Re: Offsets in 3.6/4.0
Date Tue, 17 Jul 2012 13:01:40 GMT
On 16.07.2012 13:07, wrote:

Dear Karsten,

> abstract of your post:
> you need the offset to perform your search/ranking like the position is needed for phrase
> You are using reader.getTermFreqVector to get the offset. 
> This is too slow for your application and you are thinking about switching to version 4.0

Yes, that's about it.

> imho you should use payloads.
> You could also switch to version 4, because in that version you can store the offset of each
> term, like the position in version 3.x.
> But this is basically the same as the use of payloads:
>  *
>  *

I now use payloads and this fulfils my functional requirements. I was
hoping to avoid that because I am also storing other information in the
Payload which makes it feel a bit messy; especially as it seemed
sensible to me to actually make use of the Offsets field as it already
exists. Anyway, the problem is solved so far, thank you very much!
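
For illustration, here is a minimal sketch of how start and end offsets
could share a payload with other data. The class name OffsetPayload and
the byte layout (two 4-byte integers followed by the remaining data) are
assumptions made for this example, not the layout actually used in the
project. It only shows the byte packing; wiring it into a TokenFilter
that sets the payload is left out.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper: packs token offsets in front of other payload data.
// Assumed layout: [4 bytes start offset][4 bytes end offset][extra bytes...]
final class OffsetPayload {

    private static final int HEADER_BYTES = 8;

    // Build the payload bytes for one token.
    static byte[] encode(int startOffset, int endOffset, byte[] extra) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + extra.length);
        buf.putInt(startOffset).putInt(endOffset).put(extra);
        return buf.array();
    }

    // Read the start offset back from a payload produced by encode().
    static int startOffset(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt(0);
    }

    // Read the end offset back from a payload produced by encode().
    static int endOffset(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt(4);
    }

    // Return the remaining bytes that follow the two offsets.
    static byte[] extra(byte[] payload) {
        return Arrays.copyOfRange(payload, HEADER_BYTES, payload.length);
    }

    public static void main(String[] args) {
        byte[] payload = encode(12, 17, new byte[] {1, 2, 3});
        System.out.println(startOffset(payload) + "-" + endOffset(payload)
                + " extra=" + Arrays.toString(extra(payload)));
    }
}
```

Keeping the offsets in a fixed-size header means the other information
can change in length without affecting how the offsets are read back.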

I still wonder what the purpose of the Offset field is as it is so
inefficient to access. It seems like a wasteful redundancy to even store
the Offsets during indexing, considering that I also store them as a
payload. Or am I missing something?


Institut für Deutsche Sprache |
Projekt KorAP                 |
Tel. +49-(0)621-43740789      |
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
