lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Omit positions but not TF
Date Sun, 08 Nov 2009 10:35:38 GMT
+1

I guess we'd add a Fieldable.setOmitPositions?  And then save that in
FieldInfos, and fix the postings writing/reading to respect it?  I.e.,
we can just change the index format.  Encoding as negative numbers
isn't great because the termFreq is written as a vInt, which consumes
5 bytes to encode any negative number.  Wanna cough up a patch?
Probably this should wait until 3.1.
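
To make the vInt point concrete, here is a minimal sketch of the
7-bits-per-byte variable-length encoding, assuming it matches what
IndexOutput.writeVInt does.  The class and method names (VIntSketch,
encodeVInt) are made up for illustration and are not Lucene API.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: a standalone sketch of a vInt-style encoder,
// not code taken from Lucene.
final class VIntSketch {

    // Encodes i using 7 payload bits per byte; the high bit of each byte
    // signals that another byte follows.
    static byte[] encodeVInt(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80); // low 7 bits plus continuation bit
            i >>>= 7;                     // unsigned shift, so the sign bit is treated as data
        }
        out.write(i);                     // final byte, continuation bit clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] samples = {1, 127, 128, -1, Integer.MIN_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> " + encodeVInt(v).length + " byte(s)");
        }
    }
}
```

Any negative int has its top bit set, so all 32 bits are significant and
the loop emits four continuation bytes plus a final one.  A small
positive freq, which is the common case, fits in a single byte.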

Mike

On Sat, Nov 7, 2009 at 7:47 PM, Andrzej Bialecki <ab@getopt.org> wrote:
> Hi,
>
> During one of the discussions at ApacheCon it occurred to me that it would be
> useful to have an option to discard positional information but still keep
> the term frequency. Position-dependent queries wouldn't work then, but all
> other queries would still work fine and we would get the right
> scoring.
>
> I believe it should be possible to do this without changing the file format,
> if we used a negative term frequency for terms without positions - we would
> have to check for that condition in SegmentTermDocs, change the flags there
> and flip the sign of the term frequency. Eventually we may want to add a
> separate flag for this and bump the format version.
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>


