lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: [jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
Date Wed, 30 May 2012 15:48:27 GMT
On Wed, May 30, 2012 at 11:43 AM, Andrzej Bialecki <ab@getopt.org> wrote:
> On 30/05/2012 17:09, Robert Muir (JIRA) wrote:
>>
>> I'm not sure this is true: e.g. if your postings format requires
>> parameters to decode the segment, then this enforces that it records
>> said parameters, e.g. Pulsing records these parameters.
>>
>> Codec parameters are set at index time; at read time it's your
>> responsibility to be able to decode them solely from the index (this
>> enforces that there doesn't need to be a crazy matching of user
>> configuration at write and read time).
>
>
> I think what Mark is missing (and I saw as a limiting factor in developing
> other codecs) is to make it easier to customize Codec-s based on composition
> of reusable blocks, without necessarily needing a separate Codec class
> implementation.
>
> This could be worked around by having a "configurable codec" that stores its
> configuration and instantiates necessary reusable blocks, available using
> the SPI mechanism. On writing you could specify this configuration as Codec
> attributes, and they could be written out e.g. to SegmentInfos, and on read
> they would become available from SegmentInfos.attributes.
>
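To make the "configurable codec" idea above concrete, here is a minimal
sketch in plain Java. It does not use Lucene: ConfigurableCodecSketch,
Block, REGISTRY, the attribute keys "block.name" and "block.param", and
the "Bloom" block with its integer parameter are all hypothetical names
chosen for illustration. The registry stands in for an SPI lookup and the
map stands in for per-segment attributes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; none of these names are Lucene APIs. */
final class ConfigurableCodecSketch {

    /** A reusable block, reduced here to something that can describe itself. */
    interface Block {
        String describe();
    }

    /** Stand-in for an SPI lookup: block name -> factory taking the stored parameter. */
    static final Map<String, Function<Integer, Block>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("Pulsing", cutoff -> () -> "Pulsing(" + cutoff + ")");
        REGISTRY.put("Bloom", bits -> () -> "Bloom(" + bits + ")");
    }

    /** Write side: record the block name and its parameter as segment attributes. */
    static Map<String, String> write(String blockName, int param) {
        if (!REGISTRY.containsKey(blockName)) {
            throw new IllegalArgumentException("unknown block: " + blockName);
        }
        Map<String, String> attributes = new HashMap<>();
        attributes.put("block.name", blockName);
        attributes.put("block.param", Integer.toString(param));
        return attributes;
    }

    /** Read side: rebuild the block from the attributes alone, with no user configuration. */
    static Block read(Map<String, String> attributes) {
        String name = attributes.get("block.name");
        Function<Integer, Block> factory = REGISTRY.get(name);
        if (factory == null) {
            throw new IllegalStateException("unknown block: " + name);
        }
        return factory.apply(Integer.parseInt(attributes.get("block.param")));
    }

    public static void main(String[] args) {
        Map<String, String> attributes = write("Pulsing", 2);
        System.out.println(read(attributes).describe());
    }
}
```

The reader never sees the arguments passed to write(); it only sees what
was stored, which is the property asked for earlier in the thread (decode
solely from the index).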

Well, honestly, I think a bug in PerFieldPostingsFormat (LUCENE-4090)
is definitely confusing the situation here.

You should be able to set Pulsing(1) on the "id" field and Pulsing(2) on
the "date" field and have everything just work, but I broke that. I think
that's what's causing the most grief.
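
As an illustration of the intended behavior only, the sketch below
records a separate Pulsing cutoff per field and reads each one back. It
is plain Java, not Lucene code, and it does not reproduce LUCENE-4090:
PerFieldPulsingSketch and the "pulsing.cutoff." key prefix are
hypothetical. The point is that the stored key includes the field name; a
key made of the format name alone would let the second value overwrite
the first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names are Lucene APIs. */
final class PerFieldPulsingSketch {

    /** Write side: store each field's cutoff under a key that includes the field name. */
    static Map<String, String> write(Map<String, Integer> cutoffByField) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : cutoffByField.entrySet()) {
            attributes.put("pulsing.cutoff." + e.getKey(), e.getValue().toString());
        }
        return attributes;
    }

    /** Read side: recover one field's cutoff from the stored attributes alone. */
    static int cutoffFor(Map<String, String> attributes, String field) {
        String value = attributes.get("pulsing.cutoff." + field);
        if (value == null) {
            throw new IllegalArgumentException("no Pulsing cutoff recorded for field " + field);
        }
        return Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, Integer> config = new LinkedHashMap<>();
        config.put("id", 1);    // Pulsing(1) on "id"
        config.put("date", 2);  // Pulsing(2) on "date"
        Map<String, String> attributes = write(config);
        System.out.println("id -> " + cutoffFor(attributes, "id"));
        System.out.println("date -> " + cutoffFor(attributes, "date"));
    }
}
```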

-- 
lucidimagination.com


