lucene-dev mailing list archives

From Andrzej Bialecki
Subject Re: [jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
Date Wed, 30 May 2012 15:43:47 GMT
On 30/05/2012 17:09, Robert Muir (JIRA) wrote:
> I'm not sure this is true: e.g. if your postings format requires parameters to decode
> the segment, then this enforces that it records said parameters,
> e.g. Pulsing records these parameters.
> Codec parameters are at index-time, at read-time it's your responsibility to be able to
> decode them solely from the index (this enforces that there doesn't need
> to be a crazy matching of user configuration at write and read time).

I think what Mark is missing (and what I saw as a limiting factor in 
developing other codecs) is an easier way to customize Codecs by 
composing reusable blocks, without necessarily needing a separate 
Codec class implementation.

This could be worked around by having a "configurable codec" that stores 
its configuration and instantiates necessary reusable blocks, available 
using the SPI mechanism. On writing you could specify this configuration 
as Codec attributes, and they could be written out e.g. to SegmentInfos, 
and on read they would become available from SegmentInfos.attributes.
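
A rough sketch of that idea, in plain Java with no Lucene dependency. Everything here is hypothetical and for illustration only: the Block interface, the REGISTRY map (a stand-in for SPI lookup), and the "configurable.blocks" attribute key are assumptions, not Lucene classes or keys. The configuration is written as string attributes at index time and the block chain is rebuilt solely from those attributes at read time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these types are Lucene classes.
final class ConfigurableCodecSketch {

    /** Stand-in for a reusable postings building block. */
    interface Block {
        String describe();
    }

    /** Hypothetical attribute key under which the block chain is recorded. */
    static final String BLOCKS_KEY = "configurable.blocks";

    /**
     * Stand-in for SPI lookup: block name -> factory that wraps a delegate.
     * The innermost block ignores its (null) delegate.
     */
    static final Map<String, Function<Block, Block>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("base", delegate -> () -> "base");
        REGISTRY.put("pulsing", delegate -> () -> "pulsing(" + delegate.describe() + ")");
        REGISTRY.put("bloom", delegate -> () -> "bloom(" + delegate.describe() + ")");
    }

    /** Write side: record the chain (outermost first) as plain string attributes. */
    static Map<String, String> toAttributes(String... outerToInner) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put(BLOCKS_KEY, String.join(",", outerToInner));
        return attrs;
    }

    /** Read side: rebuild the chain from the stored attributes alone. */
    static Block fromAttributes(Map<String, String> attrs) {
        String[] names = attrs.get(BLOCKS_KEY).split(",");
        Block current = null;
        // Build from the innermost block outwards.
        for (int i = names.length - 1; i >= 0; i--) {
            Function<Block, Block> factory = REGISTRY.get(names[i]);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown block: " + names[i]);
            }
            current = factory.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = toAttributes("bloom", "pulsing", "base");
        System.out.println(fromAttributes(attrs).describe());
    }
}
```

The reader never sees the writer's user configuration, only the attributes stored with the segment, which is the property Robert describes above.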

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
