lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4558) Make CompressingStoredFieldsFormat more flexible
Date Thu, 15 Nov 2012 01:16:12 GMT


Adrien Grand commented on LUCENE-4558:

bq. so the compression format byte is replaced by string you passed in the codec header...

Right: with this patch, a concrete CompressingStoredFieldsFormat must always use the same
compression format. Compared to trunk, this means that if you want to change the compression
format, you must either create a stored fields format with a different name or bump the version
number. But we are still free to perform modifications that don't change the compression format,
such as modifying the compression algorithm to spend more (less) time compressing in order
to improve the compression ratio (speed).
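
To illustrate the point about names and versions, here is a minimal sketch of a header check, assuming a format identifies itself by a name string plus a version number. FormatHeader and its layout are hypothetical stand-ins written for this example; this is not Lucene's CodecUtil, only the same idea in a self-contained form.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a codec header: [name length][name bytes][version].
final class FormatHeader {
    private FormatHeader() {}

    // Serializes the format name and version into a small header.
    static byte[] write(String formatName, int version) {
        byte[] name = formatName.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + name.length + 4);
        buf.putInt(name.length).put(name).putInt(version);
        return buf.array();
    }

    // Returns the version stored in the header, or throws if the name
    // differs or the version is outside [minVersion, maxVersion].
    static int check(byte[] header, String expectedName, int minVersion, int maxVersion) {
        ByteBuffer buf = ByteBuffer.wrap(header);
        byte[] name = new byte[buf.getInt()];
        buf.get(name);
        String actualName = new String(name, StandardCharsets.UTF_8);
        if (!actualName.equals(expectedName)) {
            throw new IllegalStateException(
                "format mismatch: expected " + expectedName + " but found " + actualName);
        }
        int version = buf.getInt();
        if (version < minVersion || version > maxVersion) {
            throw new IllegalStateException(
                "unsupported version " + version + " for format " + actualName);
        }
        return version;
    }
}
```

With something like this, a reader created for one format name rejects a file written under another name, so changing the compression format means choosing a new name or bumping the version. Tuning how hard the compressor works touches neither.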

bq. But I think it was the right tradeoff to make: compare 4.1's format versus 4.0, it's easier
that they are separate codecs, I think.

Agreed. Having lots of if/then/else would have made the code less readable given how different
these stored fields formats are.

bq. At some point in the future, the Lucene40 and Lucene41 codecs will be too old and we are going
to need hacks to throw IndexFormatTooOldException and so on, so we are already in trouble

Why would we need hacks? Wouldn't it be sufficient to register Lucene40 and Lucene41 with
a codec impl whose *writers would throw UnsupportedOperationException and *readers would throw
IndexFormatTooOldException? (Or would you consider that a hack?)
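
A rough sketch of that idea, assuming a simple name-based registry. Every type here (FieldsFormat, RetiredFieldsFormat, FormatRegistry, FormatTooOldException) is a hypothetical stand-in defined so the example compiles on its own. In particular, FormatTooOldException only mimics Lucene's IndexFormatTooOldException and is unchecked here for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Lucene's IndexFormatTooOldException (unchecked in this sketch).
class FormatTooOldException extends RuntimeException {
    FormatTooOldException(String formatName) {
        super("Format " + formatName + " is too old to be read; please re-index");
    }
}

// Hypothetical, heavily simplified view of a stored fields format.
interface FieldsFormat {
    String name();
    Object fieldsWriter();
    Object fieldsReader();
}

// A retired format: the name stays registered, but nothing can be written or read.
final class RetiredFieldsFormat implements FieldsFormat {
    private final String name;

    RetiredFieldsFormat(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public Object fieldsWriter() {
        throw new UnsupportedOperationException(name + " can no longer be written");
    }

    @Override
    public Object fieldsReader() {
        throw new FormatTooOldException(name);
    }
}

// Hypothetical name-based lookup, loosely modeled on loading a codec by name.
final class FormatRegistry {
    private static final Map<String, FieldsFormat> FORMATS = new HashMap<>();

    private FormatRegistry() {}

    static void register(FieldsFormat format) {
        FORMATS.put(format.name(), format);
    }

    static FieldsFormat forName(String name) {
        FieldsFormat format = FORMATS.get(name);
        if (format == null) {
            throw new IllegalArgumentException("No format registered under name " + name);
        }
        return format;
    }
}
```

Because the old name still resolves, opening an old segment fails with a "too old" error rather than an "unknown codec" error, and the current codecs need no special cases.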

bq. sorry, I haven't fully thought about it and I'm not listing objections, just thinking to
be careful

No problem, I'm glad you did. This is the right time to think about backward compatibility...

> Make CompressingStoredFieldsFormat more flexible
> ------------------------------------------------
>                 Key: LUCENE-4558
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-4558.patch
> My plan consists of making CompressionMode an abstract class instead of an enum and having
> different codec names per CompressionMode. I think this has two main benefits:
>  - it makes Lucene41StoredFieldsFormat cleaner (no need to write a CompressionMode id),
>  - it allows for custom CompressionModes.
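
As a sketch of the plan above, assuming an abstract class that carries its own format name: DemoCompressionMode, DeflateMode and IdentityMode are hypothetical names chosen for this example and are not taken from the attached patch. The DEFLATE calls are the standard java.util.zip ones.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical abstract class replacing an enum: anyone can subclass it.
abstract class DemoCompressionMode {
    // Name recorded in the header; it changes only when the on-disk format changes.
    abstract String formatName();
    abstract byte[] compress(byte[] data);
    abstract byte[] decompress(byte[] data, int originalLength);
}

// DEFLATE-based mode. The level trades speed for ratio without changing the format.
final class DeflateMode extends DemoCompressionMode {
    private final int level;

    DeflateMode(int level) {
        this.level = level;
    }

    @Override
    String formatName() {
        return "DemoStoredFieldsDeflate";
    }

    @Override
    byte[] compress(byte[] data) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    @Override
    byte[] decompress(byte[] data, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            byte[] out = new byte[originalLength];
            int offset = 0;
            while (offset < originalLength && !inflater.finished()) {
                int n = inflater.inflate(out, offset, originalLength - offset);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // input exhausted before the expected length was reached
                }
                offset += n;
            }
            if (offset != originalLength) {
                throw new IllegalStateException("truncated or corrupt input");
            }
            return out;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt input", e);
        } finally {
            inflater.end();
        }
    }
}

// A custom mode that stores bytes as-is, under its own format name.
final class IdentityMode extends DemoCompressionMode {
    @Override
    String formatName() {
        return "DemoStoredFieldsIdentity";
    }

    @Override
    byte[] compress(byte[] data) {
        return data.clone();
    }

    @Override
    byte[] decompress(byte[] data, int originalLength) {
        return data.clone();
    }
}
```

Here new DeflateMode(Deflater.BEST_SPEED) and new DeflateMode(Deflater.BEST_COMPRESSION) share one format name, since either can decompress the other's output, while IdentityMode needs a different name.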
