lucene-java-user mailing list archives

From Vitaly Funstein <>
Subject Re: Toggling compression for stored fields
Date Wed, 15 May 2013 21:36:16 GMT
Thanks for the quick reply, this is certainly good news. So just to clarify
- doing a manual segment merge is optional when changing codecs, correct? I
mean, I can just restart my application with a new codec config and let the
regular, background merging task do the work of eventually converting all
the data to the new format?

On Wed, May 15, 2013 at 2:30 PM, Uwe Schindler <> wrote:

> Hi Vitaly,
> what you call an "index" is just a collection (a CompositeReader) of
> atomic readers. They can be mixed regarding compression, just like you
> could have a MultiReader with different indexes using different codecs.
> Every atomic segment of an index can only have one stored fields format.
> Once merging occurs, the uncompressed fields of e.g. an older atomic
> segment get merged into a new segment with compression enabled. The same
> can happen in the other direction. The codec is responsible for encoding
> the data on disk and this includes the compression. When merging segments,
> the data is uncompressed and recompressed as needed. To improve
> performance, there are shortcuts to copy the data directly if the codec
> does not change while merging.
> With Lucene 4.x, you are free to open an IndexWriter with a different
> codec configuration and e.g. use IndexUpgrader or do a force merge manually
> to merge all "old" segments and "recompress" them to a different codec
> config. This has nothing to do with "reindexing" as you are just changing
> the encoding of the exact same data on disk.
> Uwe
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> eMail:
> > -----Original Message-----
> > From: Vitaly Funstein []
> > Sent: Wednesday, May 15, 2013 10:38 PM
> > To:
> > Subject: Toggling compression for stored fields
> >
> > Is it possible to have a mix of compressed and uncompressed documents
> > within a single index? That is, can I load an index created with Lucene
> > 4.0 into 4.1 and defer the decision of whether or not to use
> > CompressingStoredFieldsFormat until a later time, or even go back and
> > forth between compressed and uncompressed codecs, if needed? I thought at
> > first the answer would be an unequivocal "no", but then how would one
> > migrate data from 4.0 to 4.1 without a full reindex?
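
Editorial sketch: the behaviour described in the thread (each segment has exactly one stored fields format, segments with different formats can coexist in one index, and a merge rewrites everything into the writer's current format) can be illustrated with a small toy model. ToyIndex, Segment and Format below are hypothetical names invented for this illustration; they are not Lucene classes, and nothing here touches real index files. In actual Lucene 4.x code the corresponding steps would presumably go through IndexWriterConfig.setCodec and either background merging, IndexWriter.forceMerge, or IndexUpgrader, as mentioned in the reply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: these types are hypothetical and are NOT Lucene classes.
class ToyIndex {
    enum Format { UNCOMPRESSED, COMPRESSED }

    /** One immutable segment; all of its documents share a single format. */
    static final class Segment {
        final Format format;
        final List<String> docs;

        Segment(Format format, List<String> docs) {
            this.format = format;
            this.docs = Collections.unmodifiableList(new ArrayList<>(docs));
        }
    }

    private final List<Segment> segments = new ArrayList<>();
    private Format writerFormat;

    ToyIndex(Format writerFormat) {
        this.writerFormat = writerFormat;
    }

    /** Models reopening the writer with a different codec configuration. */
    void setWriterFormat(Format format) {
        this.writerFormat = format;
    }

    /** Models a flush: the new segment is written in the current writer format. */
    void flush(List<String> docs) {
        segments.add(new Segment(writerFormat, docs));
    }

    /**
     * Merges all segments into one segment in the current writer format.
     * Returns the number of documents that came from segments in a different
     * format and therefore had to be re-encoded; documents from segments
     * already in the target format stand in for the "direct copy" shortcut.
     */
    int mergeAll() {
        if (segments.isEmpty()) {
            return 0;
        }
        List<String> all = new ArrayList<>();
        int reencoded = 0;
        for (Segment s : segments) {
            if (s.format != writerFormat) {
                reencoded += s.docs.size();
            }
            all.addAll(s.docs);
        }
        segments.clear();
        segments.add(new Segment(writerFormat, all));
        return reencoded;
    }

    List<Segment> segments() {
        return Collections.unmodifiableList(segments);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(Format.UNCOMPRESSED);
        index.flush(Arrays.asList("doc1", "doc2"));

        // "Restart" with a new configuration; the old segment is left as it is.
        index.setWriterFormat(Format.COMPRESSED);
        index.flush(Arrays.asList("doc3"));

        for (Segment s : index.segments()) {
            System.out.println(s.format + " segment with " + s.docs.size() + " docs");
        }

        int reencoded = index.mergeAll();
        System.out.println("re-encoded during merge: " + reencoded);
        System.out.println("format after merge: " + index.segments().get(0).format);
    }
}
```

In this model, changing the writer format never rewrites existing segments by itself; old data only changes format when a merge happens to include it, and whether that merge is triggered manually or by a background policy makes no difference to the result.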
