lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Toggling compression for stored fields
Date Wed, 15 May 2013 21:30:09 GMT
Hi Vitaly,

What you call an "index" is just a collection (a CompositeReader) of atomic readers, one per
segment. These can be mixed regarding compression, just as you could have a MultiReader over
different indexes using different codecs. Every atomic segment of an index can have only one
stored fields format. Once merging occurs, the uncompressed fields of e.g. an older atomic
segment get merged into a new segment with compression enabled. The same can happen in the
other direction. The codec is responsible for encoding the data on disk, and this includes the
compression. When merging segments, the data is uncompressed and recompressed as needed. To
improve performance, there are shortcuts that copy the data directly if the codec does not
change while merging.
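
To illustrate the merge behaviour described above, here is a toy model in plain Java. It does not use Lucene and is not how Lucene implements merging; `Format`, `Segment`, `merge` and `readAll` are hypothetical names invented for this sketch, and `java.util.zip` stands in for a real stored fields format. It only shows the idea: each segment has exactly one format, a reader over several segments can see a mix, and a merge re-encodes into the target format unless the format is unchanged, in which case the bytes are copied as-is.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class SegmentMergeSketch {

    /** Hypothetical stand-in for a stored fields format. */
    enum Format { RAW, DEFLATE }

    /** One "atomic segment": a single format applies to all of its documents. */
    static final class Segment {
        final Format format;
        final List<byte[]> encodedDocs;

        Segment(Format format, List<byte[]> encodedDocs) {
            this.format = format;
            this.encodedDocs = encodedDocs;
        }
    }

    static byte[] encode(Format format, String doc) {
        byte[] raw = doc.getBytes(StandardCharsets.UTF_8);
        if (format == Format.RAW) return raw;
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    static String decode(Format format, byte[] data) {
        if (format == Format.RAW) return new String(data, StandardCharsets.UTF_8);
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) break;
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt segment data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    static Segment newSegment(Format format, String... docs) {
        List<byte[]> encoded = new ArrayList<byte[]>();
        for (String doc : docs) encoded.add(encode(format, doc));
        return new Segment(format, encoded);
    }

    /** Merge all sources into one segment written in the target format. */
    static Segment merge(Format target, List<Segment> sources) {
        List<byte[]> merged = new ArrayList<byte[]>();
        for (Segment source : sources) {
            for (byte[] doc : source.encodedDocs) {
                if (source.format == target) {
                    merged.add(doc); // shortcut: format unchanged, copy the bytes directly
                } else {
                    merged.add(encode(target, decode(source.format, doc))); // decode, then re-encode
                }
            }
        }
        return new Segment(target, merged);
    }

    /** Composite view: reads every document, whatever format its segment uses. */
    static List<String> readAll(List<Segment> index) {
        List<String> docs = new ArrayList<String>();
        for (Segment segment : index) {
            for (byte[] doc : segment.encodedDocs) docs.add(decode(segment.format, doc));
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Segment> index = Arrays.asList(
            newSegment(Format.RAW, "doc one", "doc two"),   // "old" uncompressed segment
            newSegment(Format.DEFLATE, "doc three"));       // "new" compressed segment
        System.out.println(readAll(index));
        Segment merged = merge(Format.DEFLATE, index);
        System.out.println(merged.format + " " + readAll(Arrays.asList(merged)));
    }
}
```

The documents read back from the merged segment are the same as before the merge; only their on-disk encoding differs.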

With Lucene 4.x, you are free to open an IndexWriter with a different codec configuration
and e.g. use IndexUpgrader or do a force merge manually to merge all "old" segments and "recompress"
them to a different codec config. This has nothing to do with "reindexing" as you are just
changing the encoding of the exact same data on disk.
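
A minimal sketch of the force-merge route, assuming Lucene 4.1 with lucene-core and lucene-analyzers-common on the classpath. The class name `RecompressIndex`, the choice of StandardAnalyzer, and taking the index path from `args[0]` are illustrative assumptions; the analyzer only satisfies the IndexWriterConfig constructor, since no documents are added here. Treat it as a starting point, not a tested migration tool.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class RecompressIndex {
    public static void main(String[] args) throws IOException {
        // args[0]: path to an existing index directory (assumption for this sketch)
        Directory dir = FSDirectory.open(new File(args[0]));
        try {
            IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_41, new StandardAnalyzer(Version.LUCENE_41));
            // Newly written segments, including merged ones, use this codec.
            iwc.setCodec(Codec.forName("Lucene41"));

            IndexWriter writer = new IndexWriter(dir, iwc);
            try {
                // Rewrite the existing segments into one segment in the writer's codec.
                writer.forceMerge(1);
            } finally {
                writer.close();
            }
        } finally {
            dir.close();
        }
    }
}
```

One caveat, as far as I know: if the index already consists of a single segment without deletions, the merge policy may treat it as already merged and `forceMerge(1)` may not rewrite it. IndexUpgrader, mentioned above, is the alternative for segments written by an older Lucene version.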

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Vitaly Funstein [mailto:vfunstein@gmail.com]
> Sent: Wednesday, May 15, 2013 10:38 PM
> To: java-user@lucene.apache.org
> Subject: Toggling compression for stored fields
> 
> Is it possible to have a mix of compressed and uncompressed documents
> within a single index? That is, can I load an index created with Lucene 4.0 into
> 4.1 and defer the decision of whether or not to use
> CompressingStoredFieldsFormat until a later time, or even go back and forth
> between compressed and uncompressed codecs, if needed? I thought at
> first the answer would be an unequivocal "no", but then how would one
> migrate data from 4.0 to 4.1 without a full reindex?



