lucene-java-user mailing list archives

From "Uwe Schindler"
Subject RE: If you could have one feature in Lucene...
Date Sat, 27 Feb 2010 15:17:37 GMT
Hi Glen,

> Pluggable compression allowing for alternatives to gzip for text
> compression for storing.
> Specifically I am interested in bzip2[1] as implemented in Apache
> Commons Compress[2].
> While bzip2 compression is considerably slower than gzip (although
> decompression is not too much slower than gzip) it compresses much
> better than gzip (especially text).
> Having the choice would be helpful, and for Lucene usage for non-text
> indexing, content specific compression algorithms may outperform the
> default gzip.

As of Lucene 3.0, compression support has been removed entirely (in 2.9 it is still
available, but deprecated). All you have to do now is store your compressed field content
as a byte[] (see the Field javadocs). That way you can use any compression you like. The
problems with gzip and the other available compression algorithms led us to remove the
compression support from Lucene. In general the way to go is: create a
ByteArrayOutputStream, wrap it with any compression filter, feed your data in, close the
filter, and then use "new Field(name, stream.toByteArray(), Field.Store.YES)". On the
client side just do the inverse (Document.getBinaryValue(), create an input stream on top
of the byte[] and decompress).
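
For illustration, here is a minimal sketch of that round trip using only the JDK's
gzip streams. The class name StoredFieldCompression, the helper names, and the field
name "body" are hypothetical, and the Lucene calls appear only in comments (they
assume the 2.9/3.0-era Field and Document API). Any other compression stream should
drop in the same way, for example BZip2CompressorOutputStream from Commons Compress.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Lucene.
final class StoredFieldCompression {

    // Index side: wrap a ByteArrayOutputStream with a compression filter.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } // closing the filter flushes the remaining data and the gzip trailer
        return buffer.toByteArray();
        // Assumed Lucene 2.9/3.0 usage:
        //   doc.add(new Field("body", compress(bytes), Field.Store.YES));
    }

    // Search side: the inverse, starting from the stored byte[].
    static byte[] decompress(byte[] packed) throws IOException {
        // Assumed Lucene 2.9/3.0 usage:
        //   byte[] packed = doc.getBinaryValue("body");
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return buffer.toByteArray();
    }
}
```

Note that the byte[] must be taken only after the compression stream is closed,
otherwise the stored value may be truncated.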

