lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Binary fields and data compression
Date Mon, 30 Aug 2004 21:57:55 GMT
Bernhard,

Sounds good to me.
I would, however, also be interested in the performance impact of
text-field compression.  While adapting Drew's patch, it may be nice to
make the compression mechanism pluggable.
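
To make "pluggable" a bit more concrete, here is a minimal sketch of one way it could look. The names FieldCompressor, NoCompression and DeflateCompressor are hypothetical. They are not part of Lucene or of Drew's patch, and the only real API used is java.util.zip.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical extension point: neither Lucene nor Drew's patch defines this.
interface FieldCompressor {
    byte[] compress(byte[] raw);
    byte[] decompress(byte[] stored);
}

// Pass-through strategy, useful as a baseline when measuring performance impact.
final class NoCompression implements FieldCompressor {
    public byte[] compress(byte[] raw) { return raw.clone(); }
    public byte[] decompress(byte[] stored) { return stored.clone(); }
}

// Strategy backed by java.util.zip; the level is one of the Deflater constants.
final class DeflateCompressor implements FieldCompressor {
    private final int level;

    DeflateCompressor(int level) { this.level = level; }

    public byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(raw.length);
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    public byte[] decompress(byte[] stored) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(stored);
            ByteArrayOutputStream out = new ByteArrayOutputStream(stored.length * 2);
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress on an unfinished stream means the input is unusable.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or dictionary-dependent data");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }
}
```

With something like this, the writer and reader would only depend on the interface, so the text-field case could be timed with NoCompression against DeflateCompressor at different levels.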

Otis

--- Bernhard Messer <Bernhard.Messer@intrafind.de> wrote:

> Hi developers,
> 
> A few months ago, there was a very interesting discussion about field
> compression and the possibility of storing binary field values within a
> Lucene document. On this topic, Drew Farris came up with a patch to add
> the necessary functionality. I ran all the necessary tests on his
> implementation and didn't find a single problem. So Drew's original
> implementation could now be enhanced to compress the binary field data
> (maybe even the text fields, if they are stored only) before writing to
> disk. I made some simple statistical measurements using the
> java.util.zip package for data compression. With compression enabled, we
> could save about 40% of the data when compressing plain text files
> ranging in size from 1KB to 4KB. If there is still some interest, we
> could first update the patch, because it's outdated due to several
> changes within the Fields class. After that, compression could be added
> to the updated version of the patch.
> 
> Sounds good to me. What do you think?
> 
> best regards
> Bernhard
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
> 
> 
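
The kind of measurement Bernhard describes could be reproduced roughly as follows. This is only a sketch under stated assumptions: the class CompressionSavings and its methods are hypothetical names, and the sample text is synthetic and highly repetitive, so it will compress far better than the roughly 40% reported for real plain text files.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper for estimating how much space deflate would save.
final class CompressionSavings {

    // Returns the number of bytes java.util.zip.Deflater produces for the input.
    static int deflatedSize(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            deflater.setInput(raw);
            deflater.finish();
            byte[] buf = new byte[4096];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buf); // only the count matters here
            }
            return total;
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    // Fraction of the original size that is saved; negative if the data grows.
    static double savedFraction(byte[] raw) {
        if (raw.length == 0) {
            return 0.0;
        }
        return 1.0 - (double) deflatedSize(raw) / raw.length;
    }

    public static void main(String[] args) {
        // Synthetic sample of about 2 KB, inside the 1KB to 4KB range from the mail.
        StringBuilder sb = new StringBuilder();
        while (sb.length() < 2048) {
            sb.append("Lucene stores field values in the index. ");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        System.out.printf("raw=%d deflated=%d saved=%.0f%%%n",
                raw.length, deflatedSize(raw), 100 * savedFraction(raw));
    }
}
```

For numbers comparable to the ones in the mail, the synthetic sample would have to be replaced with real documents of the sizes in question.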



