lucene-dev mailing list archives

From Bernhard Messer <Bernhard.Mes...@intrafind.de>
Subject Binary fields and data compression
Date Mon, 30 Aug 2004 21:41:10 GMT
Hi developers,

A few months ago there was a very interesting discussion about field
compression and the possibility of storing binary field values in a
Lucene document. In response, Drew Farris came up with a patch that
adds the necessary functionality. I ran all the necessary tests on his
implementation and did not find a single problem. Drew's original
implementation could now be enhanced to compress the binary field data
(maybe even the text fields, if they are stored only) before it is
written to disk. I made some simple statistical measurements using the
java.util.zip package for data compression. With compression enabled,
we could save about 40% of the space when compressing plain text files
between 1KB and 4KB in size. If there is still some interest, we could
first update the patch, because it is outdated due to several changes
within the Fields class. Once that is done, compression could be added
to the updated version of the patch.
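
To make the idea concrete, here is a minimal sketch of compressing a
stored field value with java.util.zip before it would be written to
disk, and restoring it afterwards. It is not taken from Drew's patch:
the class name FieldCompressionSketch, the method names, and the sample
text are hypothetical, and the compression level is just one possible
choice. The ratio it prints depends entirely on the input and is not
meant to reproduce the 40% figure above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Lucene or of the patch under discussion.
public class FieldCompressionSketch {

    // Compress a byte array with the zlib/deflate implementation in java.util.zip.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish(); // no more input will follow
            ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Restore the original bytes; rejects truncated or corrupt input.
    public static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream(compressed.length * 2);
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Sample text only; real field values will compress differently.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 40; i++) {
            sb.append("A stored field value containing ordinary plain text. ");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(original);
        byte[] restored = decompress(packed);

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(original, restored));
    }
}
```

Very short values can grow rather than shrink, because the zlib header
and checksum add a few bytes, so a real implementation would probably
want to compress only when it actually saves space.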

That sounds good to me. What do you think?

best regards
Bernhard





