lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject RE: Binary fields and data compression
Date Tue, 31 Aug 2004 00:26:06 GMT

--- Robert Engels <rengels@ix.netcom.com> wrote:

......

> ... thus my request that any compression support be optional.

I think this goes without saying.

Otis


> -----Original Message-----
> From: David Spencer [mailto:dave-lucene-dev@tropo.com]
> Sent: Monday, August 30, 2004 5:33 PM
> To: Lucene Developers List
> Subject: Re: Binary fields and data compression
> 
> 
> Robert Engels wrote:
> 
> > The data size savings are almost certainly not worth the probable 20-40%
> > increase in CPU usage in most cases, no?
> >
> > I think it should be optional, both for those who have extremely large
> > indices and want to save some space (which seems unnecessary these days)
> > and for those who want to maximize performance.
> 
> You don't know until you benchmark it, but I thought that the heuristic
> nowadays was that CPUs are fast and disk i/o is slow ( and yes, disk
> space is 'infinite' :) ) - so therefore I would guess that in spite of
> the CPU cost of compression, you'd save time due to less disk i/o.
> 
> 
> >
> >
> > -----Original Message-----
> > From: Bernhard Messer [mailto:Bernhard.Messer@intrafind.de]
> > Sent: Monday, August 30, 2004 4:41 PM
> > To: lucene-dev@jakarta.apache.org
> > Subject: Binary fields and data compression
> >
> >
> > hi developers,
> >
> > a few months ago, there was a very interesting discussion about field
> > compression and the possibility of storing binary field values within a
> > lucene document. On this topic, Drew Farris came up with a patch to add
> > the necessary functionality. I ran all the necessary tests on his
> > implementation and didn't find a single problem. So the original
> > implementation from Drew could now be enhanced to compress the binary
> > field data (maybe even the text fields, if they are stored only) before
> > writing to disk. I made some simple statistical measurements using the
> > java.util.zip package for data compression. With it enabled, we could
> > save about 40% of the data when compressing plain text files with sizes
> > from 1KB to 4KB. If there is still some interest, we could first try to
> > update the patch, because it's outdated due to several changes within
> > the Fields class. After finishing that, compression could be added to
> > the updated version of the patch.
> >
> > sounds good to me, what do you think ?
> >
> > best regards
> > Bernhard
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
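
The kind of measurement Bernhard describes, compressing a stored field value with java.util.zip and comparing sizes, might look roughly like the sketch below. It is not taken from Drew's patch or from any Lucene code: the class name FieldCompressionSketch and its methods compress, decompress and savedFraction are hypothetical names chosen for illustration, and the sample text is made up. Only the java.util.zip Deflater and Inflater calls are real library API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Lucene or of the patch under discussion.
class FieldCompressionSketch {

    /** Compresses the bytes with java.util.zip.Deflater (zlib format). */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** Reverses compress(); rejects input that is not a complete zlib stream. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
    }

    /** Fraction of space saved, e.g. 0.4 means the data shrank by 40%. */
    static double savedFraction(int originalLength, int compressedLength) {
        // Assumes originalLength > 0; an empty field has nothing to save.
        return 1.0 - (double) compressedLength / originalLength;
    }

    public static void main(String[] args) {
        // Made-up sample of roughly 2KB. It is highly repetitive, so the
        // saving it shows is NOT representative of real plain text.
        StringBuilder sb = new StringBuilder();
        while (sb.length() < 2048) {
            sb.append("Lucene stores field values alongside the index. ");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        long start = System.nanoTime();
        byte[] packed = compress(original);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("saved fraction:   " + savedFraction(original.length, packed.length));
        System.out.println("compress nanos:   " + elapsedNanos);
        System.out.println("round trip ok:    " + Arrays.equals(original, decompress(packed)));
    }
}
```

To get numbers that bear on the CPU-versus-disk question raised above, the same calls would need to be run over real documents in the 1KB to 4KB range and timed over many iterations; a single nanoTime reading like the one here is only a rough indication.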



