Message-ID: <20040831002606.2981.qmail@web12703.mail.yahoo.com>
Date: Mon, 30 Aug 2004 17:26:06 -0700 (PDT)
From: Otis Gospodnetic
Subject: RE: Binary fields and data compression
To: Lucene Developers List, rengels@ix.netcom.com

--- Robert Engels wrote:
......
> ... thus my request that any compression support be optional.

I think this goes without say. Say say say...

Otis

> -----Original Message-----
> From: David Spencer [mailto:dave-lucene-dev@tropo.com]
> Sent: Monday, August 30, 2004 5:33 PM
> To: Lucene Developers List
> Subject: Re: Binary fields and data compression
>
>
> Robert Engels wrote:
>
> > The data size savings is almost certainly not worth the probable 20-40%
> > increase in CPU usage in most cases, no?
> >
> > I think it should be optional for those who have extremely large indices and
> > want to save some space (seems not necessary these days), and those who want
> > to maximize performance.
>
> You don't know until you benchmark it, but I thought that the heuristic
> nowadays was that CPUs are fast and disk i/o is slow ( and yes, disk
> space is 'infinite' :) ) - so therefore I would guess that in spite of
> the CPU cost of compression, you'd save time due to less disk i/o.
>
> >
> > -----Original Message-----
> > From: Bernhard Messer [mailto:Bernhard.Messer@intrafind.de]
> > Sent: Monday, August 30, 2004 4:41 PM
> > To: lucene-dev@jakarta.apache.org
> > Subject: Binary fields and data compression
> >
> >
> > hi developers,
> >
> > a few months ago, there was a very interesting discussion about field
> > compression and the possibility to store binary field values within a
> > lucene document. Regarding this topic, Drew Farris came up with a
> > patch to add the necessary functionality. I ran all the necessary tests
> > on his implementation and didn't find one problem. So the original
> > implementation from Drew could now be enhanced to compress the binary
> > field data (maybe even the text fields if they are stored only) before
> > writing to disk. I made some simple statistical measurements using the
> > java.util.zip package for data compression. Enabling it, we could save
> > about 40% data when compressing plain text files with a size from 1KB to
> > 4KB.
> > If there is still some interest, we could first try to update the
> > patch, because it's outdated due to several changes within the Fields
> > class. After finishing that, compression could be added to the updated
> > version of the patch.
> >
> > sounds good to me, what do you think?
> >
> > best regards
> > Bernhard
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
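
For anyone who wants to try the "benchmark it" suggestion from this thread, here is a rough sketch of the kind of measurement Bernhard describes: compress a small text value with java.util.zip and compare sizes and elapsed time. It is not Drew's patch and does not touch any Lucene class. The class name FieldCompressionSketch, the helper methods, and the sample text are all made up for illustration. The sample is highly repetitive, so it will compress far better than the roughly 40% reported above for real plain text files.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Lucene: compresses and restores a
// stored field value using only java.util.zip.
final class FieldCompressionSketch {

    /** Compresses the given bytes with DEFLATE at the highest compression level. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** Restores bytes produced by compress(); rejects truncated or corrupt input. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream(compressed.length * 2);
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Made-up sample of roughly 2KB; substitute real field values
        // to get meaningful numbers for your own index.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 40; i++) {
            sb.append("Binary fields and data compression in a stored field. ");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        long start = System.nanoTime();
        byte[] packed = compress(original);
        long compressNanos = System.nanoTime() - start;

        start = System.nanoTime();
        byte[] restored = decompress(packed);
        long decompressNanos = System.nanoTime() - start;

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("restored bytes:   " + restored.length);
        System.out.println("compress ns:      " + compressNanos);
        System.out.println("decompress ns:    " + decompressNanos);
    }
}
```

A single timed call like this is dominated by JIT warm-up, so treat the timings as a smoke test only. To settle the CPU versus disk i/o question raised above, you would need to run many iterations over real documents and compare against the time to read the uncompressed bytes from disk.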