From: "Erick Erickson"
To: java-user@lucene.apache.org
Date: Thu, 17 May 2007 09:40:56 -0400
Subject: Re: Field.Store.Compress - does it improve performance of document reads?

Some time ago I posted the results of using FieldSelector in my
peculiar app, and it gave dramatic improvements in my case (a factor
of about 10). I suspect much of that was peculiar to my index design,
so your mileage may vary.

See a thread titled...
*Lucene 2.1, using FieldSelector speeds up my app by a factor of 10+....*

Best
Erick

On 5/17/07, Grant Ingersoll wrote:
>
> I haven't tried compression either. I know there was some talk a
> while ago about deprecating it, but that hasn't happened. The current
> implementation yields the highest level of compression. You might
> find better results by compressing in your application and storing as
> a binary field, thus giving you more control over CPU used. This is
> our current recommendation for dealing w/ compression.
>
> If you are not actually displaying that field, you should look into
> the FieldSelector API (via IndexReader). It allows you to lazily
> load fields or skip them altogether and can yield a pretty
> significant savings when it comes to loading documents.
> FieldSelector is available in 2.1.
>
> -Grant
>
> On May 17, 2007, at 4:01 AM, Paul Elschot wrote:
>
> > On Thursday 17 May 2007 08:10, Andreas Guther wrote:
> >> I am currently exploring how to solve performance problems I
> >> encounter with Lucene document reads.
> >>
> >> We have, amongst other fields, one field (default) storing all
> >> searchable fields. This field can grow to a considerable size,
> >> since we are indexing documents and storing the content for
> >> display within results.
> >>
> >> I noticed that the read can be very expensive. I wonder now if it
> >> would make sense to add this field as Field.Store.Compress to the
> >> index. Can someone tell me if this would speed up the document
> >> read, or if this is only interesting for saving space?
> >
> > I have not tried the compression yet, but in my experience a good way
> > to reduce the cost of document reads from a disk is by reading them
> > in document number order whenever possible. In this way one saves
> > on the disk head seeks.
> > Compression should actually help reduce the cost of disk head seeks
> > even more.
> >
> > Regards,
> > Paul Elschot
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
>
> --------------------------
> Grant Ingersoll
> Center for Natural Language Processing
> http://www.cnlp.org/tech/lucene.asp
>
> Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ
>
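
For anyone who wants to try Grant's suggestion of compressing in the application and storing the result as a binary field, here is a minimal sketch using only java.util.zip. The class name FieldCompression and its two methods are hypothetical helpers made up for this illustration; they are not part of Lucene. The point is only that the caller picks the Deflater level, which is where the extra control over CPU comes from.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper (not a Lucene class): compress field content in the application. */
public class FieldCompression {

    /** Deflates the input at the given level, e.g. Deflater.BEST_SPEED or Deflater.BEST_COMPRESSION. */
    public static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** Inflates bytes previously produced by compress(). */
    public static byte[] decompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and no more input means the data is truncated or unusable.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated or unsupported compressed data");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) throws DataFormatException {
        // Made-up, repetitive sample content standing in for a large stored field.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("document content stored for display within results ");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        // BEST_SPEED trades compression ratio for less CPU; the level is the caller's choice.
        byte[] packed = compress(original, Deflater.BEST_SPEED);
        byte[] unpacked = decompress(packed);

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(original, unpacked));
    }
}
```

The compressed byte array would then go into a stored binary field at index time and be passed through decompress() after the document is read. Check the Field constructors in your Lucene version for the exact way to store a byte array; that part is deliberately left out of the sketch.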