Date: Fri, 10 Sep 2004 16:02:31 -0700 (PDT)
From: Otis Gospodnetic
Subject: Re: question on Hits.doc
To: Lucene Users List
Message-ID: <20040910230231.53274.qmail@web12701.mail.yahoo.com>
In-Reply-To: <20040910221932.M34085@giant-pandas.net>

Hello Roy,

This sounds normal. When you pull a Document from Hits, you are really pulling it from the disk. All fields are read from disk at that time (i.e. no lazy loading of fields), so if you have large text fields, this is going to result in a lot of disk IO.
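
To make the cost of eager loading concrete, here is a toy model in plain Java. It is only a sketch: the class name EagerStoredFields, the record layout, and the field names "Id" and "Message" are assumptions made up for illustration, and this is not Lucene's actual stored-field file format. It only shows that when every stored value has to be read to rebuild a document, one large stored field dominates the bytes read per hit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: NOT Lucene's on-disk format, just an illustration of eager loading.
class EagerStoredFields {

    // Serialize all stored fields of one document as (name, length, bytes) entries.
    static byte[] writeDocument(Map<String, String> storedFields) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(storedFields.size());
            for (Map.Entry<String, String> field : storedFields.entrySet()) {
                out.writeUTF(field.getKey());
                byte[] value = field.getValue().getBytes(StandardCharsets.UTF_8);
                out.writeInt(value.length);
                out.write(value);
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Eager load: every stored value is read back, whether or not the caller needs it.
    static Map<String, String> readDocument(byte[] record) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
            int fieldCount = in.readInt();
            Map<String, String> fields = new LinkedHashMap<>();
            for (int i = 0; i < fieldCount; i++) {
                String name = in.readUTF();
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                fields.put(name, new String(value, StandardCharsets.UTF_8));
            }
            return fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical large "Message" body of 100,000 characters.
        StringBuilder message = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            message.append('x');
        }

        Map<String, String> withMessage = new LinkedHashMap<>();
        withMessage.put("Id", "42");
        withMessage.put("Message", message.toString());

        Map<String, String> withoutMessage = new LinkedHashMap<>();
        withoutMessage.put("Id", "42");

        System.out.println("bytes read per hit, Message stored:     "
                + writeDocument(withMessage).length);
        System.out.println("bytes read per hit, Message not stored: "
                + writeDocument(withoutMessage).length);
    }
}
```

In this model the record size stands in for the disk IO needed to fetch one hit, so dropping the large value from the stored record shrinks the read to a handful of bytes.
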
You could try running vmstat or sar (I'm assuming you are using a UNIX flavour) and look at the bi/bo columns (really just bi; bi = blocks in -- data read from disks).

There is not much you can do. If you don't have to store the field, leaving it unstored will probably help. Some people are working on adding support for field compression, so maybe that will help.

Otis

--- roy-lucene-user@xemaps.com wrote:

> Hey guys,
>
> We were noticing some speed problems on our searches and after adding
> some debug statements to the lucene source code, we have determined
> that the Hits.doc(x) is the problem. (BTW, we are using Lucene 1.2
> [with plans to upgrade]). It seems that retrieving the actual Document
> from the search is very slow.
>
> We think it might be our "Message" field which stores a huge amount of
> text. We are currently running a test in which we won't "store" the
> "Message" field, however, I was wondering if any of you guys would know
> if that would be the reason why we're having the performance problems?
> If so, could anyone also please explain it? It seemed that we weren't
> having these performance problems before. Has anyone else experienced
> this? Our environment is NT 4, JDK 1.4.2, and PIIIs.
>
> I know that for large text fields, storing the field is not a good
> practice, however, it held certain conveniences for us that I hope to
> not get rid of.
>
> Roy.

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org