Message-ID: <7798eaa0705240751j372ffb0bpdd77b830d289e6cf@mail.gmail.com>
Date: Thu, 24 May 2007 11:51:05 -0300
From: "Carlos Pita"
To: java-user@lucene.apache.org
Subject: Re: HitCollector or Hits
In-Reply-To: <359a92830705240620n648fe900yeba4ff40ae54ee91@mail.gmail.com>

Hi Erick,

thank you for your prompt answer. What do you mean by loading the
document? Accessing one of the stored fields? In that case I'm afraid I
would need to do it. For example, in the aforementioned case of a result
set of products, I have to look at each product's store_id, which is
stored with the document. Is this a performance killer? Maybe I should
keep some tables in memory, for example an array mapping from id to
store_id with O(1) lookup. I will do some benchmarking first anyway.

Cheers,
Carlos

On 5/24/07, Erick Erickson wrote:
>
> I know of no way to alter the Hits behavior; I recommend using
> a TopDocs/TopDocCollector.
>
> But be aware that if you load the document for each one, you may incur
> a significant penalty, although the lazy-loading helped me a lot, see
> FieldSelector.....
>
> On 5/23/07, Carlos Pita wrote:
> >
> > Hi folks,
> >
> > I need to collect some global information from my first 1000 search
> > results in order to build up some search refining components
> > containing only relevant values (those which correspond to at least
> > one of the first 1000 hits). For example, the results are products
> > and there is a store filter component that shows only the stores that
> > sell a product among the first 1000 hits. So even if the user sees
> > just the first 20, I would have to inspect the first 1000. I've read
> > that Hits maintains a cache of about 100 or 200 hits. Is this
> > configurable? If I could set this cache to 1000 I would then use Hits
> > to browse the search results. Otherwise, I should use HitCollector.
> > What's your advice?
> >
> > TIA
> > Cheers,
> > Carlos
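
The in-memory table idea from the reply above (an array mapping document id to store_id) could look roughly like the sketch below. It deliberately uses no Lucene classes: StoreFacets and storesInTopHits are hypothetical names, the ranked document ids are assumed to come from whatever collector the search uses, and storeIdByDoc is assumed to have been filled once up front. It only illustrates gathering the distinct stores among the first N hits without loading each document.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: gathers the distinct store ids
// that occur among the first `limit` ranked hits.
final class StoreFacets {

    private StoreFacets() {}

    /**
     * @param rankedDocIds document ids in rank order (assumed to come from the search)
     * @param storeIdByDoc in-memory table: storeIdByDoc[docId] is that document's store_id
     * @param limit        how many top hits to inspect, e.g. 1000
     * @return the distinct store ids seen among the inspected hits, sorted
     */
    static Set<Integer> storesInTopHits(int[] rankedDocIds, int[] storeIdByDoc, int limit) {
        Set<Integer> stores = new TreeSet<>();
        int n = Math.min(limit, rankedDocIds.length);
        for (int i = 0; i < n; i++) {
            // O(1) array lookup instead of loading the stored document
            stores.add(storeIdByDoc[rankedDocIds[i]]);
        }
        return stores;
    }
}
```

For example, with storeIdByDoc = {7, 7, 3, 9, 3} and ranked ids {4, 0, 2, 3}, a limit of 2 inspects documents 4 and 0 and yields the stores 3 and 7. The trade-off is memory: one int per document in the index, which would need rebuilding whenever the index is reopened.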