From: "J.J. Larrea"
To: java-user@lucene.apache.org
Date: Wed, 5 Oct 2005 17:31:48 -0400
Subject: Re: What is a Hits object?

A Hits object is essentially a cache of query results. It caches in two ways:

1. When a query returning Hits is requested, only the top 100 document IDs and scores are requested from the scoring system, and the ID/score pairs are stored in a list in the Hits object. Whenever a document ID, score, or Document object is requested that lies beyond the end of that list, the query is re-executed in order to grow the list to at or beyond the requested position, typically 100% beyond it.

2. Returning a Document object (rather than a score or document ID) requires reconstituting the Document from the stored fields in the index, which is an expensive operation. The Hits object maintains a cache of the 200 most recently requested Document objects, so it is unlikely that they will need to be reconstituted more than once.

This is all optimized around typical hit-list access patterns: navigating forward and backward through the results pages a small number of documents at a time.

For applications that cannot benefit from the Hits caching, for example those that employ their own hit-caching layer, one can effectively use the so-called "low-level" IndexSearcher routines, which return TopDocs rather than Hits.

- J.J.

At 8:26 PM +0100 10/5/05, Cyril Barlow wrote:
>Is it an actual array of full Documents or a list of reference points to
>Documents? And what's the typical size in memory of a Hits object with say
>1000 avg size docs?
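
The two caching behaviors described in the reply above can be modeled with a small, self-contained Java sketch. This is a toy illustration only, not Lucene's Hits implementation: the names HitsCacheSketch, TopSearcher, and loader are hypothetical, Strings stand in for Document objects, and the growth rule (re-run the query for twice the number of entries needed) is just one reading of "100% beyond". The sizes 100 and 200 are taken from the message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;
import java.util.stream.IntStream;

/** Toy model of the two caches described in the message; NOT Lucene's Hits class. */
class HitsCacheSketch {
    /** Hypothetical stand-in for the scoring system: top-n doc IDs for one fixed query. */
    interface TopSearcher { int[] topIds(int n); }

    private static final int INITIAL_FETCH = 100;  // "top 100" from the message
    private static final int DOC_CACHE_SIZE = 200; // "200 most recently requested"

    private final TopSearcher searcher;
    private final IntFunction<String> loader;      // hypothetical: rebuilds a "document" by ID
    private final List<Integer> ids = new ArrayList<>();

    // Access-ordered LinkedHashMap that evicts the least recently used entry.
    private final Map<Integer, String> docCache =
        new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > DOC_CACHE_SIZE;
            }
        };

    int searches = 0; // how many times the query was (re-)executed
    int loads = 0;    // how many times a document was reconstituted

    HitsCacheSketch(TopSearcher searcher, IntFunction<String> loader) {
        this.searcher = searcher;
        this.loader = loader;
        fetch(INITIAL_FETCH);
    }

    /** Re-runs the query and replaces the cached ID list with the top n results. */
    private void fetch(int n) {
        searches++;
        ids.clear();
        for (int id : searcher.topIds(n)) {
            ids.add(id);
        }
    }

    /** Doc ID at the given rank; throws IndexOutOfBoundsException past the last hit. */
    int id(int rank) {
        if (rank >= ids.size()) {
            fetch((rank + 1) * 2); // assumption: fetch twice what the request needs
        }
        return ids.get(rank);
    }

    /** "Document" at the given rank, reconstituted only on a cache miss. */
    String doc(int rank) {
        int docId = id(rank);
        String d = docCache.get(docId);
        if (d == null) {
            loads++;
            d = loader.apply(docId);
            docCache.put(docId, d);
        }
        return d;
    }

    public static void main(String[] args) {
        // Fake index with 1000 hits whose IDs equal their ranks.
        HitsCacheSketch hits = new HitsCacheSketch(
            n -> IntStream.range(0, Math.min(n, 1000)).toArray(),
            id -> "doc-" + id);
        hits.doc(5);   // miss: reconstitutes the document
        hits.doc(5);   // hit: served from the document cache
        hits.id(150);  // beyond the first 100: triggers a re-search
        System.out.println("searches=" + hits.searches + ", loads=" + hits.loads);
    }
}
```

In this model, paging within the first 100 results never re-runs the query, and revisiting a recently viewed page does not reload its documents, which matches the access pattern the message says the caching is tuned for.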