Date: Wed, 14 Mar 2007 11:15:06 -0700 (PDT)
From: Chris Hostetter
To: java-user@lucene.apache.org
Subject: Re: Performance between Filter and HitCollector?
In-Reply-To: <45F62B04.1050702@teamware.com>

It's kind of an Apples/Oranges comparison ...
In the examples you gave below, one executes an arbitrary Query (which could be anything), while the other does a simple term enumeration. Assuming that Query is a TermQuery, the Filter is theoretically going to be faster because it doesn't have to compute any scores. Generally speaking, a Filter will always be a little faster than a functionally equivalent Query for the purposes of building up a simple BitSet of matching documents, because the Query involves the score calculations ... but the Query is generally more usable.

The Query can also be more efficient in other ways, because the HitCollector doesn't *have* to build a BitSet; it can deal with the results in whatever way it wants (whereas a Filter always generates a BitSet).

Solr goes the HitCollector route for a few reasons:

1) It allows us to use the DocSet abstraction, which allows other performance benefits over straight BitSets.
2) It allows us to have simpler code that builds DocSets and DocLists (DocLists know about scores, sorting, and pagination) in a single pass when scores or sorting are requested.

-Hoss
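
For illustration only, here is a minimal sketch of the trade-off described above, written in plain Java with no Lucene dependency. Every name in it (FilterVsCollector, Collector, filterBits, search, the termFreqs array, and the square-root toy score) is a hypothetical stand-in and not the Lucene or Solr API. It only models the shape of the two approaches: the filter-style path always builds a BitSet and never scores, while the collector-style path pays for a score on every match but lets the callback do whatever it wants with the hit.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class FilterVsCollector {

    /** Hypothetical stand-in for a HitCollector-style callback: called once per matching doc. */
    public interface Collector {
        void collect(int doc, float score);
    }

    /**
     * Filter-style: walk a toy posting list and set one bit per matching doc.
     * termFreqs[doc] is how often the term occurs in that doc (0 means no match).
     * No score is computed anywhere on this path.
     */
    public static BitSet filterBits(int[] termFreqs) {
        BitSet bits = new BitSet(termFreqs.length);
        for (int doc = 0; doc < termFreqs.length; doc++) {
            if (termFreqs[doc] > 0) {
                bits.set(doc);
            }
        }
        return bits;
    }

    /**
     * Query-style: every match is scored before it is handed to the collector.
     * The score here is a toy value (sqrt of term frequency), not Lucene's formula.
     */
    public static void search(int[] termFreqs, Collector collector) {
        for (int doc = 0; doc < termFreqs.length; doc++) {
            if (termFreqs[doc] > 0) {
                float score = (float) Math.sqrt(termFreqs[doc]);
                collector.collect(doc, score);
            }
        }
    }

    public static void main(String[] args) {
        int[] termFreqs = {0, 3, 0, 1, 4};

        // Filter route: the result is always a BitSet.
        BitSet viaFilter = filterBits(termFreqs);

        // Collector route, choosing to build the same BitSet (the score is wasted work here).
        BitSet viaCollector = new BitSet();
        search(termFreqs, (doc, score) -> viaCollector.set(doc));

        // Collector route, choosing a different result shape: only docs scoring above 1.5.
        List<Integer> strongHits = new ArrayList<>();
        search(termFreqs, (doc, score) -> {
            if (score > 1.5f) {
                strongHits.add(doc);
            }
        });

        System.out.println(viaFilter);     // {1, 3, 4}
        System.out.println(viaCollector);  // {1, 3, 4}
        System.out.println(strongHits);    // [1, 4]
    }
}
```

The third call is the point of the sketch: a collector can use the score and skip the BitSet entirely, which is roughly the flexibility the DocSet/DocList reasoning above relies on.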