Message-ID: <24082504.post@talk.nabble.com>
Date: Wed, 17 Jun 2009 14:33:53 -0700 (PDT)
From: zehua
To: general@lucene.apache.org
Subject: Re: Question for top term frequency
References: <24062253.post@talk.nabble.com>

Thanks for the reply.

The problem is that the number of global documents may be huge, for example 10,000. If we return all of these documents and find the top authors by looping over the term frequencies, it can take a long time.

We are considering using CustomScoreQuery. The first parameter is the normal query that matches the results. The second parameter uses the frequency of the "Author" field to increase the score, so that results from the top authors score higher and are returned first.

Does that make sense?

Ted Dunning wrote:
>
> It is easy to get global document frequencies for all authors.
>
> Then it is easy to build a query that accepts documents from any of the
> top authors.
>
> It requires more than one query, but only a few lines of code.
>
> On Tue, Jun 16, 2009 at 1:30 PM, zehua wrote:
>
>> Is there a good way to do it? I searched the mailing list, and did not
>> find a good match.
>>
>

--
View this message in context: http://www.nabble.com/Question-for-top-term-frequency-tp24062253p24082504.html
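
For reference, the two-step approach Ted describes (collect per-author document frequencies, then restrict the search to the top authors) might be sketched roughly as follows. This is only an illustration in plain Java and does not call any Lucene API: the class name TopAuthors, the helpers topAuthors and authorClause, and the sample counts are all hypothetical, and the counts are assumed to have already been read from the index by whatever means is available. The resulting clause uses ordinary query-parser syntax and would be combined with the normal query as a required clause.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: picks the most frequent authors
// and turns them into a query-syntax clause such as author:("a" OR "b").
final class TopAuthors {

    /** Returns up to k authors, ordered by descending document frequency. */
    public static List<String> topAuthors(Map<String, Integer> docFreqByAuthor, int k) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(docFreqByAuthor.entrySet());
        // Highest frequency first; ties are broken by author name so the
        // result is deterministic.
        entries.sort((a, b) -> {
            int byFreq = Integer.compare(b.getValue(), a.getValue());
            return byFreq != 0 ? byFreq : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    /** Builds field:("a1" OR "a2" ...) with backslashes and quotes escaped. */
    public static String authorClause(String field, List<String> authors) {
        StringBuilder sb = new StringBuilder(field).append(":(");
        for (int i = 0; i < authors.size(); i++) {
            if (i > 0) {
                sb.append(" OR ");
            }
            String escaped = authors.get(i).replace("\\", "\\\\").replace("\"", "\\\"");
            sb.append('"').append(escaped).append('"');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // Made-up counts standing in for frequencies read from the index.
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("alice", 5);
        docFreq.put("bob", 2);
        docFreq.put("carol", 9);

        List<String> top = topAuthors(docFreq, 2);
        System.out.println(top);
        System.out.println(authorClause("author", top));
    }
}
```

Sorting the author counts once and issuing a second, narrower query avoids pulling back every matching document just to count authors, which is the cost the message above is worried about.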