From: "Robichaud, Jean-Philippe"
To: java-user@lucene.apache.org
Subject: RE: Need a way to set a result limit on a particular field
Date: Wed, 15 Jun 2005 14:12:38 -0400

It may be simpler and more effective to use the Hits object: keep a count
of how many times each host has actually been "returned" to the user, and
skip a hit if its host has already reached the limit. This way, if your
users only look at the 10-20 highest hits, you will save a lot of
processing time, especially if your index is huge...

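The idea above (count how often each host has been shown, skip hits from hosts that are already at the limit, stop once the page is full) could be sketched roughly as follows. This is only an illustration under stated assumptions: `Hit`, `HostLimitSketch` and `firstPage` are hypothetical names, the ranked list stands in for Lucene's Hits object, and a plain HashMap replaces whatever counter class the poster used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HostLimitSketch {

    /** Hypothetical stand-in for one ranked search hit; not a Lucene class. */
    static final class Hit {
        final String hostId;
        final String title;

        Hit(String hostId, String title) {
            this.hostId = hostId;
            this.title = title;
        }
    }

    /**
     * Walks the hits in rank order and collects up to pageSize of them,
     * skipping any hit whose host has already been shown maxPerHost times.
     * The loop stops as soon as the page is full, so lower-ranked hits
     * are never examined.
     */
    static List<Hit> firstPage(List<Hit> rankedHits, int maxPerHost, int pageSize) {
        Map<String, Integer> shownPerHost = new HashMap<>();
        List<Hit> page = new ArrayList<>();
        for (Hit hit : rankedHits) {
            if (page.size() >= pageSize) {
                break; // page is full, stop early
            }
            int shown = shownPerHost.getOrDefault(hit.hostId, 0);
            if (shown >= maxPerHost) {
                continue; // this host already reached its limit
            }
            shownPerHost.put(hit.hostId, shown + 1);
            page.add(hit);
        }
        return page;
    }

    public static void main(String[] args) {
        // Made-up sample data, already sorted by descending score.
        List<Hit> hits = Arrays.asList(
            new Hit("a.com", "a1"), new Hit("a.com", "a2"),
            new Hit("a.com", "a3"), new Hit("a.com", "a4"),
            new Hit("a.com", "a5"), new Hit("b.com", "b1"),
            new Hit("a.com", "a6"), new Hit("c.com", "c1"));
        for (Hit hit : firstPage(hits, 3, 5)) {
            System.out.println(hit.hostId + " " + hit.title);
        }
    }
}
```

With the sample data, the fourth and later hits from a.com are skipped, and the page fills with lower-ranked hits from other hosts. To serve later pages the same way, the per-host counts and the position in the ranked list would have to be carried over between calls.
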
Here is some pseudo code stripped from a class I once wrote:

  Hits hits = iSearcher.search(myQuery);
  IntHash hostFreqCount = new IntHash();
  int i=0;
  int j=0;
  while(i < hits.length()) {
    j=0;
    for(; (i [...] 3) continue;
      /// show the hit to the user...
    }
  }

Hope it helped!

Jp

-----Original Message-----
From: Jay Hill [mailto:jayallenhill@gmail.com]
Sent: Wednesday, June 15, 2005 2:01 PM
To: java-user@lucene.apache.org
Subject: Re: Need a way to set a result limit on a particular field

Thanks Tony and Erik for the replies. The trick is we don't know the hosts
that will be returned in advance, we just don't want more than 3 from any
one host. It's not unlike searching on Google where you might see a link
that says "More results from foo.com". We essentially want to discard any
results > 3 for any one host. In some of our searches we might get high
scores on 20 or 30 documents, but we don't want to show page after page
from the same host, we'd rather limit it to 3 from each for more diversity.

I may have to use a brute force approach using HitCollector as Tony
suggests. I was hoping to avoid the HitCollector, but there may be no other
way right now.

Many thanks,
-Jay

On 6/14/05, Erik Hatcher wrote:
>
> On Jun 14, 2005, at 7:23 PM, Jay Hill wrote:
> > I have a need to limit my Hits returned based on one of the indexed
> > fields. This is a web application and we want to limit the number of
> > hits from any one host. We have a field named "host_id" and I'd like
> > to be able to limit my results to no more than three results for any
> > one host_id.
>
> I may not be fully understanding your question, but I'll go with my
> assumptions... wrap the user's query into a BooleanQuery as a required
> clause and then add another clause with a TermQuery for the specific
> host_id. Then simply constrain the number of Hits shown to the first
> 3. Does that do what you're after?
>
> Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>