lucene-java-user mailing list archives

From Jay Hill <jayallenh...@gmail.com>
Subject Re: Need a way to set a result limit on a particular field
Date Thu, 16 Jun 2005 22:52:15 GMT
Thanks Richard, I'll check it out.

-Jay

On 6/16/05, Richard Krenek <richard.krenek@gmail.com> wrote:
> To add to this option, you may want to use this patch
> http://issues.apache.org/bugzilla/show_bug.cgi?id=27743
> This way instead of pulling the entire document back each time, just
> pull back your host field. Then do your check and only pull back the
> rest of the document if you need to. This will help with speed if you
> are going through a lot of documents and each document is big.
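A minimal sketch of the "read the host field first" idea, with no Lucene dependency. The API that the patch exposes is not described in this thread, so HitStore, hostOf, fullDocument and LazyHostFilter below are all hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an index reader; not a Lucene interface. */
interface HitStore {
    String hostOf(int docId);        // cheap: reads only the host field
    String fullDocument(int docId);  // expensive: loads every stored field
}

class LazyHostFilter {
    /** Loads the full document only for hits that pass the per-host check. */
    static List<String> load(HitStore store, int[] rankedDocIds, int perHost) {
        Map<String, Integer> seenPerHost = new HashMap<>();
        List<String> docs = new ArrayList<>();
        for (int docId : rankedDocIds) {
            String host = store.hostOf(docId);
            // count this hit against its host, then compare with the limit
            int count = seenPerHost.merge(host, 1, Integer::sum);
            if (count <= perHost) {
                docs.add(store.fullDocument(docId));
            }
        }
        return docs;
    }
}
```

The point of the split is that fullDocument is never called for a hit that is going to be discarded anyway.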
> 
> On 6/15/05, Jay Hill <jayallenhill@gmail.com> wrote:
> > I like this approach. This may be what I'm looking for.
> >
> > Thanks JP!
> > -Jay
> >
> > On 6/15/05, Robichaud, Jean-Philippe
> > <Jean-Philippe.Robichaud@scansoft.com> wrote:
> > >
> > > > It may be simpler and more effective to use the Hits object, keep a count
> > > > of how many times each host was actually "returned" to the user, and skip a
> > > > hit if its host's limit has been reached.  This way, if your users just look
> > > > at the 10-20 highest hits, you will save yourself a lot of processing time,
> > > > especially if your index is huge...
> > >
> > > Here is some pseudo code stripped from a class I once wrote
> > >
> > >
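> > > > Hits hits = iSearcher.search(myQuery);
> > > > // per-host counter (IntHash is a helper class, not shown)
> > > > IntHash hostFreqCount = new IntHash();
> > > >
> > > > int i = 0;
> > > > int j = 0;
> > > >
> > > > while (i < hits.length()) {
> > > >  j = 0;  // number of hits shown on the current page
> > > >  for (; (i < hits.length() && j < 10); i++) {
> > > >
> > > >   Document doc = hits.doc(i);
> > > >   String host_id = doc.get("host_id");
> > > >   hostFreqCount.inc(host_id);
> > > >
> > > >   // skip the hit once this host has already appeared 3 times
> > > >   if (hostFreqCount.get(host_id) > 3) continue;
> > > >
> > > >   ///  show the hit to the user...
> > > >   j++;  // only displayed hits fill the page
> > > >  }
> > > > }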
> > >
> > >
> > > Hope it helped !
> > >
> > > Jp
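A self-contained sketch of the same per-host cap, using a plain HashMap in place of the IntHash helper. It assumes the host_id of every hit has already been read into a list in rank order; HostLimiter and firstPage are made-up names for illustration, not Lucene API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Caps how many hits per host are kept, walking hits in rank order. */
class HostLimiter {

    /**
     * @param rankedHosts host_id of each hit, best-scoring hit first
     * @param perHost     maximum hits to keep for any one host
     * @param pageSize    stop once this many hits have been kept
     * @return indexes (into rankedHosts) of the hits to display
     */
    static List<Integer> firstPage(List<String> rankedHosts, int perHost, int pageSize) {
        Map<String, Integer> shownPerHost = new HashMap<>();
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < rankedHosts.size() && kept.size() < pageSize; i++) {
            String host = rankedHosts.get(i);
            int alreadyShown = shownPerHost.getOrDefault(host, 0);
            if (alreadyShown >= perHost) {
                continue; // host has used up its quota; skip without filling a page slot
            }
            shownPerHost.put(host, alreadyShown + 1);
            kept.add(i);
        }
        return kept;
    }
}
```

Because the loop stops as soon as the page is full, only as many hits are inspected as are needed to fill it, which is where the saving on a large index would come from.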
> > >
> > >
> > > -----Original Message-----
> > > From: Jay Hill [mailto:jayallenhill@gmail.com]
> > > Sent: Wednesday, June 15, 2005 2:01 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Re: Need a way to set a result limit on a particular field
> > >
> > > Thanks Tony and Erik for the replies. The trick is we don't know the
> > > hosts that will be returned in advance, we just don't want more than 3
> > > from any one host. It's not unlike searching on Google where you might
> > > see a link that says "More results from foo.com". We essentially want
> > > to discard any results > 3 for any one host. In some of our searches
> > > we might get high scores on 20 or 30 documents, but we don't want to
> > > show page after page from the same host, we'd rather limit it to 3
> > > from each for more diversity.
> > >
> > > I may have to use a brute force approach using HitCollector as Tony
> > > suggests. I was hoping to avoid the HitCollector, but there may be no
> > > other way right now.
> > >
> > > Many thanks,
> > > -Jay
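For comparison, a rough sketch of the brute-force route: gather every match first, sort by score, then keep at most N per host. It assumes the collected matches are not already in rank order, which is why the sort is there. Match and BruteForceLimiter are illustrative names only, and nothing here calls Lucene:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One collected match; the fields are illustrative only. */
class Match {
    final int docId;
    final float score;
    final String host;

    Match(int docId, float score, String host) {
        this.docId = docId;
        this.score = score;
        this.host = host;
    }
}

class BruteForceLimiter {
    /** Returns the matches sorted by descending score, at most perHost per host. */
    static List<Match> topPerHost(List<Match> collected, int perHost) {
        List<Match> byScore = new ArrayList<>(collected);
        byScore.sort(Comparator.comparingDouble((Match m) -> m.score).reversed());

        Map<String, Integer> keptPerHost = new HashMap<>();
        List<Match> result = new ArrayList<>();
        for (Match m : byScore) {
            int kept = keptPerHost.getOrDefault(m.host, 0);
            if (kept < perHost) {
                keptPerHost.put(m.host, kept + 1);
                result.add(m);
            }
        }
        return result;
    }
}
```

The cost is that every match is held in memory and sorted, even if the user only ever looks at the first page.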
> > >
> > >
> > > On 6/14/05, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> > > >
> > > > On Jun 14, 2005, at 7:23 PM, Jay Hill wrote:
> > > > > > I have a need to limit my Hits returned based on one of the indexed
> > > > > > fields. This is a web application and we want to limit the number of
> > > > > > hits from any one host. We have a field named "host_id" and I'd like
> > > > > > to be able to limit my results to no more than three results for any
> > > > > > one host_id.
> > > >
> > > > I may not be fully understanding your question, but I'll go with my
> > > > > assumptions... wrap the user's query into a BooleanQuery as a required
> > > > clause and then add another clause with a TermQuery for the specific
> > > > host_id.  Then simply constrain the number of Hits shown to the first
> > > > 3.  Does that do what you're after?
> > > >
> > > >      Erik
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > >
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

