lucene-dev mailing list archives

From haipeng du <haipen...@gmail.com>
Subject Re: limit return results
Date Wed, 07 Sep 2005 19:49:18 GMT
That is just my concern, because of the large number of documents I will
have (at least 4 million).
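
The memory worry is about holding every match at once. As a rough sketch of the alternative discussed in this thread, a collector can keep only the best n hits in a bounded heap, so memory grows with n and not with the number of matching documents. This is plain Java, not Lucene code: the names TopNSketch, Hit and topN are hypothetical, and the score array merely stands in for whatever a real scorer would produce.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical illustration only; not part of the Lucene API. */
class TopNSketch {

    /** A (document id, score) pair standing in for one search hit. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Returns the ids of the n highest-scoring documents, best first.
     * scores[i] is the assumed score of document i; a score <= 0 means "no match".
     */
    static int[] topN(float[] scores, int n) {
        if (n <= 0) {
            return new int[0];
        }
        // Min-heap on score: the weakest hit currently retained sits at the head.
        int capacity = Math.max(1, Math.min(n, scores.length));
        PriorityQueue<Hit> heap =
                new PriorityQueue<>(capacity, Comparator.comparingDouble((Hit h) -> h.score));

        for (int doc = 0; doc < scores.length; doc++) {
            float s = scores[doc];
            if (s <= 0f) {
                continue; // document does not match
            }
            if (heap.size() < n) {
                heap.add(new Hit(doc, s));
            } else if (s > heap.peek().score) {
                // Better than the weakest retained hit: replace it.
                heap.poll();
                heap.add(new Hit(doc, s));
            }
            // Otherwise the hit is dropped immediately and never stored.
        }

        // Drain weakest-first, filling the result from the back so the best comes first.
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll().doc;
        }
        return result;
    }
}
```

Every matching document is still scored, so this bounds memory, not CPU time; at most n Hit objects are alive at any moment.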

On 9/7/05, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> 
> Have you seen out of memory problems? Or are you being preemptive in
> your concerns?
> 
> Erik
> 
> On Sep 7, 2005, at 11:58 AM, haipeng du wrote:
> 
> > The reason I want to limit the returned results is that I do not want
> > to run into out-of-memory problems. I have indexed 3 million documents
> > with Lucene. Sometimes a search will return millions of results. I just
> > want to get the first 100, for example, to show to the user. Even if I
> > use search(query,filter,topDocs), I believe it still returns all the
> > results. So how can I limit what Lucene returns?
> >
> > On 9/7/05, M.Altheim <M.Altheim@open.ac.uk> wrote:
> >
> >>
> >>
> >> Erik Hatcher [mailto:erik@ehatchersolutions.com] wrote:
> >>
> >>>
> >>> On Sep 6, 2005, at 10:47 PM, Murray Altheim wrote:
> >>>
> >>>
> >>>> Erik Hatcher wrote:
> >>>>
> >>>>
> >>>>> Just access the first 100 Hits - simple as that.
> >>>>> Erik
> >>>>>
> >>>>
> >>>> Erik,
> >>>>
> >>>> This question has come up before. For high traffic sites that
> >>>> can't afford to have the search engine accumulating thousands
> >>>> of hits, only to deliver 100, or perhaps just a few, the
> >>>> current approach *seems* like quite a lot of extra processing.
> >>>> Is there some way to have the engine simply stop generating
> >>>> the hit list after it reaches the specified threshold?
> >>>>
> >>>
> >>> The operative word here is "seems". Do you have any evidence that
> >>> doing a basic .search(Query) and only getting the first 100 results
> >>> is too slow?
> >>>
> >>> The HitCollector option that Otis mentioned is one alternative,
> >>> though I don't think it'll be much, if any, faster.
> >>>
> >>
> >> Erik,
> >>
> >> Evidence, no. I'm looking at this from the perspective of the
> >> Open University, where we have over 200,000 students accessing
> >> and searching our online services. Anything that can minimize
> >> the impact on our processors is going to be most welcome, i.e.,
> >> we don't have cycles to waste. If the student is only expecting
> >> the first 10 results and the engine generates 1000, 990 of them
> >> are wasted.
> >>
> >> Murray
> >>
> >> ......................................................................
> >> Murray Altheim http://www.altheim.com/murray/
> >> Strategic & Service Development
> >> The Open University Library
> >> Milton Keynes, Bucks, MK7 6AA, UK
> >>
> >> Ils ont l'orteil de Bouc, & d'un Chevreil l'oreille,
> >> La corne d'un Chamois, & la face vermeille
> >> Comme un rouge Croissant: & dancent toute nuict
> >> Dedans un carrefour, ou pres d'une eau qui bruict.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>
> >>
> >>
> >
> >
> > --
> > Haipeng Du
> > Software Engineer
> > Comphealth,
> > Salt Lake City
> >
> 
> 
> 
> 


-- 
Haipeng Du
Software Engineer
Comphealth, 
Salt Lake City
