lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Performance of "rows" and "start" parameters
Date Mon, 04 Nov 2013 16:52:24 GMT
bq: start=0&rows=30

Let's see the start and rows parameters for a few of
your queries, because on the surface this makes
no sense. If you're always starting at 0, this
shouldn't be happening....

And you say "the second query is visibly slower". You're
talking about the "deep paging" problem, which you shouldn't
notice until your start parameter is at least up in the
thousands, perhaps 10s of thousands.

So unless you're incrementing the start parameter way up
there, there's something else going on.....
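
As an aside, the deep-paging cost Erick describes can be illustrated outside Solr. This is a minimal Python sketch (not Solr internals): to serve start=S&rows=R, the engine must still collect and rank the top S+R hits before discarding the first S of them, so the work per request grows with the start offset:

```python
import heapq
import random

def page(scores, start, rows):
    """Return one page of results by score, descending.

    Even though only `rows` documents are returned, the top
    start + rows documents must be ranked first -- that is the
    deep-paging cost.
    """
    top = heapq.nlargest(start + rows, scores)  # work grows with `start`
    return top[start:start + rows]

scores = [random.random() for _ in range(1_000_000)]
cheap = page(scores, 0, 30)        # ranks the top 30
expensive = page(scores, 100_000, 30)  # ranks the top 100,030
```

With start=0 every request ranks only 30 candidates, which is why a constant start=0 should not get slower from request to request.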

You should be seeing this reflected in your QTimes, BTW; if
not, then you're seeing something else, perhaps just
too much happening on the box...

FWIW,
Erick


On Mon, Nov 4, 2013 at 11:01 AM, Michael Della Bitta <
michael.della.bitta@appinions.com> wrote:

> The query time increases because in order to calculate the set of documents
> that belongs in page N, you must first calculate all the pages prior to
> page N, and this information is not stored in between requests.
>
> Two ways of speeding this stuff up are to request bigger pages, and/or use
> filter queries over some sort of orderable field in your index to do the
> paging. So for example, if you have a timestamp field in your index, and
> your data represents 100 days, doing 100 queries, one for each day, is much
> better than doing 100 queries using start/rows.
>
>
> Michael Della Bitta
>
> Applications Developer
>
> o: +1 646 532 3062  | c: +1 917 477 7906
>
> appinions inc.
>
> “The Science of Influence Marketing”
>
> 18 East 41st Street
>
> New York, NY 10017
>
> t: @appinions <https://twitter.com/Appinions> | g+:
> plus.google.com/appinions <https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts>
> w: appinions.com <http://www.appinions.com/>
>
>
> On Mon, Nov 4, 2013 at 8:43 AM, michael.boom <my_sky_mc@yahoo.com> wrote:
>
> > I saw that some time ago there was a JIRA ticket discussing this, but I
> > still found no relevant information on how to deal with it.
> >
> > When working with a big number of docs (70M in my case), I'm using
> > start=0&rows=30 in my requests.
> > For the first request the query time is OK, the next one is visibly
> > slower, the third slower still, and so on until I get some huge query
> > times of up to 140 secs after a few hundred requests. My tests were done
> > with SolrMeter at a rate of 1000 qpm. The same thing happens at 100 qpm,
> > though.
> >
> > Is there a best practice for this situation, or maybe an explanation of
> > why the query time increases from request to request?
> >
> > Thanks!
> >
> >
> >
> > -----
> > Thanks,
> > Michael
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Performance-of-rows-and-start-parameters-tp4099194.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>
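
The timestamp-based paging Michael suggests above, walking the index with one fq range per day while keeping start=0 on every request, might be sketched like this. (The field name `timestamp_dt` and the date range are illustrative placeholders, not from the thread; Solr's `[a TO b}` syntax makes the upper bound exclusive so days don't overlap.)

```python
from datetime import date, timedelta

def daily_filter_queries(first_day, num_days, rows=30):
    """Yield Solr query params that page through results one day at a
    time via an fq range, so every request keeps start=0."""
    for i in range(num_days):
        day = first_day + timedelta(days=i)
        nxt = day + timedelta(days=1)
        yield {
            "q": "*:*",
            # exclusive upper bound: [start TO end}
            "fq": f"timestamp_dt:[{day}T00:00:00Z TO {nxt}T00:00:00Z}}",
            "start": 0,
            "rows": rows,
        }

# 100 days of data -> 100 cheap queries instead of 100 ever-deeper pages
params = list(daily_filter_queries(date(2013, 8, 1), 100))
```

Each generated parameter set can be sent as an ordinary /select request; because start never moves, no request pays the deep-paging cost.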
