lucene-solr-user mailing list archives

From Yonik Seeley <yo...@heliosearch.com>
Subject Re: Is the act of *caching* an fq very expensive? (seems to cost 4 seconds in my example)
Date Wed, 04 Jun 2014 02:17:26 GMT
On Tue, Jun 3, 2014 at 9:48 PM, Brett Hoerner <brett@bretthoerner.com> wrote:
> Yonik, I'm familiar with your blog posts -- and thanks very much for them.
> :) Though I'm not sure what you're trying to show me with the q=*:* part? I
> was of course using q=*:* in my queries, but I assume you mean to leave off
> the text:lol bit?
>
> I've done some cluster changes, so these are my baselines:
>
> q=*:*
> fq=created_at_tdid:[1392768004 TO 1393944400] (uncached at this point)
> ~7.5 seconds
>
> q=*:*
> fq={!cache=false}created_at_tdid:[1392768005 TO 1393944400]
> ~7.5 seconds (I guess this is what you were trying to show me?)

Correct.

> The thing is, my queries are always more "specific" than that, so given a
> string:
>
> q=*:*
> fq=text:basketball
> fq={!cache=false}created_at_tdid:[1392768007 TO 1393944400]
> ~5.2 seconds
>
> q=*:*
> fq=text:basketball
> fq={!cache=false}created_at_tdid:[1392768005 TO 1393944400]
> ~1.6 seconds
>
> Is there no hope for my first time fq searches being as fast as non-cached
> fqs?

Not really...
Think of it as an optimization in the cache=false case: since the
result won't be reused, we can consider the rest of the request
context (the main query and the other filters) and use it to skip
over a lot of useless work.  If a filter will be cached, it must be
valid for all contexts, not just the current one (hence we can't skip
any work generating it).
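
To make the difference concrete, here is a rough sketch in Python. It
is a toy model, not Solr's actual implementation: the in-memory `docs`
list, the field names, and the helpers `cached_filter` and
`uncached_filter` are all made up for illustration. It only counts how
many documents each approach has to test against the range.

```python
# Toy model only: this is NOT how Solr is implemented internally.
# 100,000 fake documents; every 1000th one mentions "basketball".
docs = [
    {"id": i, "text": "basketball" if i % 1000 == 0 else "other", "created_at": i}
    for i in range(100_000)
]


def in_range(doc, lo, hi):
    """The range test that the filter applies to one document."""
    return lo <= doc["created_at"] <= hi


def cached_filter(lo, hi):
    """Cacheable filter: must be valid for any request, so test every document."""
    checks = 0
    matches = set()
    for doc in docs:
        checks += 1
        if in_range(doc, lo, hi):
            matches.add(doc["id"])
    return matches, checks


def uncached_filter(candidates, lo, hi):
    """Uncached filter: only test documents that already match the rest of the request."""
    checks = 0
    matches = set()
    for doc in candidates:
        checks += 1
        if in_range(doc, lo, hi):
            matches.add(doc["id"])
    return matches, checks


# Documents that survive the other clause of the request (text:basketball).
candidates = [d for d in docs if d["text"] == "basketball"]

full_set, cached_checks = cached_filter(20_000, 80_000)
narrow_set, uncached_checks = uncached_filter(candidates, 20_000, 80_000)

print("range tests, cacheable filter:", cached_checks)
print("range tests, uncached filter: ", uncached_checks)
```

Both paths give the same answer for this request, but only the full
set from the cacheable path can be reused by a later request with a
different q or a different text filter.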

An analogy... say that my job is determining if a given number is
prime.  Call it a prime-number filter.
Cached case: I figure out all prime numbers less than 1000 and save
the list (and have a very fast lookup).  Every person that comes up to
me after that and gives me a single number (or a small handful) I can
quickly answer if it's prime or not.
Uncached case: A single person walks up to me and asks if a single
number is prime.  I check that number only.

The uncached case is obviously faster the first time because much less
work is done.
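
The same analogy as a small Python sketch (the function names and the
trial-division test are just illustrative choices, nothing to do with
Solr itself):

```python
def is_prime(n):
    """Uncached case: check a single number by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def build_prime_cache(limit):
    """Cached case: do all the work up front for every number below limit."""
    return {n for n in range(limit) if is_prime(n)}


# Cached: pay for all 1000 checks once, then every lookup is a set membership test.
primes_below_1000 = build_prime_cache(1000)
print(997 in primes_below_1000)

# Uncached: only the one number that was asked about gets checked.
print(is_prime(997))
```

The first question is cheaper in the uncached case; the cache only
pays off once enough later questions reuse it.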

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters&fieldcache
