lucene-solr-user mailing list archives

From Yonik Seeley
Subject Re: Is the act of *caching* an fq very expensive? (seems to cost 4 seconds in my example)
Date Tue, 03 Jun 2014 21:19:08 GMT
On Tue, Jun 3, 2014 at 4:44 PM, Brett Hoerner wrote:
> If I run a query like this,
> fq=text:lol
> fq=created_at_tdid:[1400544000 TO 1400630400]
> It takes about 6 seconds. Following queries take only 50ms or less, as
> expected because my fqs are cached.
> However, if I change the query to not cache my big range query:
> fq=text:lol
> fq={!cache=false}created_at_tdid:[1400544000 TO 1400630400]
> It takes 2 seconds every time, which is a much better experience for my
> "first query for that range."
> What's odd to me is that I would expect both of these (first) queries to
> have to do the same amount of work, except the first one stuffs the
> resulting bitset into a map at the end... which seems to have a 4 second
> overhead?

They are not equivalent.  Caching the filter separately (so it can be
reused in any combination with other queries and filters) means that
*all* docs that match the filter are collected and cached (it's the
collection that is taking the time).
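
As a rough illustration of why the cached case pays up front, here is a toy Python sketch. It is not Solr's filterCache implementation: the `filter_cache` dict, the `cached_filter` helper, and the made-up index are all assumptions for illustration. The point is only that the first call has to visit every document matching the range before anything is intersected with the other filter.

```python
# Hypothetical cache: filter string -> frozenset of ALL matching doc ids.
filter_cache = {}

def cached_filter(fq, predicate, index):
    """Return every doc id matching `predicate`, computing it once per `fq`."""
    if fq not in filter_cache:
        # First use: the whole index is scanned and every match is collected,
        # regardless of how few docs the other clauses will end up keeping.
        filter_cache[fq] = frozenset(
            doc for doc, value in index.items() if predicate(value)
        )
    return filter_cache[fq]

# Toy index (made-up data): doc id -> created_at value.
index = {doc: 1400500000 + doc * 1000 for doc in range(200)}

def in_range(ts):
    return 1400544000 <= ts <= 1400630400

range_docs = cached_filter(
    "created_at_tdid:[1400544000 TO 1400630400]", in_range, index
)

# Docs matching the other filter (stand-in for fq=text:lol).
text_docs = {50, 100, 150}

# The cached set can now be intersected with any other query or filter.
hits = sorted(range_docs & text_docs)
```

In this sketch the range filter collects dozens of doc ids even though only a couple survive the intersection; later calls with the same filter string skip the scan entirely.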

For the {!cache=false} case, Solr executes different code (it doesn't
just skip the caching step).

"""When a filter isn’t generated up front and cached, it’s executed in
parallel with the main query. First, the filter is asked about the
first document id that it matches. The query is then asked about the
first document that is equal to or greater than that document. The
filter is then asked about the first document that is equal to or
greater than that. The filter and the query play this game of leapfrog
until they land on the same document and it’s declared a match, after
which the document is collected and scored."""
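
The leapfrog described in that passage can be sketched in a few lines of Python. This is a simplified model, not Lucene's or Solr's actual code: `SortedDocIterator`, its `advance` method, and the sample doc ids are hypothetical stand-ins for real doc id iterators.

```python
from bisect import bisect_left

NO_MORE_DOCS = float("inf")  # sentinel: iterator is exhausted

class SortedDocIterator:
    """Toy iterator over sorted doc ids (hypothetical stand-in)."""

    def __init__(self, doc_ids):
        self.doc_ids = sorted(doc_ids)
        self.advance_calls = 0  # counts how often this side was consulted

    def advance(self, target):
        """Return the first doc id >= target, or NO_MORE_DOCS."""
        self.advance_calls += 1
        i = bisect_left(self.doc_ids, target)
        return self.doc_ids[i] if i < len(self.doc_ids) else NO_MORE_DOCS

def leapfrog(filter_it, query_it):
    """Collect doc ids on which both iterators land."""
    matches = []
    doc = filter_it.advance(0)  # first doc the filter matches
    while doc != NO_MORE_DOCS:
        q = query_it.advance(doc)  # first query doc >= filter doc
        if q == NO_MORE_DOCS:
            break
        if q == doc:
            matches.append(doc)  # both agree: it's a match
            doc = filter_it.advance(doc + 1)
        else:
            doc = filter_it.advance(q)  # filter jumps to the query's doc
    return matches

# A wide filter (like a big range) and a selective query (like text:lol).
filter_it = SortedDocIterator(range(100, 10000))
query_it = SortedDocIterator([5, 150, 4000, 20000])
matches = leapfrog(filter_it, query_it)
```

Here the filter matches thousands of docs but is only asked about a handful of them, because the selective query lets it skip ahead. That is the work the cached path cannot avoid on its first run.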

So try:
  fq=created_at_tdid:[1400544000 TO 1400630400]

-Yonik - facet functions, subfacets, off-heap filters&fieldcache
