lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: enable disable filter query caching based on statistics
Date Tue, 05 Jan 2016 18:17:20 GMT
&fq={!cache=false}n_rea:xxx&fq={!cache=false}provincia:yyyy,fq={!cache=false}type:zzzz

You have a comma in front of the last fq clause, typo?

Well, the whole point of caching filter queries is so that the
_second_ time you use one, very little work has to be done. That
comes at a cost, of course, for first-time execution. Basically, any
fq clause that you can guarantee won't be re-used should have
cache=false set.

I'd be surprised if not caching were faster the second time you use
the provincia and type fq clauses, but I've been surprised before. I
guess ANDing two bitsets together could take more time than, say,
testing a small number of individual documents....
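
To make that guess concrete, here's a rough Python sketch of the two
strategies. It is only an illustration, not how Lucene or Solr implement
filter intersection: the doc-id lists, the use of Python ints as bitsets,
and the helper names (to_bitset, and_bitsets, probe_candidates) are all
made up for the example.

```python
from bisect import bisect_left
from functools import reduce

def to_bitset(doc_ids):
    """Pack doc ids into one Python int used as a bit vector (illustrative only)."""
    return reduce(lambda bits, doc: bits | (1 << doc), doc_ids, 0)

def and_bitsets(*bitsets):
    """AND whole bitsets; the work grows with the size of the index."""
    combined = reduce(lambda a, b: a & b, bitsets)
    return [doc for doc in range(combined.bit_length()) if (combined >> doc) & 1]

def probe_candidates(selective, *others):
    """Test only the few docs matched by the selective clause against the
    other (sorted) doc-id lists, using binary search."""
    def contains(sorted_ids, doc):
        i = bisect_left(sorted_ids, doc)
        return i < len(sorted_ids) and sorted_ids[i] == doc
    return [doc for doc in selective if all(contains(ids, doc) for ids in others)]

# Hypothetical data: one very selective clause, two broad ones.
n_rea = [42]
provincia = list(range(0, 1000, 2))
doc_type = list(range(0, 1000, 3))

via_bitsets = and_bitsets(to_bitset(n_rea), to_bitset(provincia), to_bitset(doc_type))
via_probing = probe_candidates(n_rea, provincia, doc_type)
```

Both give the same matches; the difference is that the second one only
ever looks at as many documents as the selective clause returns.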

And I'm assuming that you're testing multiple queries rather than just one-offs.

If you _do_ know that some of your clauses are very restrictive, I
wonder what happens if you add a cost in. fq's are evaluated in cost
order (when cache=false), so what happens in this case?

&fq={!cache=false cost=101}n_rea:xxx&fq={!cache=false cost=102}provincia:yyyy&fq={!cache=false cost=103}type:zzzz
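
If you're assembling these requests in code, something like the following
Python sketch would build that query string with proper URL encoding. The
helper name uncached_fq is hypothetical, and the field values are the
placeholders from above; only the cache and cost local params come from
Solr itself.

```python
from urllib.parse import urlencode

def uncached_fq(clause, cost):
    """Prefix an fq clause with the cache=false and cost local params."""
    return f"{{!cache=false cost={cost}}}{clause}"

# Lowest cost first: the most selective clause should be checked first.
params = [
    ("q", "*:*"),
    ("fq", uncached_fq("n_rea:xxx", 101)),
    ("fq", uncached_fq("provincia:yyyy", 102)),
    ("fq", uncached_fq("type:zzzz", 103)),
]

# urlencode keeps the repeated fq keys and escapes braces, spaces, etc.
query_string = urlencode(params)
```

Building it this way also avoids separator typos like the comma above.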

Best,
Erick

On Tue, Jan 5, 2016 at 9:41 AM, Matteo Grolla <matteo.grolla@gmail.com> wrote:
> Thanks Erick and Binoy,
>      This is a case I stumbled upon: with queries like
>
> q=*:*&fq={!cache=false}n_rea:xxx&fq={!cache=false}provincia:yyyy,fq={!cache=false}type:zzzz
>
> where the n_rea filter is highly selective.
> I was able to get a more than 3x performance improvement by disabling
> the cache.
>
> I think it's because the last two filters are not so selective: they are
> resolved to two bitsets which are then ANDed together,
> and this is less efficient than leapfrogging, since the first filter has
> just one or two results.
> Does that make sense to you?
>
> 2016-01-05 16:59 GMT+01:00 Erick Erickson <erickerickson@gmail.com>:
>
>> Matteo:
>>
>> Let's see if I understand your problem. Essentially you want
>> Solr to analyze the filter queries and decide through some
>> algorithm which ones to cache. I have a hard time thinking of
>> any general way to do this; certainly there's nothing in Solr
>> that does this automatically. As Binoy mentions, there are some
>> ways to influence what goes in the cache, but the algorithm is
>> simple....
>>
>> If you build such a thing, I suspect you'll be implicitly building
>> in knowledge of how your particular application uses Solr. For
>> sure, the functionality around "no cache filters" is there explicitly
>> because some fq clauses (think ACL calculations) can be
>> very expensive to calculate for the entire corpus (which is what
>> fqs do by default).
>>
>> But you really haven't given us any examples of what sorts
>> of fq clauses you consider "bad". Perhaps there are other ways
>> of approaching your problem.
>>
>> Best,
>> Erick
>>
>>
>> On Tue, Jan 5, 2016 at 7:50 AM, Binoy Dalal <binoydalal93@gmail.com>
>> wrote:
>> > What is your exact requirement then?
>> > I ask, because these settings can solve the problems you've mentioned
>> > without the need to add any additional functionality.
>> >
>> > On Tue, Jan 5, 2016 at 9:04 PM Matteo Grolla <matteo.grolla@gmail.com>
>> > wrote:
>> >
>> >> Hi Binoy,
>> >>      I know these settings but the problem I'm trying to solve is when
>> >> these settings aren't enough.
>> >>
>> >>
>> >> 2016-01-05 16:30 GMT+01:00 Binoy Dalal <binoydalal93@gmail.com>:
>> >>
>> >> > If I understand your problem correctly, then you don't want the most
>> >> > frequently used fqs removed and you do not want your filter cache to
>> >> > grow to very large sizes.
>> >> > Well, there is already a solution for both of these.
>> >> > In the solrconfig.xml file, you can configure the <filterCache>
>> >> > parameter to suit your needs.
>> >> > a) Use the LeastFrequentlyUsed or LFU eviction policy.
>> >> > b) Set the size to whatever number of fqs you find suitable.
>> >> > You can do this like so:
>> >> > <filterCache class="solr.LFUCache" size="100" initialSize="10"
>> >> > autowarmCount="10"/>
>> >> > You should play around with these parameters to find the best
>> >> > combination for your implementation.
>> >> > For more details take a look here:
>> >> > https://wiki.apache.org/solr/SolrCaching
>> >> > http://yonik.com/advanced-filter-caching-in-solr/
>> >> >
>> >> >
>> >> > On Tue, Jan 5, 2016 at 7:28 PM Matteo Grolla <matteo.grolla@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > Hi,
>> >> > >     after looking at the presentation of cloudsearch from Lucene
>> >> > > Revolution 2014
>> >> > >
>> >> > > https://www.youtube.com/watch?v=RI1x0d-yO8A&list=PLU6n9Voqu_1FM8nmVwiWWDRtsEjlPqhgP&index=49
>> >> > > (min 17:08)
>> >> > >
>> >> > > I realized I'd love to be able to remove the burden of disabling
>> >> > > filter query caching from developers.
>> >> > >
>> >> > > The problem:
>> >> > > Solr by default caches filter queries.
>> >> > > a) When there are filter queries that are not reused and only a few
>> >> > > that are, the good ones get evicted unnecessarily.
>> >> > > b) If the same query has multiple filter queries that are very
>> >> > > selective, I noticed a big performance improvement when disabling
>> >> > > the cache.
>> >> > > c) I'd like to spare developers from deciding what has to be cached
>> >> > > or not.
>> >> > >
>> >> > > The question:
>> >> > > - Is there anything already working to solve those problems?
>> >> > >
>> >> > > What do you think about this?
>> >> > > - I was thinking of writing a plugin to recognize query types with
>> >> > > regular expressions and let Solr admins associate a caching
>> >> > > behaviour with each query type.
>> >> > > - Another idea was to
>> >> > >    - set fq caching off by default
>> >> > >    - keep statistics about fqs
>> >> > >    - enable caching only for the N fqs with the highest hit ratio
>> >> > >
>> >> > --
>> >> > Regards,
>> >> > Binoy Dalal
>> >> >
>> >>
>> > --
>> > Regards,
>> > Binoy Dalal
>>

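For what it's worth, the statistics idea from the original post (caching
off by default, keep counts per fq, enable caching only for the top N)
could be prototyped on the client side with something like the Python
sketch below. Nothing here is an existing Solr feature or API: the class
FqCachingAdvisor and its methods are hypothetical, and "hit ratio" is
assumed to mean the share of all recorded fq clauses that a given clause
accounts for.

```python
from collections import Counter

class FqCachingAdvisor:
    """Hypothetical client-side helper: track fq usage, cache only the top N."""

    def __init__(self, top_n):
        self.top_n = top_n
        self.seen = Counter()   # how often each fq clause was used
        self.total = 0          # total number of fq clauses recorded

    def record(self, fq):
        self.seen[fq] += 1
        self.total += 1

    def hit_ratio(self, fq):
        # Assumed definition: this clause's share of all recorded fq uses.
        return self.seen[fq] / self.total if self.total else 0.0

    def should_cache(self, fq):
        # Only clauses that were actually reused can make the top N.
        top = {c for c, n in self.seen.most_common(self.top_n) if n > 1}
        return fq in top

    def decorate(self, fq):
        """Return the clause as-is if it should be cached, else mark it uncached."""
        return fq if self.should_cache(fq) else "{!cache=false}" + fq

advisor = FqCachingAdvisor(top_n=2)
for clause in ["type:zzzz"] * 5 + ["provincia:yyyy"] * 3 + ["n_rea:1", "n_rea:2"]:
    advisor.record(clause)
```

With that history, advisor.decorate("type:zzzz") leaves the clause alone
while advisor.decorate("n_rea:1") prepends {!cache=false}. As noted above,
any such policy bakes in knowledge of how a particular application uses
Solr.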