lucene-solr-dev mailing list archives

From Chris Hostetter
Subject Re: Query Cache Memory Usage - Could it be better?
Date Wed, 29 Oct 2008 15:35:12 GMT

: We've recently switched over to using a more complex DisMax search for
: some of our queries. In the process of measuring the performance impact,
: I noticed that my Query Cache had grown considerably with these new
: Queries. I made sure to use the same data set for before and after

I would ask if maybe the cache size increase is coming from the recently
discovered bug where dismax (and other types of queries) weren't getting
unique hashCodes (SOLR-805), but since you are the one that reported that
bug, i'm going to guess you already ruled that out with a local patch :)
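
for anyone following along, here is a minimal sketch of one way hashCode/equals
problems on a key class can inflate a cache. it is not the SOLR-805 code or fix;
IdentityKey and ValueKey are hypothetical stand-ins for a query class, and a plain
HashMap stands in for the query cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CacheKeyDemo {
    // Hypothetical stand-in for a query class that does NOT override
    // equals()/hashCode(), so it falls back to object identity.
    static final class IdentityKey {
        final String field;
        final String text;
        IdentityKey(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Same data, but with value-based equals()/hashCode().
    static final class ValueKey {
        final String field;
        final String text;
        ValueKey(String field, String text) {
            this.field = field;
            this.text = text;
        }
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ValueKey)) return false;
            ValueKey other = (ValueKey) o;
            return field.equals(other.field) && text.equals(other.text);
        }
        @Override
        public int hashCode() {
            return Objects.hash(field, text);
        }
    }

    // Insert the "same" logical query several times; every insert is a new entry
    // because no two distinct IdentityKey instances are ever equal.
    static int entriesWithIdentityKeys(int repeats) {
        Map<IdentityKey, String> cache = new HashMap<>();
        for (int i = 0; i < repeats; i++) {
            cache.put(new IdentityKey("title", "solr"), "results");
        }
        return cache.size();
    }

    // With value-based keys, repeated inserts collapse into a single entry.
    static int entriesWithValueKeys(int repeats) {
        Map<ValueKey, String> cache = new HashMap<>();
        for (int i = 0; i < repeats; i++) {
            cache.put(new ValueKey("title", "solr"), "results");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("identity keys: " + entriesWithIdentityKeys(3));
        System.out.println("value keys:    " + entriesWithValueKeys(3));
    }
}
```

with identity-based keys the repeated query never gets a cache hit and each
request adds another entry, which looks like cache growth from the outside.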

: comparisons, and the same query set where I modified the queries to use
: the new search (qt=foo). Before, my queries took up an average of 800
: bytes. Now they take an average of 3500 bytes. Overall, that will mean I
: will hold 75% fewer queries in my query cache than I used to. And the
: query cache is very important for performance.

to make any sort of fair comparison, we need to understand what types of
queries you were executing against the standard request handler and how
that compares with *all* the options you are using on the dismax handler
(ie: standard q vs dismax q,pf,qf,bf,bq) ... i suspect the dismax queries
are much bigger because they are much more complicated.

: is, could we decrease the memory footprint of the queries in the cache,
: by *not* holding onto the Query object itself in the cache at all. Could
: we turn it into a canonical string that represents the query, and use
: that for the key in the cache? I'm assuming that would take up much less
: memory, at least in the case that I'm experiencing.

part of the problem is that there is no canonical string for an arbitrary
query arg -- and even if the strings were unique, there is no one parser
that can reconstruct a Query object from its string representation
(which would be necessary for autowarming)
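
a rough sketch of the string-key problem, using toy classes rather than the real
Lucene ones: ToyQuery, Term, Or and the render() format are all made up for
illustration, and nothing here claims to match what any real Query.toString()
prints. it only shows how two structurally different queries can render to the
same string when the rendering does no escaping.

```java
public class CanonicalStringDemo {
    // Hypothetical minimal query model; not the Lucene Query API.
    interface ToyQuery {
        String render();
    }

    // A single term in a field, rendered as field:text with no escaping.
    static final class Term implements ToyQuery {
        final String field;
        final String text;
        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }
        @Override
        public String render() {
            return field + ":" + text;
        }
    }

    // Two optional clauses, rendered separated by a space.
    static final class Or implements ToyQuery {
        final ToyQuery left;
        final ToyQuery right;
        Or(ToyQuery left, ToyQuery right) {
            this.left = left;
            this.right = right;
        }
        @Override
        public String render() {
            return left.render() + " " + right.render();
        }
    }

    // One term whose text happens to contain a space and a colon.
    static String singleTerm() {
        return new Term("title", "solr body:cache").render();
    }

    // Two separate terms in two different fields.
    static String twoTerms() {
        return new Or(new Term("title", "solr"), new Term("body", "cache")).render();
    }

    public static void main(String[] args) {
        System.out.println(singleTerm());
        System.out.println(twoTerms());
        System.out.println("same string: " + singleTerm().equals(twoTerms()));
    }
}
```

if strings like these were the cache key, the two queries would share one cache
entry, and nothing in the string says which structure to rebuild when autowarming.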

but i suspect if you are seeing that big of a difference in the relative
memory consumption of the Query objects in the cache, it's going to be
because the dismax queries are that much more complicated than what you
were using before.

