db-derby-dev mailing list archives

From Kristian Waagan <kristian.waa...@oracle.com>
Subject Re: question about istats thresholds
Date Thu, 17 Feb 2011 12:03:44 GMT
On 11.02.2011 23:29, Mike Matrigali wrote:
> Thanks, could you take a look at DERBY-4211?  It looks like
> the stat updater is running, but I don't think it should be.
> Basically, what would you expect to happen on a newly created
> table that then has 7 rows added to it?

FYI, in an attempt to stabilize the regression tests, I committed a 
patch (under DERBY-4940) which increases the default absdiff threshold 
from zero to 1000.
Let's see if that's enough, or if there are other aspects of the istat 
daemon interfering with the test(s).
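
For anyone who wants to experiment with other values, here is a minimal sketch that sets the four properties named in this thread as JVM system properties. That Derby picks them up as system properties at engine boot is an assumption on my part, and the class and method names are made up for the example; the properties are debug settings and may change or disappear.

```java
final class IstatThresholdConfig {
    // Property names as quoted in this thread (from DERBY-4934).
    static final String CREATE  = "derby.storage.indexStats.debug.createThreshold";
    static final String ABSDIFF = "derby.storage.indexStats.debug.absdiffThreshold";
    static final String LNDIFF  = "derby.storage.indexStats.debug.lndiffThreshold";
    static final String QUEUE   = "derby.storage.indexStats.debug.queueSize";

    /**
     * Sets the thresholds as JVM system properties.
     * Assumption: this has to run before the Derby engine is booted.
     */
    static void apply(long createThreshold, long absdiffThreshold,
                      double lndiffThreshold, int queueSize) {
        System.setProperty(CREATE, Long.toString(createThreshold));
        System.setProperty(ABSDIFF, Long.toString(absdiffThreshold));
        System.setProperty(LNDIFF, Double.toString(lndiffThreshold));
        System.setProperty(QUEUE, Integer.toString(queueSize));
    }

    public static void main(String[] args) {
        // Values mentioned in this thread: 100, 1000, 1.0 and 5.
        apply(100, 1000, 1.0, 5);
        System.out.println(ABSDIFF + "=" + System.getProperty(ABSDIFF));
    }
}
```

The same values could presumably be passed on the command line with `-D` instead of being set in code.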


> case one: then queries are run from ij
> case two: an index is created on the table, and then queries are run 
> from ij
> Kristian Waagan wrote:
>> On 11.02.11 20:11, Mike Matrigali wrote:
>>> From DERBY-4934 I see there are the following thresholds:
>>>  a) derby.storage.indexStats.debug.createThreshold (100)
>>>  b) derby.storage.indexStats.debug.absdiffThreshold (1000)
>>>  c) derby.storage.indexStats.debug.lndiffThreshold (1.0)
>>>  d) derby.storage.indexStats.debug.queueSize (5)
>>> My question is that I don't understand how they are expected to 
>>> interact.  If a table has fewer than 100 rows, does that mean
>>> stats will not be created even if (b) or (c) is exceeded?
>> Hi Mike,
>> To start with, you can probably ignore threshold (d) for now.
>> It applies to the scheduling phase - that is, when the unit of work is
>> scheduled with the daemon - and to get that far (at least) one of the
>> other thresholds has to be exceeded. If the queue is full the unit of
>> work won't be scheduled, and another attempt may be made at a later time
>> during another statement compilation. This requires that someone
>> actually compiles a relevant query, or potentially that the existing
>> statement is recompiled (stale plan check).
>> The purpose of (d) is to avoid excessive queue growth. Since the queue
>> is implemented as a list, searching it for duplicates may also be
>> expensive if it grows too large.
>> Threshold (a) concerns indexes without existing statistics. If there are
>> fewer than 100 rows in the base table, statistics won't be created.
>> Thresholds (b) and (c) concern indexes with existing statistics.
>> Threshold (b) was introduced to avoid too frequent updates of existing
>> statistics for small tables. I don't remember off the top of my head
>> where it was discussed, but I ended up effectively removing it by
>> setting it to zero for now. I kept the property (and the relevant code)
>> to allow people to experiment somewhat without having to recompile the
>> code if they have an application running into trouble with this 
>> scenario.
>> Finally, the main threshold for existing statistics is (c). Here the
>> natural logarithms of the row estimate of the index statistics and the
>> row estimate of the base table are compared. If the difference is
>> greater than or equal to lndiffThreshold (defaults to 1.0) the
>> statistics for the index are scheduled for update. If the daemon queue
>> is full the request is discarded, assuming another compilation will
>> manage to schedule the update eventually.
>> Hope this helped a bit, feel free to ask additional questions. As I have
>> said before, these thresholds may have to be changed significantly as we
>> test the feature (remove existing, add new ones, or modify existing 
>> ones).
>> Cheers,
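
The interaction of thresholds (a), (b) and (c) described above can be sketched roughly as follows. This is not the Derby source: the class name, the method name and the use of -1 to mean "no existing statistics" are hypothetical, and treating (b) as a gate that has to be passed before (c) is evaluated is my assumption about how the two combine. It is at least consistent with a value of zero effectively disabling (b).

```java
final class IstatThresholdSketch {
    private final long createThreshold;    // (a)
    private final long absdiffThreshold;   // (b)
    private final double lndiffThreshold;  // (c)

    IstatThresholdSketch(long createThreshold, long absdiffThreshold,
                         double lndiffThreshold) {
        this.createThreshold = createThreshold;
        this.absdiffThreshold = absdiffThreshold;
        this.lndiffThreshold = lndiffThreshold;
    }

    /**
     * Decides whether a statistics update should be scheduled.
     *
     * @param tableRows      row estimate of the base table
     * @param indexStatsRows row estimate stored in the index statistics,
     *                       or -1 if the index has no statistics yet
     *                       (hypothetical convention for this sketch)
     */
    boolean shouldSchedule(long tableRows, long indexStatsRows) {
        if (indexStatsRows < 0) {
            // (a): no statistics yet, create only for large enough tables.
            return tableRows >= createThreshold;
        }
        // (b): skip updates while the absolute difference is small.
        long absdiff = Math.abs(tableRows - indexStatsRows);
        if (absdiff < absdiffThreshold) {
            return false;
        }
        // (c): compare the natural logarithms of the two row estimates.
        // Estimates of zero are not handled specially in this sketch.
        double lndiff = Math.abs(Math.log(tableRows) - Math.log(indexStatsRows));
        return lndiff >= lndiffThreshold;
    }
}
```

With the values 100, 1000 and 1.0, a new table with 7 rows and no statistics is not scheduled, because 7 is below (a). An index whose statistics say 7 rows on a table that now has 30 rows is skipped because of (b), but would be scheduled with (b) set to zero, since ln(30) - ln(7) is roughly 1.46.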

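The role of threshold (d) can be illustrated with a small bounded work queue. Again this is only a sketch under assumptions: the class and method names are invented, units of work are identified by a plain string, and checking for duplicates before checking for a full queue is my choice, not something stated in the thread.

```java
import java.util.ArrayList;
import java.util.List;

final class BoundedWorkQueueSketch {
    private final int maxSize;                        // (d) queueSize
    private final List<String> queue = new ArrayList<>();

    BoundedWorkQueueSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    /**
     * Tries to schedule a unit of work.
     *
     * @return true if the work is in the queue when the call returns,
     *         false if it was discarded because the queue is full
     */
    synchronized boolean schedule(String tableId) {
        // Linear scan for duplicates; this is why a large queue gets costly.
        if (queue.contains(tableId)) {
            return true;
        }
        // Full queue: drop the request. A later statement compilation
        // is expected to try again.
        if (queue.size() >= maxSize) {
            return false;
        }
        queue.add(tableId);
        return true;
    }

    /** Removes and returns the oldest unit of work, or null if empty. */
    synchronized String next() {
        return queue.isEmpty() ? null : queue.remove(0);
    }

    synchronized int size() {
        return queue.size();
    }
}
```

A discarded request is not remembered anywhere, so a table that is never queried again would never have its statistics refreshed by this mechanism.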