db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: question about istats thresholds
Date Fri, 11 Feb 2011 22:29:44 GMT
Thanks, could you take a look at DERBY-4211?  It looks like
the stat updater is running, but I don't think it should be.
Basically, what would you expect to happen on a newly created
table that then has 7 rows added to it?

Case one: queries are then run from ij.
Case two: an index is created on the table, and then queries are run from ij.

Kristian Waagan wrote:
> On 11.02.11 20:11, Mike Matrigali wrote:
> >> From DERBY-4934 I see there are the following thresholds:
>>  a) derby.storage.indexStats.debug.createThreshold (100)
>>  b) derby.storage.indexStats.debug.absdiffThreshold (1000)
>>  c) derby.storage.indexStats.debug.lndiffThreshold (1.0)
>>  d) derby.storage.indexStats.debug.queueSize (5)
> >> My question is that I don't understand how they are expected to
> >> interact.  If a table has fewer than 100 rows, does that mean
> >> stats will not be created even if b or c is exceeded?
> Hi Mike,
> To start with, you can probably ignore threshold (d) for now.
> It applies to the scheduling phase - that is when the unit of work is
> scheduled with the daemon - and to get that far (at least) one of the
> other thresholds has to be exceeded. If the queue is full the unit of
> work won't be scheduled, and another attempt may be made at a later time
> during another statement compilation. This requires that someone
> actually compiles a relevant query, or potentially that the existing
> statement is recompiled (stale plan check).
> The purpose of (d) is to avoid excessive queue growth. Since the queue
> is implemented as a list, searching it for duplicates may also be
> expensive if it grows too large.
> Threshold (a) concerns indexes without existing statistics. If there are
> fewer than 100 rows in the base table, statistics won't be created.
> Thresholds (b) and (c) concern indexes with existing statistics.
> Threshold (b) was introduced to avoid too frequent updates of existing
> statistics for small tables. I don't remember off the top of my head
> where it was discussed, but I ended up effectively removing it by
> setting it to zero for now. I kept the property (and the relevant code)
> to allow people to experiment somewhat without having to recompile the
> code if they have an application running into trouble with this scenario.
> Finally, the main threshold for existing statistics is (c). Here the
> natural logarithms of the row estimate of the index statistics and the
> row estimate of the base table are compared. If the difference is
> greater than or equal to lndiffThreshold (defaults to 1.0) the
> statistics for the index are scheduled for update. If the daemon queue
> is full the request is discarded, assuming another compilation will
> manage to schedule the update eventually.
> Hope this helped a bit, feel free to ask additional questions. As I have
> said before, these thresholds may have to be changed significantly as we
> test the feature (remove existing, add new ones, or modify existing ones).
> Cheers,
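
The interaction of the four thresholds described in the quoted reply can be sketched roughly as below. This is only an illustration of the description, not Derby's actual code: the class, enum, and method names are hypothetical, the way (b) and (c) are combined (both must be met) is an assumption, and so is the clamping of row counts to 1 before taking logarithms. Threshold (b) is set to zero as the reply says it currently is, although the list quoted from DERBY-4934 shows 1000.

```java
/** Hypothetical sketch of the istats threshold checks; not Derby source code. */
final class IstatsThresholdSketch {

    enum Decision { SKIP, SCHEDULE, QUEUE_FULL }

    // Defaults as given in the thread.
    static final long CREATE_THRESHOLD = 100;    // (a) createThreshold
    static final long ABSDIFF_THRESHOLD = 0;     // (b) absdiffThreshold, zero per the reply
    static final double LNDIFF_THRESHOLD = 1.0;  // (c) lndiffThreshold
    static final int QUEUE_SIZE = 5;             // (d) queueSize

    /**
     * @param hasStats  whether the index already has statistics
     * @param tableRows row estimate for the base table
     * @param statsRows row estimate stored in the index statistics
     * @param queued    number of work units already in the daemon queue
     */
    static Decision decide(boolean hasStats, long tableRows,
                           long statsRows, int queued) {
        boolean wanted;
        if (!hasStats) {
            // (a): no statistics yet; create only for tables with enough rows.
            wanted = tableRows >= CREATE_THRESHOLD;
        } else {
            // (b): absolute difference guard (assumed to be ANDed with (c)).
            long absDiff = Math.abs(tableRows - statsRows);
            // (c): compare natural logarithms of the two row estimates.
            // Clamping to 1 avoids log(0); this is an assumption of the sketch.
            double lnDiff = Math.abs(Math.log(Math.max(tableRows, 1L))
                                   - Math.log(Math.max(statsRows, 1L)));
            wanted = absDiff >= ABSDIFF_THRESHOLD && lnDiff >= LNDIFF_THRESHOLD;
        }
        if (!wanted) {
            return Decision.SKIP;
        }
        // (d): only reached once another threshold has been exceeded. A full
        // queue discards the request; a later compilation may retry.
        return queued >= QUEUE_SIZE ? Decision.QUEUE_FULL : Decision.SCHEDULE;
    }
}
```

Under these assumptions, the scenario asked about above (a new table with 7 rows and an index without statistics) gives `decide(false, 7, 0, 0) == Decision.SKIP`, since 7 is below createThreshold.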
