db-derby-dev mailing list archives

From Kristian Waagan <krist...@apache.org>
Subject Re: question about istats thresholds
Date Fri, 11 Feb 2011 21:14:29 GMT
On 11.02.11 20:11, Mike Matrigali wrote:
> From DERBY-4934 I see there are the following thresholds:
>  a) derby.storage.indexStats.debug.createThreshold (100)
>  b) derby.storage.indexStats.debug.absdiffThreshold (1000)
>  c) derby.storage.indexStats.debug.lndiffThreshold (1.0)
>  d) derby.storage.indexStats.debug.queueSize (5)
> My question is that I don't understand how they are expected to
> interact.  If a table has fewer than 100 rows, does that mean
> stats will not be created even if b or c is exceeded?

Hi Mike,

To start with, you can probably ignore threshold (d) for now.
It applies to the scheduling phase, that is, when the unit of work is
scheduled with the daemon. To get that far, at least one of the other
thresholds has to be exceeded. If the queue is full, the unit of work
won't be scheduled, and another attempt may be made at a later time
during another statement compilation. This requires that someone
actually compiles a relevant query, or potentially that the existing
statement is recompiled (stale plan check).
The purpose of (d) is to avoid excessive queue growth. Since the queue
is implemented as a list, searching it for duplicates may also be
expensive if it grows too large.
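
To make the role of (d) concrete, here is a minimal sketch of a bounded,
list-backed work queue. It is not the daemon's actual code: the class name,
the method names, the use of a table name as the work identifier, and the
order of the duplicate check and the size check are all assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a bounded work queue; not Derby's daemon code. */
class BoundedWorkQueueSketch {
    // Plays the role of threshold (d), queueSize.
    private final int maxSize;
    // A plain list, so duplicate detection is a linear scan.
    private final List<String> queue = new ArrayList<>();

    BoundedWorkQueueSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    /**
     * Tries to schedule a unit of work, identified here by a table name.
     * Returns true if the work is queued or was already queued, and false
     * if the request is dropped because the queue is full.
     */
    boolean schedule(String tableName) {
        // Cost of this scan grows with the queue length, which is one
        // reason to keep the queue short.
        if (queue.contains(tableName)) {
            return true;
        }
        if (queue.size() >= maxSize) {
            // Dropped; a later statement compilation may try again.
            return false;
        }
        queue.add(tableName);
        return true;
    }

    int size() {
        return queue.size();
    }
}
```

With a maximum size of 5, a sixth distinct request is dropped and only gets
another chance when a later compilation triggers the check again.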

Threshold (a) concerns indexes without existing statistics. If there are
fewer than 100 rows in the base table, statistics won't be created.

Thresholds (b) and (c) concern indexes with existing statistics.
Threshold (b) was introduced to avoid too frequent updates of existing
statistics for small tables. I don't remember off the top of my head
where it was discussed, but I ended up effectively removing it by
setting it to zero for now. I kept the property (and the relevant code)
to allow people to experiment somewhat without having to recompile the
code if they have an application running into trouble with this scenario.
Finally, the main threshold for existing statistics is (c). Here the
natural logarithms of the row estimate of the index statistics and the
row estimate of the base table are compared. If the difference is
greater than or equal to lndiffThreshold (defaults to 1.0), the
statistics for the index are scheduled for update. If the daemon queue
is full, the request is discarded, on the assumption that another
compilation will manage to schedule the update eventually.

Hope this helped a bit, feel free to ask additional questions. As I have
said before, these thresholds may have to be changed significantly as we
test the feature (remove existing, add new ones, or modify existing ones).


> I assume if d is exceeded then no stat is created no matter what
> a, b, and c are.
