db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: question about istats thresholds
Date Tue, 15 Feb 2011 22:15:50 GMT
Thanks for taking a look at this. As you point out, the row estimates
are particularly hard to rely on for very small numbers of rows. Once
the table gets big, the other params seem ok until we get more feedback.
I wonder if there should just be some minimum table sizes rather than
"diff" sizes. I would lean toward defaulting to not running stats on a
table that already has stats unless it is over some minimum size - say
1000 rows.

If we make this change, the expected behavior would be:
o no scheduled istat runs for tables with no indexes.
o no scheduled istat runs for tables with indexes and under 100 rows.
o no scheduled istat runs for tables with indexes and existing stats
  and under 1000 rows.
o tables with over 1000 rows get stats based on current logic.
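
The rules above could be sketched roughly as follows. This is only an
illustration of the proposal, not Derby code: the class, method, and
constant names are hypothetical, and treating a table with exactly 1000
rows as eligible is an assumption, since the list only says "under" and
"over".

```java
/** Hypothetical sketch of the proposed minimum-size rules; not Derby's actual code. */
public final class IstatMinSizeSketch {

    // Thresholds taken from the proposal above; the constant names are invented.
    static final long MIN_ROWS_ANY_TABLE = 100;
    static final long MIN_ROWS_WITH_STATS = 1000;

    private IstatMinSizeSketch() {}

    /**
     * Returns true if the table is large enough that the existing istat
     * scheduling logic should be consulted at all. A true result does not
     * mean an update is scheduled, only that it is not ruled out by size.
     */
    static boolean eligibleForScheduling(int indexCount,
                                         boolean hasStats,
                                         long rowEstimate) {
        if (indexCount == 0) {
            return false;   // no indexes: nothing to compute stats for
        }
        if (rowEstimate < MIN_ROWS_ANY_TABLE) {
            return false;   // under 100 rows: never schedule
        }
        if (hasStats && rowEstimate < MIN_ROWS_WITH_STATS) {
            return false;   // stats exist and table is small: leave them alone
        }
        return true;        // defer to the current (diff-based) logic
    }
}
```

With these rules the 7-row table from DERBY-4211 would never reach the
scheduling logic, regardless of how inaccurate its row estimate is, as
long as the estimate stays under 100.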

Kristian Waagan wrote:
> On 11.02.2011 23:29, Mike Matrigali wrote:
>> Thanks, could you take a look at DERBY-4211.  It looks like
>> the stat updater is running, but I don't think it should be.
>> basically what would you expect to happen on a newly created
>> table, that then has 7 rows added to it.
> 
> I've only looked briefly at the test, and here are my thoughts about 
> what's going on:
>  o some of the tables in the test are created, populated, and then have 
> an index created. Since the table is not empty, the index creation will 
> cause statistics to be generated.
>  o queries in the test will then cause the istat scheduling logic to fire.
>  o due to inaccurate row estimates for the table the istat incorrectly 
> schedules an update.
> 
> My opinion (after having looked very quickly at this) is that the istat 
> code is doing as it should with the current parameters. The bad behavior 
> is caused by a combination of poor information quality (the row 
> estimate), the low number of rows in the table ("defeats" the 
> logarithmic threshold), and the istat configuration (absdiff=0).
> Since the row estimate is exactly that - an estimate - it may be wise to 
> reintroduce the absdiff parameter to avoid problems like these for small 
> tables. At least it should be simple to change its value and re-run the 
> test to see if the istat work is still happening or not (note that the 
> value quoted below is wrong - 
> derby.storage.indexStats.debug.absdiffThreshold is currently set to zero).
> 
> There are at least two issues with the row estimate handling:
>  o not logged
>  o there are two ways to update the estimate: using an absolute value, 
> or using deltas. In some cases these two ways interfere, i.e. changes 
> already reflected by a set absolute value are also applied afterwards as 
> delta operations.
> 
>>
>> case one: then queries are run from ij
> 
> If the stat updater is running for case one, where there are no indexes, 
> that's certainly a bug!
> 
> 
> Regards,

