Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm
Precedence: bulk
Reply-To: <derby-dev@db.apache.org>
Message-ID: <1903725599.1218567824578.JavaMail.jira@brutus>
Date: Tue, 12 Aug 2008 12:03:44 -0700 (PDT)
From: "Mike Matrigali (JIRA)" <jira@apache.org>
To: derby-dev@db.apache.org
Subject: [jira] Updated: (DERBY-3788) Provide a zero-admin way of updating
 the statisitcs of an index
In-Reply-To: <1231890337.1216665811829.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/DERBY-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-3788:
----------------------------------


I don't know if any of these are good ideas but maybe someone with knowledge of code can discuss.  They are 
options that came to mind:
1) explore why the error is happening and see if updating stats can be special cased.  It is different than a 
     "real" ddl that might cause havoc changing during a comple.
2) rather than update the stat, just inline calculate and use that for current compile and schedule a later task after
    compile to do the update.
3) give up on updating during compile and just use default, but schedule update after compile and somehow mark query to be recompiled again the next time (this might automatically happen if update of statistic already causes dependency to change on compiled query - i don't know.
4) when you find missing stat, give up on current compile, update stat, and then retry the compile.

As I understood it, you were looking at inline update of stats as a incremental dev step - maybe it is not worth it
given the problems and you should jump to long term strategy of delaying update.  


> Provide a zero-admin way of updating the statisitcs of an index
> ---------------------------------------------------------------
>
>                 Key: DERBY-3788
>                 URL: https://issues.apache.org/jira/browse/DERBY-3788
>             Project: Derby
>          Issue Type: New Feature
>          Components: Performance
>    Affects Versions: 10.5.0.0
>            Reporter: Mamta A. Satoor
>            Assignee: Mamta A. Satoor
>
> DERBY-269 provided a manual way of updating the statistics using the new system stored procedure SYSCS_UTIL.SYSCS_UPDATE_STATISTICS. It will be good for Derby to provide an automatic way of updating the statistics without requiring to run the stored procedure manually. There was some discussion on DERBY-269 about providing the 0-admin way. I have copied it here for reference.
> *********************
> Kathey Marsden - 22/May/05 03:53 PM 
> Some sort of zero admin solution for updating statistics would be prefferable to the manual 'update statistics' 
> *********************
> *********************
> Mike Matrigali - 11/Jun/08 12:37 PM 
> I have not seen any other suggestions, how about the following zero admin solution? It is not perfect - suggestions welcome. 
> Along with the statistics storing, save how many rows were in the table when exact statistics were calculated. This number is 0 if none have been calculated because index creation happened on an empty table. At query compile time when we look up statistics we automatically recalculate the statistics at certain threshholds - say something like row count growing past next threshhold : 10, 100, 1000, 100000 - with upper limit being somewhere around how many rows we can process in some small amount of time - like 1 second on a modern laptop. If we are worried about response time, maybe we background queue the stat gathering rather than waiting with maybe some quick load if no stat has ever been gathered. The background gathering could be optimized to not interfere with locks by using read uncommitted. 
> I think it would be useful to also have the manual call just to make it easy to support customers and debug issues in the field. There is proably always some dynamic data distribution change that in some case won't be picked up by the automatic algorithm. Also just very useful for those who have complete control of the create ddl, load data, run stats, deliver application process. 
> *********************

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.