Return-Path: Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: (qmail 21553 invoked from network); 9 Mar 2009 17:46:15 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 9 Mar 2009 17:46:15 -0000 Received: (qmail 78826 invoked by uid 500); 9 Mar 2009 17:46:15 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 78618 invoked by uid 500); 9 Mar 2009 17:46:14 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 78609 invoked by uid 99); 9 Mar 2009 17:46:14 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 09 Mar 2009 10:46:14 -0700 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 09 Mar 2009 17:46:12 +0000 Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id AA706234C004 for ; Mon, 9 Mar 2009 10:45:50 -0700 (PDT) Message-ID: <619428549.1236620750696.JavaMail.jira@brutus> Date: Mon, 9 Mar 2009 10:45:50 -0700 (PDT) From: "Mamta A. Satoor (JIRA)" To: derby-dev@db.apache.org Subject: [jira] Updated: (DERBY-3788) Provide a zero-admin way of updating the statisitcs of an index In-Reply-To: <1231890337.1216665811829.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/DERBY-3788?page=3Dcom.atlassia= n.jira.plugin.system.issuetabpanels:all-tabpanel ] Mamta A. Satoor updated DERBY-3788: ----------------------------------- Attachment: DERBY3788_update_all_missing_stats_after_bind_patch4_diff.t= xt DERBY3788_update_all_missing_stats_after_bind_patch4_stat.t= xt I have an intermediate patch ready for review. The goal of the patch is to = detect early on in the compile phase of a SELECT query if there are tables = involved with missing statistics. If yes, then abandon the SELECT query com= pilation, create the missing statistics and then try to compile the origina= l SELECT query again. This logic gets driven inside the GenericStatement.pr= epMinion code. GenericStatement.prepMinion is responsible for compile phase= of a SQL. If the SQL involved is a SELECT statement, then we may go throug= h the compile process of the SELECT query twice. If the SQL involved is no= n-SELECT, we will finish SQL compilation in the first pass. The details abo= ut the code flow for SELECT statement compilation is as follows. For a SELECT statement, we may have to go through the query compilation pha= se twice. Whether we go through one cycle or 2 cycles of compilation phase = depends on wheter there are any tables involved in the query who have their= statistics missing. During the first attempt of query compile, we will mak= e a list of all the tables involved in the query during the bind phase of c= ompile. After the bind phase, during the first pass through the SELECT quer= y compilation, we will check if the query involved has any tables whose sta= tistics are missing. If yes, then we will quit from the SELECT compile phas= e by throwing missing statistics exception. We will handle the missing stat= s exception by trying to update the missing statistics. After that, we will= go through the second pass of original query compilation. During this seco= nd pass, we will not worry if the statistics are missing or not. We will ju= st work with whatever statistics are available (one example where the stati= stics may still be missing after trying to create them is say a user who on= ly has select privileges on a table finds the missing stats. But the statis= tics creation will fail because the user does not have sufficient privilege= s to create the statistics. There can be other reasons for statistics creat= ion failure too. Another instance would be that we can't get the necessary = locks to update the stats. In such a case, we will just quit out of updatin= g the stats and move on to the second pass of the original query compilatio= n.=20 One of the issues with this patch I need to work on and will appreicate if = anyone has any feedback on. The update statistics is happening in the user = transaction. What that means is that any locks required to update statistic= s will stay in place until the user transaction has been committed/rolled b= ack. This behavior can be an issue with user applications (since these are = the locks the user was not expecting to get as part of a SELECT query execu= tion). This behavior definitely causes problems with our tests where quite = a handful of tests run into locking issues because of the additional locks = acquired before of a SELECT query. I think one way to fix this would be to = somehow run the statistics in a nested transaction which can be committed a= fter the statistics are created. If we run into locking issues in the neste= d transaction then go ahead and run the statistic in the user transaction. = This is what we do for GENERATED columns in InsertResultSet.getSetAutoincre= mentValue. One of the problems that I need to address before I can use nest= ed transaction is to change the code that is written to fire statistics dur= ing SELECT compilation. Currently, in my patch, I am executing the stored p= rocedure SYSCS_UTIL.SYSCS_UPDATE_STATISTICS which internally executes ALTER= TABLE.. to update the statistics. I am at too high a level to be able to u= se nested transaction for updating the statistics (let me know if I am wron= g here. May be there is a way to use nested transaction while firing the SY= SCS_UTIL.SYSCS_UPDATE_STATISTICS that I am unaware of). I think I need to g= et to update statistics code in AlterTableConstantAction directly somehow w= ithout going through the stored procedure->ALTER TABLE.. sql path. I think = if I can directly call the update statistics in AlterTableConstantAction, t= hen I can try using the nested yransaction and if that does not succeed the= n user the user transaction. I will appreciate any input you might have on my overall approach to this j= ira entry and then how to address the locking issue. > Provide a zero-admin way of updating the statisitcs of an index > --------------------------------------------------------------- > > Key: DERBY-3788 > URL: https://issues.apache.org/jira/browse/DERBY-3788 > Project: Derby > Issue Type: New Feature > Components: Performance > Affects Versions: 10.5.0.0 > Reporter: Mamta A. Satoor > Assignee: Mamta A. Satoor > Attachments: DERBY3788_patch1_diff.txt, DERBY3788_patch1_stat.txt= , DERBY3788_patch2_diff.txt, DERBY3788_patch2_stat.txt, DERBY3788_patch3_di= ff.txt, DERBY3788_patch3_stat.txt, DERBY3788_update_all_missing_stats_after= _bind_patch4_diff.txt, DERBY3788_update_all_missing_stats_after_bind_patch4= _stat.txt, DERBY_3788_Mgr.java, DERBY_3788_Repro.java > > > DERBY-269 provided a manual way of updating the statistics using the new = system stored procedure SYSCS_UTIL.SYSCS_UPDATE_STATISTICS. It will be good= for Derby to provide an automatic way of updating the statistics without r= equiring to run the stored procedure manually. There was some discussion on= DERBY-269 about providing the 0-admin way. I have copied it here for refer= ence. > ********************* > Kathey Marsden - 22/May/05 03:53 PM=20 > Some sort of zero admin solution for updating statistics would be preffer= able to the manual 'update statistics'=20 > ********************* > ********************* > Mike Matrigali - 11/Jun/08 12:37 PM=20 > I have not seen any other suggestions, how about the following zero admin= solution? It is not perfect - suggestions welcome.=20 > Along with the statistics storing, save how many rows were in the table w= hen exact statistics were calculated. This number is 0 if none have been ca= lculated because index creation happened on an empty table. At query compil= e time when we look up statistics we automatically recalculate the statisti= cs at certain threshholds - say something like row count growing past next = threshhold : 10, 100, 1000, 100000 - with upper limit being somewhere aroun= d how many rows we can process in some small amount of time - like 1 second= on a modern laptop. If we are worried about response time, maybe we backgr= ound queue the stat gathering rather than waiting with maybe some quick loa= d if no stat has ever been gathered. The background gathering could be opti= mized to not interfere with locks by using read uncommitted.=20 > I think it would be useful to also have the manual call just to make it e= asy to support customers and debug issues in the field. There is proably al= ways some dynamic data distribution change that in some case won't be picke= d up by the automatic algorithm. Also just very useful for those who have c= omplete control of the create ddl, load data, run stats, deliver applicatio= n process.=20 > ********************* --=20 This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.