db-derby-dev mailing list archives

From "Mamta A. Satoor (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3788) Provide a zero-admin way of updating the statisitcs of an index
Date Tue, 02 Dec 2008 17:18:44 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652427#action_12652427 ]

Mamta A. Satoor commented on DERBY-3788:

Knut, a couple more responses to some of your comments.

> 3) Creating an EmbedConnection30 object directly breaks the
> modularity. Unless the calls to the internal methods are necessary, it
> may be better to use InternalDriver.activeDriver().connect(url, info)

There was no specific need to create the EmbedConnection30 object directly, so I have followed
your recommendation to get the Connection object.

> 5) There's a comment in DDImpl5.updateStatisticsInBackGround() saying
> that "cm is null the very first time, and whenever we aren't actually
> nested." I'm not sure I understand that comment. Why is it null the
> first time? And isn't the method always called in a nested context?
> And if it is null, wouldn't that cause a NullPointerException in
> EmbedConnection's constructor when url=null is passed in?

I get the ContextManager the same way we do in InternalDriver.getConnectionContext (jdbc package),
which checks for a null ContextManager. Looking at that code, I thought there might
be a case where the ContextManager could be null. To avoid the NullPointerException that a null url
would cause in that case, I have changed DDImpl5.updateStatisticsInBackGround() to fire the
background statistics update only if the ContextManager is not null. The new code looks as follows:
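	public void updateStatisticsInBackGround(String schemaName,
			String tableName, String indexName) throws StandardException{
		String url = null;
		Properties info = null;
		// The TimeUnit argument was lost in the archived message;
		// MILLISECONDS is assumed (the unit does not matter for 0L).
		if (executorForUpdateStatistics==null)
			executorForUpdateStatistics = new ThreadPoolExecutor(5,5,0L,
					TimeUnit.MILLISECONDS,
					new LinkedBlockingQueue(5),
					new ThreadPoolExecutor.CallerRunsPolicy());
		ContextService csf = ContextService.getFactory();

		ContextManager cm = csf.getCurrentContextManager();
		ConnectionContext localCC = null;

		/*
			cm is null the very first time, and whenever
			we aren't actually nested.
		*/
		if (cm != null) {
			// Lookup reconstructed after InternalDriver.getConnectionContext;
			// the right-hand side was lost in the archived message.
			localCC = (ConnectionContext)
				(cm.getContext(ConnectionContext.CONTEXT_ID));
			TransactionResourceImpl tr = localCC.getTR();
			url = tr.getUrl();
			info = tr.getInfo();
			// The execute(...) call is reconstructed; its line was lost.
			executorForUpdateStatistics.execute(new
					BackgroundUpdateStatisticTask (schemaName,
							tableName, indexName, url, info));
		}
	}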
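
For readers unfamiliar with the executor setup used above, here is a minimal standalone sketch of a ThreadPoolExecutor with 5 threads, a 5-slot bounded queue, and CallerRunsPolicy. It is not the patch code: the class name BoundedExecutorSketch, the runTasks helper, and the TimeUnit are assumptions made for illustration, and the tasks are trivial counters rather than statistics updates.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class; not part of Derby or of the patch.
class BoundedExecutorSketch {

    /** Builds a pool with the same sizes as the snippet above. */
    static ThreadPoolExecutor newExecutor() {
        return new ThreadPoolExecutor(
                5, 5,                          // core and maximum pool size
                0L, TimeUnit.MILLISECONDS,     // keep-alive; the unit is an assumption
                new LinkedBlockingQueue<Runnable>(5),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Submits taskCount trivial tasks and returns how many ran to completion. */
    static int runTasks(int taskCount) {
        ThreadPoolExecutor executor = newExecutor();
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // When all threads are busy and the queue is full, CallerRunsPolicy
            // runs the task on the submitting thread instead of rejecting it.
            executor.execute(completed::incrementAndGet);
        }
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runTasks(100));
    }
}
```

One consequence of this policy worth keeping in mind: when the pool and queue are saturated, the "background" work runs on the caller's thread, so the thread that triggered the update would pay for it synchronously.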


Will work on addressing the remaining comments. 

> Provide a zero-admin way of updating the statisitcs of an index
> ---------------------------------------------------------------
>                 Key: DERBY-3788
>                 URL: https://issues.apache.org/jira/browse/DERBY-3788
>             Project: Derby
>          Issue Type: New Feature
>          Components: Performance
>    Affects Versions:
>            Reporter: Mamta A. Satoor
>            Assignee: Mamta A. Satoor
>         Attachments: DERBY3788_patch1_diff.txt, DERBY3788_patch1_stat.txt, DERBY3788_patch2_diff.txt,
> DERBY3788_patch2_stat.txt, DERBY_3788_Mgr.java, DERBY_3788_Repro.java
> DERBY-269 provided a manual way of updating the statistics using the new system stored
> procedure SYSCS_UTIL.SYSCS_UPDATE_STATISTICS. It would be good for Derby to provide an automatic
> way of updating the statistics without requiring the stored procedure to be run manually. There
> was some discussion on DERBY-269 about providing the 0-admin way. I have copied it here for
> *********************
> Kathey Marsden - 22/May/05 03:53 PM 
> Some sort of zero admin solution for updating statistics would be preferable to the
> manual 'update statistics'
> *********************
> *********************
> Mike Matrigali - 11/Jun/08 12:37 PM 
> I have not seen any other suggestions, how about the following zero admin solution? It
> is not perfect - suggestions welcome.
> Along with the statistics storing, save how many rows were in the table when exact statistics
> were calculated. This number is 0 if none have been calculated because index creation happened
> on an empty table. At query compile time when we look up statistics we automatically recalculate
> the statistics at certain thresholds - say something like row count growing past next threshold
> : 10, 100, 1000, 100000 - with upper limit being somewhere around how many rows we can process
> in some small amount of time - like 1 second on a modern laptop. If we are worried about response
> time, maybe we background queue the stat gathering rather than waiting, with maybe some quick
> load if no stat has ever been gathered. The background gathering could be optimized to not
> interfere with locks by using read uncommitted.
> I think it would be useful to also have the manual call just to make it easy to support
> customers and debug issues in the field. There is probably always some dynamic data distribution
> change that in some case won't be picked up by the automatic algorithm. Also just very useful
> for those who have complete control of the create ddl, load data, run stats, deliver application
> *********************
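
The threshold idea quoted above can be sketched roughly as follows. This is only an illustration of the suggestion, not Derby code: the class StatsThresholdSketch and its method names are hypothetical, the threshold values are copied as listed in the quoted comment, and treating "growing past" as "greater than or equal to" is an assumption.

```java
// Hypothetical illustration class; not part of Derby.
class StatsThresholdSketch {

    // Thresholds exactly as listed in the quoted comment.
    static final long[] THRESHOLDS = {10L, 100L, 1000L, 100000L};

    /**
     * Returns the smallest threshold strictly greater than the row count
     * recorded when statistics were last calculated, or -1 if there is none.
     */
    static long nextThreshold(long rowsAtLastStats) {
        for (long t : THRESHOLDS) {
            if (t > rowsAtLastStats) {
                return t;
            }
        }
        return -1L;
    }

    /**
     * Decides at query compile time whether statistics should be recalculated:
     * true once the current row count has reached the next threshold.
     */
    static boolean shouldRecalculate(long rowsAtLastStats, long currentRows) {
        long next = nextThreshold(rowsAtLastStats);
        return next != -1L && currentRows >= next;
    }

    public static void main(String[] args) {
        // Index created on an empty table (0 rows recorded), table now has 25 rows.
        System.out.println(shouldRecalculate(0L, 25L));
    }
}
```

Beyond the last threshold this sketch never triggers a recalculation, which corresponds to the upper limit mentioned in the comment; where that limit should actually sit is left open there.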

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
