db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Knut Anders Hatlen <Knut.Hat...@Sun.COM>
Subject Re: Stored Prepared Statements? -- Thoughts on exposing StorablePreparedStatements to user/external statements
Date Sun, 19 Apr 2009 09:39:32 GMT
Nick Puz <nick.derby@me.com> writes:

>  Hi Knut,
>
> Thanks for the response, I'll take a look into the cloudscape docs and
> how execute is implemented. I'll also look more at the statement cache
> to see about persisting it on shutdown and restoring on startup, so
> that then it could be done globally for all statements (as an
> option). Another alternative would be to let the user specify which to
> persist as a comment in the sql (as is done with the optimizer
> overrides to specify index usage, etc.), possibly as a different
> runtime option. The advantage of the first (global) is that apps could
> take advantage with no sql changes. Perhaps persist the N most
> recently used statements...

Hi Nick,

I agree that no SQL changes required is an advantage. Applications could
start taking advantage of it immediately with no changes (except perhaps
setting a flag to enable the persisting of the statement cache), and no
new, non-standard syntax is required.

As to using comments, that would require changes in the SQL, but the SQL
would still be portable since the comments would simply be ignored by
other DMBSs.

Adding the ability to store user-defined prepared statements with some
special syntax is probably the easiest to implement, since most of the
infrastructure is already in place, but then the applications would need
to use unportable SQL. (Although we don't actually need to add new
syntax. We already have EXECUTE STATEMENT, and we could just create a
system procedure to store the statements, something like CALL
SYSCS_UTIL.SYSCS_STORE_STATEMENT('APP', 'MYSTMT', 'SELECT * FROM
T'). But even if the syntax is portable, the actual statement will still
be tied to Derby.)

So, not considering the complexity of the different implementations, a
persistent statement cache and/or use of comments sounds more attractive
to me.

If we go for persisting the statement cache, I think it would be easier
to just store the entire cache than to store the N most recently used,
both because Derby's statement cache currently doesn't have a notion
about "most recently used", and because the statement cache normally is
so small (25 statement by default, I think) as I wouldn't think the
extra complexity is worthwhile.

> In terms of invalidating, a crude but simple way would be to have the
> version of derby stored with the prepared statements and just to
> discard them if the version numbers differ. The same mechanism coudl
> be used as when deciding to recompile the system stored
> statements. What do you think?

Reusing as much as possible of the existing code would be good. So
basically what needs to be done is

  1) On shutdown, if persistent cache is enabled, store each of the
  cached statements in SYS.SYSSTATEMENTS.

  2) On startup, move the stored statements from SYS.SYSSTATEMENTS back
  into the cache, and remove them from SYS.SYSSTATEMENTS.

  3) On upgrade, Derby will automatically invalidate all compiled plans
  for statements in SYS.SYSSTATEMENTS, so no need for any special
  handling for the persisted cached statements (though, strictly
  speaking it would be better to drop those statements than to fill the
  cache with uncompiled statements).

  4) On soft-upgrade, I think it might be a good idea to disable
  persisting of the cache entirely, since we wouldn't want to end up
  with SYS.SYSSTATEMENTS containing statements compiled for different
  versions of Derby.

I think (must check to be sure) that in the current code, Derby needs to
know whether or not the statement should be a stored prepared statement
or an ordinary statement when the statement is compiled, and that an
entry is added in the SYS.SYSSTATEMENTS table before the statement is
compiled. If this is done for all statements, even the ones that won't
still be in the cache when the database shuts down, there will be an
extra, unnecessary compile-time cost for those statements.

This could be stated as an expected overhead caused by enabling the
persistent cache, but it also makes a solution where the users
explicitly say which statements they want to persist attractive. That
again leaves us with the question of how to dispose of them, since we
surely don't want to keep all the different variations over the same
statement that different versions of an application have used. Removing
such statements from SYS.SYSSTATEMENTS when they are evicted from the
statement cache is one option, I would guess.

Another perhaps better option if we want to store the entire statement
cache, is to find a way to convert a non-storable statement to a
storable statement when we decide that it should be stored. Don't know
how difficult that would be.

> Besides version changes the statements should probably be
> removed/regenerated if the statistics used to determine the execution
> plan change. How does derby manage the statistics, are they
> automatically kept up to date or does the user have to do something
> specific? Do you happen to know what happens with the stored trigger
> actions, since they should have the same problem. I'll take a look at
> the code to try to figure it out in the next few days but just curious
> if you or another derby dev knows...

There is an undocumented (doc request logged as DERBY-3134) property,
derby.language.stalePlanCheckInterval, which tells how often the plan
for a prepared statement should be checked. If the plan is found to be
stale (not sure if it looks at cardinality statistics or if it just
checks the estimated size of the tables involved) a recompile is
initiated. If the statement is an SPS, the new plan is also written back
to the system table.

-- 
Knut Anders

Mime
View raw message