db-derby-dev mailing list archives

From Jack Klebanoff <klebanoff-de...@sbcglobal.net>
Subject Re: [PATCH] BackingStoreHashtable
Date Thu, 03 Mar 2005 23:24:53 GMT
Mike Matrigali wrote:

> Thanks for the reply, I'll work with you to get this committed.  I will
> wait on the change you are working on. I think that is the best short
> term solution, as you point out there is more work later on to improve
> the work you have done. I would appreciate it if at least one other 
> person with experience on the language side take a look at this also.
> It has been a while since I looked at jvm memory stuff, but it used to be
> a problem that totalMemory() would return the memory that the jvm
> currently has, not the amount of memory that it is allowed to have.  So
> if you called it after just starting it might return a very small 
> number, say 1 meg, even if the jvm was started and told to grow to a max
> of 100 meg.  Worse was that the behavior was not consistent across 
> JVM/OS combinations.
> This memory issue is a real problem as there are a number of things
> that derby could do faster if it knew it could do the whole thing in
> memory, but once you run out of memory it is hard to recover without
> failing the current operation (and quite possibly other derby threads 
> and in a server environment other non derby threads).
> At one point sun was proposing some jvm interfaces so one could tell if
> you were getting "close" to running out of memory - so that applications
> could take action before errors happened.  If such a thing existed then
> something like BackingStoreHashTable could grow in memory more 
> aggressively and then if it noticed the impending problem it could spill
> everything to disk, and free up its current usage.
I have modified my patch so that the optimizer and BackingStoreHashtable 
use the same decision about when a hash table will spill to disk. The 
optimizer calls the JoinStrategy.maxCapacity method to find the maximum 
number of rows that the JoinStrategy can handle in a given number of 
bytes. It rejects the strategy if the estimated row count is larger. 
(Currently the optimizer limits each join to 1M of memory). The 
HashJoinStrategy.maxCapacity method divides the maximum byte count by 
the sum of the size of one row plus the size of a Hashtable entry. The 
NestedLoopJoinStrategy.maxCapacity method always returns 
Integer.MAX_VALUE. The HashJoinStrategy.getScanArgs method passes the 
maximum capacity to the ResultSetFactory.getHashScanResultSet method, 
so that the actual BackingStoreHashtable will spill to disk exactly when 
the optimizer predicted it would. This means that hash joins will not 
spill to disk unless the inner table has more rows than the optimizer 
estimated.

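The capacity arithmetic described above can be sketched as follows. This is an illustration only, not the actual Derby code: the class name, the per-entry overhead constant, and the method signatures are assumptions made for the example.

```java
// Hypothetical sketch of the maxCapacity decision described above.
// The real Derby JoinStrategy.maxCapacity signatures may differ.
public class MaxCapacitySketch {
    // Assumed per-entry overhead of a java.util.Hashtable entry,
    // chosen only for illustration.
    static final int HASHTABLE_ENTRY_SIZE = 16;

    // Hash join: how many rows fit in maxMemoryPerTable bytes,
    // dividing the byte budget by (row size + entry overhead).
    static int hashJoinMaxCapacity(int userRowSize, int maxMemoryPerTable) {
        return maxMemoryPerTable / (userRowSize + HASHTABLE_ENTRY_SIZE);
    }

    // Nested loop join: no in-memory hash table, so effectively unbounded.
    static int nestedLoopMaxCapacity() {
        return Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // With the optimizer's 1M-per-join budget and 100-byte rows:
        int cap = hashJoinMaxCapacity(100, 1024 * 1024);
        System.out.println(cap); // prints 9039
    }
}
```

With a 1M budget and 100-byte rows the optimizer would accept a hash join only for inner tables estimated at 9039 rows or fewer; larger estimates fall back to another strategy rather than spilling at run time.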
I also changed the DiskHashtable implementation to pass its 
keepAfterCommit parameter on to the 
TransactionController.openConglomerate method. Previously DiskHashtable 
only used keepAfterCommit to construct the temporaryFlag argument of 
TransactionController.createConglomerate and always passed "false" as 
the hold argument of TransactionController.openConglomerate.
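The before/after behavior can be sketched like this. The flag constants and method names below are illustrative stand-ins, not the actual TransactionController constants; the point is only that keepAfterCommit now drives both the create-time flag and the open-time hold argument.

```java
// Hypothetical sketch of the DiskHashtable fix described above.
// IS_TEMPORARY and IS_KEPT are assumed flag values for illustration.
public class KeepAfterCommitSketch {
    static final int IS_TEMPORARY = 1;
    static final int IS_KEPT = 2; // conglomerate survives commit

    // keepAfterCommit has always shaped the temporaryFlag at create time.
    static int temporaryFlag(boolean keepAfterCommit) {
        return keepAfterCommit ? (IS_TEMPORARY | IS_KEPT) : IS_TEMPORARY;
    }

    // Before the patch the hold argument was hard-coded to false;
    // after the patch it follows keepAfterCommit, so open and create agree.
    static boolean holdArgument(boolean keepAfterCommit) {
        return keepAfterCommit;
    }

    public static void main(String[] args) {
        System.out.println(temporaryFlag(true)); // prints 3
        System.out.println(holdArgument(true));  // prints true
    }
}
```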

Since I made changes to the optimizer and hash join code generator I 
hope that a Derby language expert can review at least that part of my 
updated patch.

I have not changed the way that BackingStoreHashtable decides when to 
spill when its max_inmemory_rowcnt parameter is negative. (Only hash 
joins pass a non-negative max_inmemory_rowcnt.) As Mike pointed out, 
spilling when the in-memory hash table grows larger than 1% of 
Runtime.totalMemory() is not completely satisfactory. The JVM may be 
able to get more memory, and totalMemory() is likely to be small soon 
after the JVM starts up. However, I do not know of anything better. 
If totalMemory() grows, subsequent BackingStoreHashtables will 
be able to use more memory. Since BackingStoreHashtables are 
temporary, this does not seem so bad to me.
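The two-branch spill decision described above could be sketched as follows. The method name and exact structure are assumptions for illustration, not the actual BackingStoreHashtable internals.

```java
// Hypothetical sketch of the spill decision described above:
// a non-negative max_inmemory_rowcnt (hash joins) gives an explicit
// row limit from the optimizer; a negative value falls back to the
// 1%-of-totalMemory() heuristic.
public class SpillHeuristicSketch {
    static boolean shouldSpill(long inMemoryBytes, long maxInMemoryRowCount,
                               long currentRowCount) {
        if (maxInMemoryRowCount >= 0) {
            // Hash join case: spill exactly when the optimizer predicted.
            return currentRowCount > maxInMemoryRowCount;
        }
        // Fallback heuristic: totalMemory() reflects memory the JVM holds
        // now, not its -Xmx ceiling, so this can be pessimistic just
        // after startup.
        return inMemoryBytes > Runtime.getRuntime().totalMemory() / 100;
    }

    public static void main(String[] args) {
        // Explicit limit case: 10000 rows against a 9039-row cap spills.
        System.out.println(shouldSpill(0, 9039, 10000)); // prints true
    }
}
```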

Jack Klebanoff
