db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: [jira] Created: (DERBY-510) DERBY-132 resolved ? Table not automatically compressed
Date Tue, 16 Aug 2005 18:33:15 GMT
Hi Mike,

I like your suggestions that a low priority thread should perform the 
compressions and that we should expose a knob for disabling this thread. 
Here are some further suggestions:

Compressing all the tables and recalculating all the statistics once a 
month could cause quite a hiccup for a large database. Maybe we could do 
something finer grained. For instance, we could try to make it easy to 
ask some question like "Is more than 20% of this table's space dead?" No 
doubt there are some tricky issues in maintaining a per-table dead-space 
counter and in keeping that counter from being a sync point during 
writes. However, if we could answer a question like that, then we could 
pay the compression/reoptimization penalty as we go rather than 
incurring a heavy, monthly lump-sum tax.


Mike Matrigali wrote:

>Full compression of derby tables is not done automatically, I
>am looking for input on how to schedule such an operation.  An
>operation like this is going to have a large cpu, i/o, and
>possible temporary disk space impact on the rest of the server.
>As a zero admin db I think we should figure out some way to
>do this automatically, but I think there are a number of
>applications which would not be happy with such a performance
>impact not under their control.
>My initial thoughts are to pick a default time frame, say
>once every 30 days to check for table level events like
>compression and statistics generation and then execute the operations
>at low priority.  Also add some sort of parameter so that
>applications could disable the automatic background jobs.
>Note that derby does automatically reclaim space from deletes
>for subsequent inserts, but the granularity currently is at
>a page level.  So deleting every 3rd or 5th row is the worst
>case behavior.  The page level decision was a tradeoff as
>reclaiming the space is time consuming so did not want to
>schedule to work on a row by row basis.  Currently we schedule
>the work when all the rows on a page are marked deleted.
>Volker Edelmann (JIRA) wrote:
>>DERBY-132 resolved ? Table not automatically compressed 
>>         Key: DERBY-510
>>         URL: http://issues.apache.org/jira/browse/DERBY-510
>>     Project: Derby
>>        Type: Bug
>>    Versions:    
>> Environment: JDK 1.4.2, JDK 1.5.0
>>Windows XP
>>    Reporter: Volker Edelmann
>> I tried a test-program that repeatedly inserts a bunch of data into 1 table and repeatedly
deletes a bunch of data.
>>    // table is not empty  when test-program starts
>>     derby.executeSelect("select count(*) c from rclvalues");
>>   TestQueries.executeBulkInsertAnalyst(derby.getConnection(), 2000000); // insert
2.000.000 rows
>>        derby.executeDelete("delete from rclvalues where MOD(id, 3) = 0");
>>   TestQueries.executeBulkInsertAnalyst(derby.getConnection(), 1000000);
>>        derby.executeDelete("delete from rclvalues where MOD(id, 5) = 0");
>>     derby.executeSelect("select count(*) c from rclvalues");
>>At the end of the operation, the table contains approximately the same number of rows.
But the size of the database has grown from
>>581 MB to 1.22 GB. From the description of item DERBY-132, I hoped that Derby does
the compression now ( version 10.1.X.X.).  

View raw message