db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Suresh Thalamati <suresh.thalam...@gmail.com>
Subject Re: [jira] Commented: (DERBY-2254) Assert during log file switch: log file position exceeded max log file size
Date Mon, 22 Jan 2007 20:59:41 GMT
Olav Sandstaa (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/DERBY-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466475
> Olav Sandstaa commented on DERBY-2254:
> --------------------------------------
> Mike, thank you for commenting on this and for explaining what you
> think are the issues behind this problem. I agree that the best
> solution seems be to fix this so that the background thread can do
> log file switching without waiting for the checkpoing to complete.

checkpoint and log switch can run in parallel (NOT sure we ever tested 
this case though ). I think problem here is we are dumping both the 
requests on to the same thread. It is more of performance issues.
May be in the future log switches/checkpoints/... should be processed 
on different threads, when cache is configured with large size.

I think some log size calculations are wrong, if this bug is fixed 
Derby  should still work by correctly switching log file before 
hitting the MAX limit.

> Some questions somebody maybe know the answer to:
>   -What is the reason for most of the log files having the default
>    size of 1 MB but a few (4) log files grow to 268 kB before
>    switching?

Default log file size to start a Log switch is 1MB and to perform a 
checkpoint is 10MB. Normally log switch requests are processed by 
rawStoreDaemon(checkpoint Daemon) thread without blocking the user 
threads for switch to complete.

Log switch happens is following three scenarios:

1) On log flush call if it finds current log file size > configure log 
file size,  then log switch request is placed on the rawstore daemon 

2) As part of checkpoint ,  log switch is done before performing the
    checkpoint if  log file size > configured log file size.

3) While writing log records if  it finds the log file size has
    reached the MAX_LIMIT.

4) During backup, to backup full log files instead of partial portions.

By looking at the log file sizes you posted my guess is LOG SWITCH
requests are getting struck behind the checkpoint request. I would
think with 18GB cache, checkpoint must be taking long time.

Another case log file size grow bigger than configured size is  if the
transactions are  long  and nothing in the database triggered a flush 
call. I am guessing that is not case here , your application must be 
doing frequent commits; otherwise it is unlikely that there are 1M log 

>   -The crash occurs after having produced about 1 GB of log. Is there
>    a hard limit on 1 GB before the checkpoint must be completed?

I have not seen any such limit. Unless you have changed the defaults 
checkpoint is requested every  10MB , unless checkpoint is already in 
progress.  I think the  checkpoint are taking long time, parallel 
activity is producing lot of log.   By the way how did you come up
with 1GB number ?  I see only  few 1MB log files and huge files in
between (268 MB)...  My guess is checkpoint is is in progress when
those large files are being created.

If you want to confirm this hypothesis, you can dump the whole log on 
recovery, or just modify the code to print only checkpoints from the 
log. you can dump the log by setting the following flags in the debug 
code line :


> This crash happened during some experimental testing where I want to
> study data rates to and from the disks during high load. I can do this
> with a much smaller database buffer so being able to run with a 18 GB
> database buffer is not an urgent issue for me right now.


View raw message