db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: [jira] Commented: (DERBY-298) rollforward will not work correctly if the system happens to crash immediately after rollforward backup.
Date Wed, 25 May 2005 16:47:02 GMT
comments below

Øystein Grøvlen wrote:
>>>>>>"MM" == Mike Matrigali <mikem_app@sbcglobal.net> writes:
>     MM> I spent some time thinking about this and I think the dummy record
>     MM> approach works in the current system assuming the code is as follows:
>     MM> o request for roll forward backup starts
>     MM> o all subsequent write requests suspended
>     MM> o log switch initiated by backup request
>     MM> o dummy record written
>     MM> o backup takes place, any failure during backup somehow marks backup as
>     MM> failed.
>     MM> o backup succeeds, rest of system is allowed to continue writing.
> Except for the dummy record part, this seems to be the current
> sequence. (Maybe we should call it "start of backup log record"
> instead of dummy record).  I am a bit uncertain about whether a failure
> marks backup as failed.  I could only find code for the case where an
> old backup existed at the same location.  In that case the old backup
> will be restored if the current backup fails.  There does not seem to
> be any error handling for the case where no previous backup exists.
>     MM> Will a similar approach work when we support real online backup, where
>     MM> threads are allowed to continue writing to the log files during the
>     MM> backup?
> I am not sure how Suresh plans to implement restore of a real online
> backup.  Today, the purpose of a log switch is to separate the log
> records that need to be copied during backup from later log records.
> When online backup is performed, one will also need log generated
> during the backup.  Maybe it then would make sense to do the log
> switch at the end of the backup.  Anyhow, there will still be a need
> for a dummy record in the new log file since there is no guarantee
> that there will be any concurrent updates.
>     MM> Also at some point we should be allowing copies of each log file as it
>     MM> is finished - or at least as some unit of work is finished - maybe log
>     MM> file, maybe checkpoint. In that case would we need to write a dummy
>     MM> record in every file?
> Are you thinking about some kind of archival service where log files
> are automatically copied when the database switches to a new log file?
> I guess what is important is to make sure that after recovery, the
> database does not start to log to a file that has already been copied.
> This makes me think that this is a general recovery problem, and
> should not just be fixed in the context of backup.  We should then fix
> recovery to redo any log switch that was completed before the crash.
> One way to do this is to write a log record to the new log file on
> every log switch.  This would work with the current recovery
> implementation.  An alternative is to write some bytes somewhere in
> the new log file in order to signal that the log switch is completed
> and change the redo scan to handle that.

Yes, at some point I think we should implement some sort of archival
service as described.  We should probably let users override the copy
step, so that they can handle copying the file to another machine, a
different device, and so on.
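
To make the "override the copy" idea concrete, here is a minimal sketch
of what such a hook might look like.  The names LogArchiver and
DirectoryLogArchiver are hypothetical and are not existing Derby
classes; the assumption is simply that the system calls the hook once a
log file is finished, and that a user can supply their own
implementation in place of the default one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical hook called once a log file is complete; not an existing Derby API. */
interface LogArchiver {
    void archive(Path completedLogFile) throws IOException;
}

/** Assumed default behaviour: copy the finished log file into an archive directory. */
class DirectoryLogArchiver implements LogArchiver {
    private final Path archiveDir;

    DirectoryLogArchiver(Path archiveDir) {
        this.archiveDir = archiveDir;
    }

    @Override
    public void archive(Path completedLogFile) throws IOException {
        // Create the target directory on first use, then copy under the same file name.
        Files.createDirectories(archiveDir);
        Files.copy(completedLogFile,
                   archiveDir.resolve(completedLogFile.getFileName()),
                   StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A user who wants the file on another machine or device would provide a
different LogArchiver implementation rather than change the log code.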

I also believe this may be a general recovery problem that should be
solved not just in the context of the current backup.  It may be that
the current backup is the only place that sees the problem, but I would
like to get agreement on the best solution going forward.

As you suggest, I think there needs to be some way for recovery to
determine that a successful log switch has happened, so that it does
not log to the old file.  I don't think a log record for the switch
works: you need to read the log to apply the record, and that is
problematic because you are then trying to determine the validity of
the log by looking at a record in the log.  I think an approach like
the one you and Suresh suggest, with markers in the header of the
individual log files, is better.
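
A rough sketch of the header-marker idea follows.  Everything here is
an assumption for illustration: the LogFileHeader class, the header
layout (magic value, file number, one "switch completed" byte), and the
method names are hypothetical and do not describe Derby's actual log
file format.  The point is only that recovery can decide which file to
log to by reading fixed-size headers, without interpreting any log
records.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical log file header with a "switch completed" marker; not Derby's on-disk format. */
final class LogFileHeader {
    static final int MAGIC = 0x4C4F4746;                    // arbitrary illustrative value
    static final int SIZE = Integer.BYTES + Long.BYTES + 1; // magic + file number + marker

    /** Writes the header and forces it to disk; meant to be the last step of a log switch. */
    static void markSwitchComplete(Path logFile, long fileNumber) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(SIZE);
        buf.putInt(MAGIC).putLong(fileNumber).put((byte) 1);
        buf.flip();
        try (FileChannel ch = FileChannel.open(logFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true);
        }
    }

    /** Returns the file number if the header is intact and marked complete, otherwise -1. */
    static long readCompletedNumber(Path logFile) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(SIZE);
        try (FileChannel ch = FileChannel.open(logFile, StandardOpenOption.READ)) {
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading until the header is full or end of file is reached
            }
        }
        if (buf.hasRemaining()) {
            return -1; // header missing or truncated: the switch never completed
        }
        buf.flip();
        if (buf.getInt() != MAGIC) {
            return -1;
        }
        long number = buf.getLong();
        return buf.get() == 1 ? number : -1;
    }

    /** Recovery picks the highest-numbered file whose switch completed, or null if none. */
    static Path chooseFileToLogTo(List<Path> candidates) throws IOException {
        Path best = null;
        long bestNumber = -1;
        for (Path p : candidates) {
            long n = readCompletedNumber(p);
            if (n > bestNumber) {
                bestNumber = n;
                best = p;
            }
        }
        return best;
    }
}
```

With something like this, a crash right after a completed switch still
leaves a valid header in the new file, so recovery would continue
logging there instead of going back to a file that may already have
been copied.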
