db-derby-dev mailing list archives

From Oystein.Grov...@Sun.COM (Øystein Grøvlen)
Subject Re: [jira] Commented: (DERBY-298) rollforward will not work correctly if the system happens to crash immediately after rollforward backup.
Date Wed, 25 May 2005 14:05:32 GMT
>>>>> "MM" == Mike Matrigali <mikem_app@sbcglobal.net> writes:

    MM> I spent some time thinking about this and I think the dummy record
    MM> approach works in the current system assuming the code is as follows:

    MM> o request for roll forward backup starts
    MM> o all subsequent write requests suspended
    MM> o log switch initiated by backup request
    MM> o dummy record written
    MM> o backup takes place, any failure during backup somehow marks backup as
    MM> failed.
    MM> o backup succeeds, rest of system is allowed to continue writing.

Except for the dummy record part, this seems to be the current
sequence. (Maybe we should call it "start of backup log record"
instead of dummy record.)  I am a bit uncertain about whether a
failure marks the backup as failed.  I could only find code for the
case where an old backup existed at the same location.  In that case
the old backup will be restored if the current backup fails.  There
does not seem to be any error handling for the case where no previous
backup exists.
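
To make the sequence concrete, here is a minimal in-memory sketch of
it.  Everything in it is hypothetical: the class, the field names and
the "START_OF_BACKUP" record are illustrations, not Derby code, and
recording a failure in a flag is only one assumed way to "mark the
backup as failed".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the backup sequence; not Derby code.
class BackupSequenceSketch {
    // Each inner list stands for one log file; the last one is active.
    final List<List<String>> logFiles = new ArrayList<>();
    boolean writesSuspended = false;
    boolean lastBackupFailed = false;

    BackupSequenceSketch() {
        logFiles.add(new ArrayList<>()); // initial log file
    }

    // Ordinary write request; rejected while a backup is running.
    void append(String record) {
        if (writesSuspended) {
            throw new IllegalStateException("writes suspended by backup");
        }
        logFiles.get(logFiles.size() - 1).add(record);
    }

    // Returns a copy of the log files that precede the log switch,
    // or null if the backup failed.
    List<List<String>> backup(boolean simulateCopyFailure) {
        writesSuspended = true;                    // suspend write requests
        try {
            List<String> newFile = new ArrayList<>();
            logFiles.add(newFile);                 // log switch
            newFile.add("START_OF_BACKUP");        // the "dummy" record
            if (simulateCopyFailure) {
                throw new RuntimeException("copy failed");
            }
            List<List<String>> copy = new ArrayList<>();
            for (List<String> f : logFiles.subList(0, logFiles.size() - 1)) {
                copy.add(new ArrayList<>(f));      // copy pre-switch files
            }
            lastBackupFailed = false;
            return copy;
        } catch (RuntimeException e) {
            lastBackupFailed = true;               // assumed failure marking
            return null;
        } finally {
            writesSuspended = false;               // let writers continue
        }
    }
}
```

Writers are released in the finally block so that a failed backup does
not leave the system suspended.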

    MM> Will a similar approach work when we support real online backup, where
    MM> threads are allowed to continue writing to the log files during the
    MM> backup?

I am not sure how Suresh plans to implement restore of a real online
backup.  Today, the purpose of a log switch is to separate the log
records that need to be copied during backup from later log records.
When online backup is performed, one will also need the log generated
during the backup.  Maybe it would then make sense to do the log
switch at the end of the backup.  Anyhow, there will still be a need
for a dummy record in the new log file since there is no guarantee
that there will be any concurrent updates.

    MM> Also at some point we should be allowing copies of each log file as it
    MM> is finished - or at least as some unit of work is finished - maybe log
    MM> file, maybe checkpoint. In that case would we need to write a dummy
    MM> record in every file?

Are you thinking about some kind of archival service where log files
are automatically copied when the database switches to a new log file?
I guess what is important is to make sure that after recovery, the
database does not start to log to a file that has already been copied.
This makes me think that this is a general recovery problem, and
should not just be fixed in the context of backup.  We should then fix
recovery to redo any log switch that was completed before the crash.
One way to do this is to write a log record to the new log file on
every log switch.  This would work with the current recovery
implementation.  An alternative is to write some bytes somewhere in
the new log file in order to signal that the log switch is completed
and change the redo scan to handle that.
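
A small sketch of the first alternative.  It assumes a simplified
recovery model in which logging resumes in the last log file that
contains at least one record; that model, the class and the
"LOG_SWITCH" record are hypothetical and not taken from the Derby
code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of log switch and recovery; not Derby code.
class LogSwitchRecoverySketch {
    // Switch to a new log file, optionally writing a marker record
    // into it as part of the switch.
    static void switchLog(List<List<String>> logFiles, boolean writeMarker) {
        List<String> next = new ArrayList<>();
        if (writeMarker) {
            next.add("LOG_SWITCH");
        }
        logFiles.add(next);
    }

    // Assumed recovery rule: resume logging in the last file that
    // holds at least one record.  An empty new file is not noticed.
    static int resumeFileAfterRecovery(List<List<String>> logFiles) {
        int resume = 0;
        for (int i = 0; i < logFiles.size(); i++) {
            if (!logFiles.get(i).isEmpty()) {
                resume = i;
            }
        }
        return resume;
    }
}
```

Under this model, a crash right after a switch without the marker makes
recovery resume in the old file, which may already have been copied.
With the marker, recovery finds a record in the new file and the switch
is in effect redone.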

