db-derby-dev mailing list archives

From Suresh Thalamati <suresh.thalam...@gmail.com>
Subject Re: [jira] Commented: (DERBY-239) Need a online backup feature that does not block update operations when online backup is in progress.
Date Wed, 27 Jul 2005 22:16:51 GMT
Øystein Grøvlen wrote:

>>>>>>"ST" == Suresh Thalamati <suresh.thalamati@gmail.com> writes:
>>>>>>            
>>>>>>
>
>.....<snip>   
>
>    ST> I agree that by writing a log record for the start of the backup, we
>    ST> can prevent garbage-collection of log files.
>
>    ST> My initial thought is to simply disable garbage-collection of log
>    ST> files for the duration of the backup, unless there are some specific
>    ST> advantages in writing a backup-start log record.
>
>Disabling garbage-collection directly is probably the cleanest way to
>do this.
>
>How will you determine where to start the redo scan at recovery?  Do
>you need some mark in the log for that purpose?
>
>  
>
   I believe the starting point of the redo scan can be determined at
recovery from the checkpoint information available at the start of the
backup. The log.ctrl file contains the checkpoint information; this file
should be copied to the backup after disabling log-file garbage
collection, but before starting the data file copy operation.
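
As a rough illustration of the ordering described above, here is a small
Java sketch. It is only a model: the class, field, and step names are
hypothetical and are not Derby internals, and the final log-copy step is
taken from the "log copying period" mentioned elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the proposed backup ordering; not Derby code.
class BackupOrderSketch {
    private boolean logGcEnabled = true;
    private final List<String> steps = new ArrayList<>();

    List<String> runBackup() {
        // 1. Stop garbage-collection of log files so that every log file
        //    needed for redo stays available until the backup is complete.
        logGcEnabled = false;
        steps.add("disable-log-gc");
        try {
            // 2. Copy log.ctrl first: its checkpoint information marks
            //    where the redo scan starts when the backup is restored.
            steps.add("copy-log.ctrl");
            // 3. Copy the data files while updates continue.
            steps.add("copy-data-files");
            // 4. Copy the log files needed to make the copied data consistent.
            steps.add("copy-log-files");
        } finally {
            // 5. Re-enable log garbage-collection even if a copy step failed.
            logGcEnabled = true;
            steps.add("enable-log-gc");
        }
        return steps;
    }

    boolean isLogGcEnabled() {
        return logGcEnabled;
    }

    public static void main(String[] args) {
        System.out.println(new BackupOrderSketch().runBackup());
    }
}
```

The point of the sketch is only the order of the steps: log.ctrl is
captured after log garbage-collection is off and before any data file is
copied.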


>   
>
>
>    >> Generally, we cannot give a guarantee that operations that are
>    >> performed during backup are reflected in the backup.  If I have
>    >> understood correctly, transactions that commit after the data copying
>    >> is finished will not be reflected.  Since a user will not be able to
>    >> distinguish between operations committed during data copying and
>    >> operations committed during log copying, he cannot be sure concurrent
>    >> operations are reflected in the backup.
>    >> 
>    >> 
>
>
>    ST> I agree with you that one cannot absolutely guarantee that the
>    ST> backup will include all operations committed up to a particular
>    ST> time.  But the backup design depends on the transaction log
>    ST> to bring the database to a consistent state, because while the
>    ST> data files are being copied it is possible that some of the pages
>    ST> are written to disk.  So we need the transaction log until
>    ST> the data files are copied for sure.  If a user commits a non-logged
>    ST> operation while the data files are being copied, he/she would
>    ST> expect it to be in the backup, just like a logged operation.
>
>My point was that a user will not be able to distinguish between the
>data file copying period and the log copying period.  Hence, he does
>not know whether his operation was committed while the data files were
>being copied.
>
>    ST> Please note that non-logged operations in Derby are not
>    ST> explicit to users; most of the non-logged work is done by
>    ST> the system without the user's knowledge.
>
>I understand.
>
>
>    >> This is not more of an issue for a new backup mechanism than it is
>    >> currently for roll-forward recovery.  Roll-forward recovery will not be
>    >> able to recover non-logged operations either.
>
>    ST> Yes. Roll-forward recovery has the same issues. Once the log
>    ST> archive mode that is required for roll-forward recovery is
>    ST> enabled, all operations are logged, including operations
>    ST> that are not normally logged, such as create index.  But I think
>    ST> Derby currently does not handle this correctly: it does not
>    ST> force logging for non-logged operations that were started
>    ST> before log archive mode was enabled.
>
>The cheapest way to handle non-logged operations that started before
>backup/archive mode was enabled is to just make them fail and roll them
>back.  I think that would be an acceptable solution.
>
>  
>

  I like the idea, but I am not sure how users will react if an
operation fails in the middle because of backup/archive mode.
  I think one of the following options may be more acceptable:

           1)  make the backup/archive mode process wait until all
               transactions that have non-logged operations are committed.

            or

           2)  convert the non-logged operations to logging mode, after
               flushing the containers, once the backup starts.

My preference is option 1); it might be less complicated than option 2).
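
A minimal Java sketch of how option 1) could look, assuming a simple
counter of in-flight non-logged operations. All names here are
hypothetical and this is not Derby code. What should happen to non-logged
operations that arrive after the backup has started (log them or block
them) is left to the caller; the sketch only reports that a backup is
running.

```java
// Hypothetical gate for option 1): the backup waits for in-flight
// non-logged operations to commit before it starts copying.
class UnloggedOperationGate {
    private int activeUnlogged = 0;
    private boolean backupInProgress = false;

    // Called before a transaction starts a non-logged operation.
    // Returns false while a backup is running, so the caller can
    // fall back to logging the operation or wait.
    synchronized boolean tryBeginUnlogged() {
        if (backupInProgress) {
            return false;
        }
        activeUnlogged++;
        return true;
    }

    // Called when the transaction owning the non-logged operation ends.
    synchronized void endUnlogged() {
        activeUnlogged--;
        notifyAll(); // wake a backup that may be waiting
    }

    // Called by the backup: refuse new non-logged work, then wait until
    // the operations that were already running have finished.
    synchronized void beginBackup() {
        backupInProgress = true;
        try {
            while (activeUnlogged > 0) {
                wait();
            }
        } catch (InterruptedException e) {
            backupInProgress = false;
            Thread.currentThread().interrupt();
            throw new IllegalStateException("backup interrupted while waiting", e);
        }
    }

    synchronized void endBackup() {
        backupInProgress = false;
    }
}
```

The backup thread would call beginBackup() before copying anything and
endBackup() when it is done.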
 

>    >> If users needs that, we
>    >> should provide logged version of these operations.
>    >> 
>    >> 
>
>    ST> I think that during backup, non-logged operations should either
>    ST> be logged by the system or be blocked.
>
>I think blocking them should be acceptable to most users.
>
>  
>

I think converting non-logged operations to logged operations may be a
better choice.
Users who want to create indexes or import a small amount of data during
a backup will still be able to do so.
Users who are concerned about the performance of these operations can
stop the backup or wait until it is done. If the database is in log
archive mode, they can disable it using
SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE and re-enable archive mode
later with a fresh full backup.
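
For reference, a sketch of how these calls might look from JDBC. The
procedure names and parameters follow the Derby reference manual
(SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE takes a SMALLINT flag for
deleting archived log files; the backup-and-enable procedure takes a
backup directory plus the same flag), but please check them against your
Derby version. The class and method names are made up for the example.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper; only the SYSCS_UTIL procedure names come from Derby.
class LogArchiveModeSketch {
    static final String DISABLE =
        "CALL SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE(?)";
    static final String BACKUP_AND_ENABLE =
        "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE(?, ?)";

    // Turn log archive mode off, e.g. before a large index build or import.
    static void disableLogArchiveMode(Connection conn, boolean deleteArchivedLogs)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(DISABLE)) {
            cs.setShort(1, (short) (deleteArchivedLogs ? 1 : 0));
            cs.execute();
        }
    }

    // Take a fresh full backup and turn log archive mode back on.
    static void backupAndEnableLogArchiveMode(Connection conn, String backupDir,
                                              boolean deleteArchivedLogs)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(BACKUP_AND_ENABLE)) {
            cs.setString(1, backupDir);
            cs.setShort(2, (short) (deleteArchivedLogs ? 1 : 0));
            cs.execute();
        }
    }
}
```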


>    ST> If users are really concerned about performance, they will not
>    ST> execute them in parallel.
>
>This advice may work for backup, but not for enabling roll-forward
>recovery.  If I were a user concerned with performance, I think
>I would prefer to still create an index unlogged and rather recreate
>it if recovery is needed.  (I guess this would require roll-forward
>recovery to ignore updates to non-existing indexes.)  I could limit
>the vulnerability by making a backup after unlogged operations have
>been performed.
>
>  
>
   I like the idea of rebuilding the indexes during recovery, but we may
   want to do it as a separate project.
 

>By the way, how is normal recovery of unlogged operations handled? Is
>the commit of unlogged operations delayed until all data pages created
>by the operation have been flushed to disk?
>
>  
>
 
  Yes. I think that at commit time all pages of unlogged containers in the
cache are flushed to disk.
  To my knowledge, all non-logged operations happen on new containers,
and the container creation itself is logged. If a crash occurs before
the commit, the container will be dropped by the rollback of the CREATE
log record.
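
A toy Java model of that behaviour, as a sketch only. It assumes the
description above is accurate, treats one container as one transaction
for simplicity, and uses invented names throughout; it is not how Derby's
store is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: container creation is logged, page writes are not.
class UnloggedContainerSketch {
    private final List<String> log = new ArrayList<>();
    private final Map<String, Integer> cachedPages = new HashMap<>();
    private final Map<String, Integer> diskPages = new HashMap<>();

    void createContainer(String id) {
        log.add("CREATE " + id);   // the creation itself is logged
        diskPages.put(id, 0);
    }

    void unloggedWrite(String id, int pages) {
        // Pages go to the cache only; no log record is written.
        cachedPages.merge(id, pages, Integer::sum);
    }

    void commit(String id) {
        // Flush all cached pages of the container before the commit
        // record, so the data does not depend on the log.
        Integer cached = cachedPages.remove(id);
        if (cached != null) {
            diskPages.merge(id, cached, Integer::sum);
        }
        log.add("COMMIT " + id);
    }

    void crashAndRecover() {
        cachedPages.clear();       // the cache is lost in the crash
        for (String rec : new ArrayList<>(log)) {
            if (!rec.startsWith("CREATE ")) {
                continue;
            }
            String id = rec.substring("CREATE ".length());
            boolean finished = log.contains("COMMIT " + id)
                    || log.contains("ROLLBACK " + id);
            if (!finished) {
                // Rolling back the CREATE record drops the container.
                diskPages.remove(id);
                log.add("ROLLBACK " + id);
            }
        }
    }

    // Number of pages on disk, or null if the container does not exist.
    Integer pagesOnDisk(String id) {
        return diskPages.get(id);
    }
}
```

In this model a committed container keeps all of its pages after a crash,
while an uncommitted one disappears entirely, so no partial unlogged data
survives.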
 

Thanks
-suresh


