db-derby-dev mailing list archives

From Suresh Thalamati <suresh.thalam...@gmail.com>
Subject Re: [jira] Commented: (DERBY-239) Need a online backup feature that does not block update operations when online backup is in progress.
Date Thu, 28 Jul 2005 02:46:29 GMT
Suresh Thalamati wrote:

> Øystein Grøvlen wrote:
>>>>>>> "ST" == Suresh Thalamati <suresh.thalamati@gmail.com> writes:
>> .....<snip>  
>>    ST> I agree that by writing a log record for the start of the backup,
>>    ST> we can prevent garbage collection of log files.
>>    ST> My initial thought is to simply disable garbage collection of log
>>    ST> files for the duration of the backup, unless there are some specific
>>    ST> advantages in writing a backup-start log record.
>> Disabling garbage collection directly is probably the cleanest way to
>> do this.
>> How will you determine where to start the redo scan at recovery?  Do
>> you need some mark in the log for that purpose?
>   I believe the redo scan starting point can be determined at recovery
> from the checkpoint information available at the start of the backup.
>   The log.ctrl file contains the checkpoint information; this file
> should be copied to the backup after disabling log-file garbage
> collection, but before starting the data file copy operation.
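
The ordering described above (disable log-file garbage collection, copy log.ctrl, then copy the data files) can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not Derby's implementation: the `LogGcControl` hook is hypothetical, the final log-copy step reflects the "log copying period" mentioned later in this thread, and the `log/` and `seg0/` directory names are assumed for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

final class OnlineBackupOrderSketch {

    /** Hypothetical hook into the log manager; not an actual Derby interface. */
    interface LogGcControl {
        void disableLogGc();
        void enableLogGc();
    }

    /** Copies every regular file in 'from' to 'to', except the file named 'skip' (may be null). */
    static void copyFiles(Path from, Path to, String skip) throws IOException {
        Files.createDirectories(to);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(from)) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                if (Files.isRegularFile(f) && !name.equals(skip)) {
                    Files.copy(f, to.resolve(name), StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    /** Runs the backup steps in order and returns the names of the steps performed. */
    static List<String> backup(Path dbDir, Path backupDir, LogGcControl gc) throws IOException {
        List<String> steps = new ArrayList<>();
        Path logDir = dbDir.resolve("log");          // assumed layout
        Path backupLogDir = backupDir.resolve("log");

        // 1. Keep all log files from this point on.
        gc.disableLogGc();
        steps.add("disable-log-gc");
        try {
            // 2. Capture the checkpoint information before any data file is copied,
            //    so that recovery knows where to start the redo scan.
            Files.createDirectories(backupLogDir);
            Files.copy(logDir.resolve("log.ctrl"), backupLogDir.resolve("log.ctrl"),
                       StandardCopyOption.REPLACE_EXISTING);
            steps.add("copy-log.ctrl");

            // 3. Copy the data files while updates continue.
            copyFiles(dbDir.resolve("seg0"), backupDir.resolve("seg0"), null);
            steps.add("copy-data-files");

            // 4. Copy the log files, leaving the log.ctrl captured in step 2 untouched.
            copyFiles(logDir, backupLogDir, "log.ctrl");
            steps.add("copy-log-files");
        } finally {
            // 5. Allow log files to be reclaimed again, even if the copy failed.
            gc.enableLogGc();
            steps.add("enable-log-gc");
        }
        return steps;
    }
}
```

The point of the sketch is only the order of the steps: the copy of log.ctrl taken in step 2 is never overwritten by a later version of the file.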
>>    >> Generally, we cannot give a guarantee that operations that are
>>    >> performed during backup are reflected in the backup.  If I have
>>    >> understood correctly, transactions that commit after the data
>>    >> copying is finished will not be reflected.  Since a user will not
>>    >> be able to distinguish between operations committed during data
>>    >> copying and operations committed during log copying, he cannot be
>>    >> sure concurrent operations are reflected in the backup.
>>    >>
>>    ST> I agree with you that one cannot absolutely guarantee that
>>    ST> the backup will include operations committed up to a
>>    ST> particular time.  But the backup design
>>    ST> depends on the transaction log to bring the database to a
>>    ST> consistent state, because while data files are being copied,
>>    ST> it is possible that some of the pages are written to the disk.
>>    ST> So we need the transaction log until the data files are copied
>>    ST> for sure. If a user commits a non-logged operation while data
>>    ST> files are being copied, he/she would expect it to be there in
>>    ST> the backup, similar to a logged operation.
>> My point was that a user will not be able to distinguish between the
>> data file copying period and the log copying period.  Hence, he does
>> not know whether his operation was committed while the data files were
>> being copied.
>>    ST> Please note that non-logged operations in Derby are not
>>    ST> explicit to the users; most of the non-logged work is done by
>>    ST> the system without the user's knowledge.
>> I understand.
>>    >> This is not more of an issue for a new backup mechanism than it is
>>    >> currently for roll-forward recovery.  Roll-forward recovery will
>>    >> not be able to recover non-logged operations either.
>>    ST> Yes. Roll-forward recovery has the same issues. Once the log
>>    ST> archive mode that is required for roll-forward recovery is
>>    ST> enabled, all operations are logged, including operations
>>    ST> that are not normally logged, like create index.  But I think
>>    ST> Derby currently does not handle this correctly: it does not
>>    ST> force logging for non-logged operations that were started
>>    ST> before log archive mode is enabled.
>> The cheapest way to handle non-logged operations that started before
>> backup/archive mode was enabled is to just make them fail and roll them
>> back.  I think that would be an acceptable solution.
>  I like the idea, but I am not sure how users will react if an
> operation fails in the middle because of backup/archive mode being
> enabled.
>  I think one of the following options may be more acceptable:
>     1)  make the backup/archive mode process wait until all the
>         transactions that have non-logged operations are committed,
>     or
>     2)  convert the non-logged operations to logging mode after
>         flushing the containers, once the backup starts.
> My preference is option 1); it might be less complicated than option 2).
 On my way back home, I was thinking that maybe a better option is:

    3)  to make backup/(log archive mode enabling) fail if there
        are uncommitted transactions with non-logged operations, instead of
        making the backup process wait for the non-logged operations to
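
The three options discussed here (wait, convert to logged, or fail the backup) can be compared side by side with a small Java sketch. All names below (`Policy`, `Action`, `Txn`, `onBackupStart`) are hypothetical and exist only for this illustration; none of them is part of Derby.

```java
import java.util.List;

final class UnloggedTxnPolicySketch {

    /** The three alternatives from this thread; names are hypothetical. */
    enum Policy { WAIT_FOR_COMMIT, CONVERT_TO_LOGGED, FAIL_BACKUP }

    /** What the backup (or log-archive-mode enabling) would do under each policy. */
    enum Action { START_BACKUP, BLOCK_UNTIL_COMMITTED, FLUSH_AND_SWITCH_TO_LOGGING, REJECT_BACKUP }

    /** Minimal stand-in for an uncommitted transaction. */
    static final class Txn {
        final long id;
        final boolean hasUnloggedOps;

        Txn(long id, boolean hasUnloggedOps) {
            this.id = id;
            this.hasUnloggedOps = hasUnloggedOps;
        }
    }

    /** Decides what to do when a backup is requested while 'uncommitted' transactions exist. */
    static Action onBackupStart(Policy policy, List<Txn> uncommitted) {
        boolean anyUnlogged = false;
        for (Txn t : uncommitted) {
            if (t.hasUnloggedOps) {
                anyUnlogged = true;
            }
        }
        if (!anyUnlogged) {
            return Action.START_BACKUP;   // nothing in the way
        }
        switch (policy) {
            case WAIT_FOR_COMMIT:
                return Action.BLOCK_UNTIL_COMMITTED;        // option 1
            case CONVERT_TO_LOGGED:
                return Action.FLUSH_AND_SWITCH_TO_LOGGING;  // option 2
            default:
                return Action.REJECT_BACKUP;                // option 3
        }
    }
}
```

Under option 3 the caller sees the rejection immediately and can retry later, whereas option 1 hides the delay inside the backup call.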

>>    >> If users need that, we
>>    >> should provide logged versions of these operations.
>>    >>
>>    ST> I think that during backup, non-logged operations should either
>>    ST> be logged by the system or be blocked.
>> I think blocking them should be acceptable to most users.
> I think converting non-logged operations to logged operations may be
> a better choice. If a user wants to create indexes or import a small
> amount of data during backup, they will still be able to do so.
> If the user is concerned about the performance of these operations,
> they can stop the backup or wait until the backup is done. If the
> database is in log archive mode, they can disable it using
> SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE
> and re-enable archive mode with a fresh full backup.
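
For reference, disabling log archive mode and later re-enabling it together with a fresh full backup could look roughly like this from JDBC. The sketch assumes the system procedures as named in the Derby reference manual, `SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE` and `SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE`, each taking a SMALLINT flag that says whether existing archived log files should be deleted; the class and method names are made up for the example, and the connection and backup directory are supplied by the caller.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

final class LogArchiveModeSketch {

    static final String DISABLE_SQL =
        "CALL SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE(?)";
    static final String BACKUP_AND_ENABLE_SQL =
        "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE(?, ?)";

    /** Converts a boolean to the SMALLINT flag (1 or 0) the procedures expect. */
    static short flag(boolean deleteArchivedLogs) {
        return (short) (deleteArchivedLogs ? 1 : 0);
    }

    /** Turns log archive mode off, for example before a large unlogged operation. */
    static void disableArchiveMode(Connection conn, boolean deleteArchivedLogs) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(DISABLE_SQL)) {
            cs.setShort(1, flag(deleteArchivedLogs));
            cs.execute();
        }
    }

    /** Takes a fresh full backup into backupDir and turns log archive mode back on. */
    static void backupAndEnableArchiveMode(Connection conn, String backupDir,
                                           boolean deleteArchivedLogs) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(BACKUP_AND_ENABLE_SQL)) {
            cs.setString(1, backupDir);
            cs.setShort(2, flag(deleteArchivedLogs));
            cs.execute();
        }
    }
}
```

A caller would run `disableArchiveMode`, perform the index creation or import, and then call `backupAndEnableArchiveMode` so that roll-forward recovery starts again from the new backup.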
>>    ST> If the user is really concerned about performance, they will
>>    ST> not execute them in parallel.
>> This advice may work for backup, but not for enabling roll-forward
>> recovery.  If I were a user who was concerned with performance, I think
>> I would prefer to still create an index unlogged and rather recreate
>> it if recovery is needed.  (I guess this would require roll-forward
>> recovery to ignore updates to non-existing indexes.)  I could limit
>> the vulnerability by making a backup after unlogged operations have
>> been performed.
>   I like the idea of rebuilding the indexes during recovery, but we
>   may want to do it as a different project.
>> By the way, how is normal recovery of unlogged operations handled? Is
>> the commit of unlogged operations delayed until all data pages created
>> by the operation have been flushed to disk?
>  Yes.  I think at commit time all unlogged container pages in
> the cache are flushed to the disk.
>  To my knowledge, all non-logged operations happen on new
> containers, and the container creation part is logged.
>  If a crash occurs before the commit, the container will be dropped by
> the rollback of the CREATE log record.
> Thanks
> -suresh
