db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: [jira] Commented: (DERBY-239) Need a online backup feature that does not block update operations when online backup is in progress.
Date Sat, 29 Oct 2005 15:28:24 GMT
comments inline

Suresh Thalamati wrote:
> Hi Øystein,
> 
> Thanks for reviewing the patch. My answers are in-line...
> 
> Øystein Grøvlen (JIRA) wrote:
> 
>> [ http://issues.apache.org/jira/browse/DERBY-239?page=comments#action_12355620 ]
>> Øystein Grøvlen commented on DERBY-239:
>> ---------------------------------------
>>
>> I have looked at the onlinebackup_1.diff patch, and overall the patch
>> looks good.  However, I think there is one severe bug in
>> RAFContainer.java.  I guess that, due to the cut and paste from
>> writePage(), you are seeking in a different file from the one you are
>> writing to (fileData.seek(), but backupRaf.write()).
>>
> 
> fileData.seek() in the backup routine is wrong; I will fix it. Good catch.
> 
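For illustration, a minimal sketch of the fix (fileData/backupRaf are the
names from the patch; the surrounding method shape is an assumption):

    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Sketch: seek in the file that is actually being written. The bug
    // was positioning the source container (fileData.seek()) and then
    // writing to the backup file at its current, unrelated file pointer.
    void writePageToBackup(RandomAccessFile backupRaf, byte[] pageData,
                           long pageOffset) throws IOException {
        backupRaf.seek(pageOffset);   // was: fileData.seek(pageOffset)
        backupRaf.write(pageData, 0, pageData.length);
    }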
>> I have a few other comments and questions:
>>
>> General:
>>     * If I run backup with this patch, it seems like I will run the
>>       new code.  Does one not need to change the restore code to be
>>       able to restore such a backup, or does the ordinary recovery
>>       handle that?
>>
> 
> The backup structure is not changed, so the current file restore code 
> should work ok. The new issues related to online backup are log-replay 
> issues like redoing CREATE CONTAINER, new page creation, etc. I think 
> most of these issues are addressed as part of rollforward recovery. 
> I plan to rescan the restore code and do some more testing to find any 
> cases that are not handled by the current code.
> 
>> RawStore.java:
>>     * The BACKUP_FILTER now contains so much, that it would be useful
>>       to have a comment that says what is really left to copy.
> 
> 
> I agree, comments will be helpful here. I am not sure BACKUP_FILTER 
> will be needed any more. Currently the jar dir is the only thing it 
> does not filter; I plan to make the jars a separate copy after the 
> data segment as well, as discussed in my previous e-mail.
> 
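For illustration, a hedged sketch of what a documented filter could look
like (the directory names and the excluded set are assumptions, not the
actual RawStore constants):

    import java.io.File;
    import java.io.FileFilter;

    // Sketch: make explicit what the generic copy skips because a
    // dedicated copy phase handles it elsewhere.
    FileFilter backupFilter = new FileFilter() {
        public boolean accept(File f) {
            String name = f.getName();
            // data segment, transaction log and jars get their own
            // copy passes; everything else is copied generically
            return !(name.equals("seg0")
                  || name.equals("log")
                  || name.equals("jar"));
        }
    };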
> 
>>     * Intuitively, it seems wrong to hard-code "seg0", but I see that
>>       this is done all over the code.  
> 
> 
> Could you please file a JIRA entry for this one? I noticed it too; hard 
> coding is definitely not the correct way here. One of us can clean this up.

Yes, this is probably incorrect in a number of places.  There currently 
is only one segment.  The segment number was added in the beginning so 
that, if we ever had to do data partitioning ourselves, we had a chance 
to add it without an ugly upgrade.  But as you are seeing, over time 
some hard coding has happened, and if we ever really want another 
segment it will have to be cleaned up.

My opinion is that the OS can do better data partitioning than we can, 
so I don't think it is very useful to add that feature.
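As a first step, even just naming the literal would localize the hard
coding (sketch; the constant name is hypothetical):

    // One named constant instead of "seg0" scattered through the code.
    public static final String FIRST_SEGMENT_DIR = "seg0";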
> 
> 
>> Will there always be only one segment?
> 
> 
> Yes, at least for now. It is hard-wired somewhere in the access layer, 
> I think.
> 
> 
>> What is then the purpose of the segment concept?
> 
> 
> I think the initial design plan was to support some kind of data 
> partitioning by storing fragments of tables/indexes on different 
> segments, or by distributing tables into different segments. Each 
> segment could then be placed on a different disk, for performance 
> reasons or to avoid the Derby database size being limited to a single 
> disk/volume.
> 
> 
>>     * backup(File ...) seems like it would create an endless recursion
>>       if called.  Fortunately, it seems like it never will be
>>       called.  Why do we need the methods with a File parameter
>>       instead of a String?  The system procedures use the String
>>       variant.  Maybe we could just remove the File variant?
> 
> I think they are leftovers from the old Cloudscape versions; the File 
> variant is not supported any more, so these can be cleaned up. There 
> should be some File-variant calls in the access layer also. Could you 
> please file a JIRA for this cleanup, so that these won't be forgotten 
> again.
> 
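For illustration, a minimal sketch of the recursion hazard (the class and
method shapes are hypothetical):

    import java.io.File;

    class BackupApi {
        void backup(String path) {
            // ... the real backup work ...
        }

        void backup(File dir) {
            backup(dir);              // bug: resolves to itself, so any
                                      // call recurses until StackOverflowError
            // backup(dir.getPath()); // presumably the intended delegation
        }
    }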
> 
>>     * I think it would be helpful with a comment that explained why
>>       disableLogArchiveMode() was made synchronized.
> 
> 
> Yes, I definitely need to add comments about this synchronization.
> This was part of my attempt to prevent parallel backup actions, which I 
> did not complete as part of this patch.
> 
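For illustration, a hedged sketch of the intent behind the keyword (the
flag name is hypothetical):

    // Synchronized so that two concurrent backup-related calls serialize
    // on the monitor instead of racing on the log-archive flag.
    private boolean logArchived;

    public synchronized void disableLogArchiveMode() {
        logArchived = false;
    }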
>>
>> BaseDataFileFactory.java:
>>     * I do not think basing which files to back up on the contents of
>>       the seg0 directory is very robust.  What if someone by accident
>>       has written a file with a name that matches the pattern you are
>>       looking for?  Then I would think you may get a very strange
>>       error message that may not be easy to resolve.  Could not this
>>       be based on some system catalog?
> 
> 
> I think users should not be creating/deleting any files in seg0 to 
> start with; pattern matching is just a simple precaution. I thought of 
> scanning the catalogs to find the containers to back up, but could not 
> convince myself that it is required, at least for now.
> 
> If you think scanning the catalogs is more robust, this enhancement can 
> easily be done at a later point in time.

Be careful if you go down this route to make sure you pick up the files
that aren't in the catalogs.  I know the property conglomerate is not;
there may be others.  I think scanning is fine for the initial 
implementation of online backup.
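For illustration, a sketch of the pattern-matching precaution (the
"c<hex>.dat" naming of container files in seg0 is an assumption):

    import java.io.File;
    import java.io.FilenameFilter;

    // Sketch: copy only names that look like container files, so a stray
    // user file dropped into seg0 is ignored rather than backed up.
    FilenameFilter containerFiles = new FilenameFilter() {
        public boolean accept(File dir, String name) {
            return name.matches("c[0-9a-f]+\\.dat");
        }
    };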
> 
> 
>>  Another scenario is if someone
>>       by accident deletes a file for a table that is not accessed very
>>       often.  A later backup will then not detect that this file is
>>       missing.  Since the backup is believed to be successful, the
>>       latest backup of this file may be deleted.
>>
> 
> The backup is only as good as the database in this case :-)  If users 
> access the table for the deleted file from the backup DB, it will fail 
> with a "no container" error, just as it will on the main database.
> 
> With the catalog approach, backup could also only throw a warning
> about the deleted container. Once a user deletes a container file,
> there is no way out; users cannot even drop that table. I don't think
> making backup fail forever when this scenario occurs would be 
> acceptable to users.
> 
> 
>> FileContainer.java:
>>     * I cannot find anything backup-specific about getPageForBackup(),
>>       so I think a more general name would be better (e.g., getLatchedPage).
>>
> 
> Yes, currently this routine does not have any backup-specific stuff in 
> it. In the future, when the weight-based cache mechanism is 
> implemented, I would like to get the page from the cache with a 
> specific weight for the backup page.
> 
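For illustration, the rename would just reflect the generic latch-on-get
idiom (all names here are hypothetical, not Derby's actual page cache API):

    // Sketch: fetch the page through the cache and return it latched;
    // nothing in the body is backup specific.
    BasePage getLatchedPage(long pageNumber) throws StandardException {
        BasePage page = findPageInCache(pageNumber); // assumed cache lookup
        page.setExclusive();                         // latch before returning
        return page;
    }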
> 
>> RAFContainer.java:
>>     * The changes to this file seem not to be quite in line with some
>>       of the original design philosophies.  I am not sure that is
>>       necessarily bad, but it would be nice to hear the arguments for
>>       doing it this way.  More specifically:
>>           - While RAFContainer has so far used the
>>             StorageRandomAccessFile/StorageFile abstractions, backup
>>             uses RandomAccessFile/File directly.  Is there a particular
>>             reason for that?
> 
> 
> Yes. Currently there is no API available for users to specify the 
> type of storage factory to use for the backup. It would be a good 
> enhancement to add later.
> 
> The online backup implementation does what the current backup does, 
> i.e. it uses the disk I/O calls directly instead of the database's 
> storage factory. I think one reason to do it this way is that if a 
> database is using an in-memory storage factory, users can still back 
> up the database onto a disk. If the storage factory used by the 
> database is java.io.*, then backup is doing the same thing without 
> going through the StorageFactory interface.
> 
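For illustration, a sketch of that rationale (the method name is
hypothetical): the backup target is a plain filesystem path supplied by
the user, so java.io is used regardless of the database's own storage
factory:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    // Sketch: the backup location is always a real disk directory, even
    // if the database itself lives in an in-memory storage factory.
    OutputStream openBackupFile(File backupDir, String fileName)
            throws IOException {
        return new FileOutputStream(new File(backupDir, fileName));
    }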
> 
>>           - In order to be able to back up a page, new methods have
>>             been added to FileContainer and BasePage/CachedPage, and
>>             the RAFContainer is doing latching of pages.  This
>>             increases coupling to other modules; have alternative
>>             designs been considered?
> 
> 
> 
> I was considering the following minor changes here:
> 
> 1) Instead of getting the pageData byte array from the page object, 
> ask the page to write itself to the backup (like page.backup(...)).
> 
> 2) Move the backing up and unlatching of the page code into 
> FileContainer.java.
> 
> I have not thought of any better way to design this; any ideas 
> would be helpful.
> 
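For illustration, a sketch of change (1) (the signature is hypothetical):
the latched page writes itself, so the raw page buffer stays private:

    import java.io.IOException;
    import java.io.RandomAccessFile;

    class PageSketch {
        private byte[] pageData = new byte[4096]; // the page's own buffer

        // The container no longer pulls pageData out; it just asks the
        // latched page to write itself to the backup file.
        void backup(RandomAccessFile backupRaf, long pageOffset)
                throws IOException {
            backupRaf.seek(pageOffset);
            backupRaf.write(pageData);
        }
    }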
> 
>>     * privBackupContainer():
>>           - Setting 'done=true' at the start of the method is a bit
>>             confusing.  I would think another name for this variable
>>             would be better.
> 
> 
> I will change it to 'backupComplete = false' and reverse the exit 
> logic; hopefully that will make it more readable.
> 
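For illustration, the reversed logic might read like this (a sketch using
the proposed name):

    boolean backupComplete = false;
    try {
        // ... copy all allocated pages to the backup container ...
        backupComplete = true;            // flipped only on success
    } finally {
        if (!backupComplete) {
            // clean up the partial backup file here
        }
    }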
>>           - If an exception is thrown while holding a latch, does one
>>             not need to release the latch?
> 
> 
> Yes, the latch has to be released. I will make sure no latches are 
> held when an error occurs.
> 
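For illustration, the usual shape of that fix (the latch method names are
assumptions):

    page.setExclusive();         // latch the page (assumed API)
    try {
        writePageToBackup(page); // hypothetical copy step
    } finally {
        page.releaseExclusive(); // runs on every exit path, including errors
    }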
>>     * writeToBackup():
>>           - copies a lot of code from writePage().  One should
>>             consider factoring out common code.
> 
> 
> I will move the common code into a separate routine.
> 
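For illustration, one way to factor it (a hypothetical helper): both
writers funnel through a single seek-and-write, so the mismatch noted
below cannot reappear in two copies of the code:

    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Sketch: shared by writePage() and writeToBackup().
    private void writePageData(RandomAccessFile file, long pageOffset,
                               byte[] pageData) throws IOException {
        file.seek(pageOffset);
        file.write(pageData, 0, pageData.length);
    }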
>>           - Due to the cut and paste, you are seeking in a different
>>             file from the one you are writing to (fileData.seek(), but
>>             backupRaf.write()).
>>
> 
> you are right, I will fix it.
> 
>> LogToFile.java:
>>     * Why did you need to change getFirstLogNeeded() to handle a null
>>       checkpoint?  Is it in case you do backup before any checkpoint
>>       has been performed?
> 
> 
>    No. There will be a checkpoint before the backup. I added the null 
> check for testing drop-table stub cases without the checkpoint, and I 
> also noticed a null pointer error when shutting down in corrupt cases 
> if a checkpoint had not occurred by that time, so I thought there was 
> no harm in adding a null check there.
> 
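For illustration, the shape of the check (the accessor names are
assumptions about LogToFile internals):

    // Sketch: tolerate a missing checkpoint instead of dereferencing
    // null, e.g. during shutdown of a corrupt database.
    private long getFirstLogNeeded(CheckpointOperation checkpoint) {
        return (checkpoint != null)
            ? LogCounter.getLogFileNumber(checkpoint.undoLWM()) // assumed
            : -1;   // no checkpoint has been taken yet
    }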
> 
>>     * The javadoc for startLogBackup() says that 'log files are copied
>>       after all the data files are backed up', but at the end of the
>>       method the log files are copied.
>>
>> ReadOnly.java/InputStreamContainer.java:
>>     * I do not think javadoc should just be a copy of the
>>       interface/superclass; it should say something about what is
>>       particular to this implementation.  In this case, the javadoc
>>       should say that nothing is done.
>>
>> typos:
>>     * several occurrences of 'backedup'
>>     * LogToFile.java: 'eventhough'
>>     * RAFContainer.java: 'pahe cache', 'conatiner'
>>     * 'Standard Cloudscape error policy'.  Should this be changed to
>>       Derby?
> 
> 
> 
> I will correct the comments.
> 
> 
>>
>> Pet peeves:
>>     * Should have Javadoc on all methods, parameters and return
>>       values.
>>     * Lines longer than 80 chars.
>>
> 
> Those sound like good ones to follow. I will add the javadoc and fix 
> any long lines.
> 
> Thanks
> -suresh
