db-derby-user mailing list archives

From Jørgen Løland <Jorgen.Lol...@Sun.COM>
Subject Re: Derby Transaction Log Shipping
Date Fri, 08 Feb 2008 12:33:16 GMT
Hi Duncan,

First of all, the scenario you describe seems (to me) to be solved by 
the new replication functionality. However, I think it can be done the 
hard way with a plan similar to what you describe. Here goes :)

Log files can be found in <database_dir>/log. When you enable log 
archive mode, the log files will not be deleted. Hence, you do not need 
to perform a backup on days 2 and 3 - you may simply copy the log files 
from the <database_dir>/log directory.
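
For reference, log archive mode is normally switched on together with a 
full backup through a Derby system procedure. A minimal sketch of that 
call over JDBC, assuming the embedded Derby driver is on the classpath; 
the class name, method names and directory arguments are made up for 
illustration:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class EnableLogArchive {

    // Builds an embedded-driver JDBC URL for the given database directory.
    static String jdbcUrl(String databaseDir) {
        return "jdbc:derby:" + databaseDir;
    }

    // Takes a full backup into backupDir and enables log archive mode.
    // Needs the Derby embedded driver at runtime (assumption).
    static void backupAndEnableLogArchive(String databaseDir, String backupDir)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl(databaseDir));
             CallableStatement cs = conn.prepareCall(
                 "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE(?, ?)")) {
            cs.setString(1, backupDir);
            // 0 = do not delete archived log files created before this backup
            cs.setShort(2, (short) 0);
            cs.execute();
        }
    }
}
```

The second argument is left at 0 here so that older archived log files 
are kept; whether that is what you want depends on your setup.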

So, ideally, the steps would be like this:

Day 1: make a backup, copy it to the secondary location. Boot the 
secondary db and check that it is all ok
Day 2: copy the log files generated since the backup was made
Day 3: copy the log files generated since the backup was made
Day 4: boot secondary db, which now is in the same state as the primary 
was in when the log was copied on day 3.

With a few modifications, this should work just fine:

Problem, day 1: Assuming that users are allowed access to the primary 
database when you make the first backup (as indicated by your scenario), 
the data pages and log files will contain information from uncommitted 
transactions. When you boot the secondary to check that everything is 
ok, Derby will go through the same steps as when doing crash recovery. 
That means going through a redo phase (redoing operations in the log 
that are not reflected in the data pages) and an undo phase (basically 
aborting transactions that were active at the time the backup was made). 
The undo phase is key here because Derby performs operations on the data 
pages of the secondary that were never performed on the primary. This is 
fine if you want to use the secondary, but not if you want to keep 
sending it log files.

Solution: Don't allow any active transactions when you make the initial 
backup or (probably better in your scenario) don't boot the secondary 
database to check if it is ok. Wait until the primary has failed before 
booting it.

Problem, day 2 and 3: The log file with highest number copied on day 1 
(say logN.dat) may have been modified since you copied it.

Solution: Overwrite the secondary log file logN.dat with logN.dat from 
the primary database.
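
The copy step for day 2 and 3, including the overwrite of logN.dat, could 
be sketched in Java roughly as below. This is only an illustration: the 
class and method names are hypothetical, it assumes the log files are 
named log<number>.dat, it ignores any other files in the log directory, 
and it assumes a file can be copied safely at the moment the method runs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogShipper {
    // Assumed naming scheme for log files: log<number>.dat
    private static final Pattern LOG_NAME = Pattern.compile("log(\\d+)\\.dat");

    // Returns N for a file named logN.dat, or -1 for any other file.
    static long logNumber(Path file) {
        Matcher m = LOG_NAME.matcher(file.getFileName().toString());
        return m.matches() ? Long.parseLong(m.group(1)) : -1;
    }

    // Highest log number found in dir, or -1 if dir has no log files.
    static long highestLogNumber(Path dir) throws IOException {
        long highest = -1;
        if (!Files.isDirectory(dir)) {
            return highest;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                highest = Math.max(highest, logNumber(f));
            }
        }
        return highest;
    }

    // Copies every primary log file whose number is >= the highest number
    // already on the secondary. The file with that highest number is
    // overwritten, since the primary may have appended to it after the
    // previous copy. Returns the number of files copied.
    static int shipLogs(Path primaryLogDir, Path secondaryLogDir) throws IOException {
        long resumeFrom = highestLogNumber(secondaryLogDir);
        Files.createDirectories(secondaryLogDir);
        int copied = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(primaryLogDir)) {
            for (Path src : files) {
                long n = logNumber(src);
                if (n >= 0 && n >= resumeFrom) {
                    Files.copy(src, secondaryLogDir.resolve(src.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }
        return copied;
    }
}
```

Calling shipLogs once a day (or more often) with the primary's 
<database_dir>/log and the secondary's log directory would then keep the 
secondary's log files current without booting the secondary database.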

I think that should do it, but if you do not require this NOW, I would 
rather wait for replication in 10.4.

Good luck,
Jørgen


Duncan Groenewald wrote:
> I still don't know if I really understand the Derby model as it seems 
> the transaction logs are archived when a database backup is run.  So 
> here is a scenario:
> 
> 
> Day 1:  Backup Primary Derby (enabling logging), copy backup database to 
> secondary server and boot secondary server to check it is all OK.
> Day 2:  Backup Primary Derby DB and copy archived log files to secondary 
> server.
> Day 3:  Backup Primary Derby DB and copy new archived log files to 
> secondary server.
> Day 4:  Boot secondary Derby DB to check it's OK...  In theory then the 
> boot process will replay all the log files and the database should be in 
> the same state as the Primary was on Day 3 ?
> 
> 
> Somehow I don't think this would actually work - but I will give it a 
> try...
> 
> Here is the scenario I am trying to cater for:
> 
> 24x7 realtime system needs to be relocated to another site (or needs to 
> have a warm standby system that can be enabled in 15 minutes or less).
> Basic approach is to have two databases running and logs from the 
> primary are loaded on the secondary within a couple of minutes of them 
> being written.
> Transaction dumps on primary database are written to timestamped files 
> and file is renamed  TRXDUMP20080206091545212_DONE.DAT once dump write 
> process has completed.  A script checks for presence of *_DONE.DAT files 
> every 30 seconds and copies file to remote servers file system (or this 
> gets done by the dump process as well).  Script on the remote server 
> checks for presence of *_DONE.DAT files every 30 seconds and runs a 
> Transaction Load process on remote database to load the dump files.  At 
> any given point in time the remote site is always within a few minutes 
> of the primary site.
> 
> It seems unlikely one could do this with Derby because there are no 
> commands to periodically dump the transaction logs or to load the 
> transaction logs.
> 
> Cheers
> 
> On 08/02/2008, at 7:05 PM, Knut Anders Hatlen wrote:
> 
>> Duncan Groenewald <dagroenewald@optusnet.com.au> writes:
>>
>>> Thanks - the specification looks like its close to what I would like.
>>> The model I work from is one used by Sybase (and possibly  others)
>>> where you can specify a database dump and a separate  transaction log
>>> dump at defined intervals using a script or some  other programmatic
>>> method.  From what I can tell its not possible to  do this with Derby,
>>> since you can only dump the database and not the  logs.  Its also
>>> unclear how you would load a log file on its own.
>>>
>>> What I would like to see is two additional commands added to dump
>>> transaction logs to specified directory or file name and another
>>> command to load a transaction log file from a specified location/
>>> file.  Ideally a transaction log file load should function much the
>>> same way a normal user does to allow concurrent user access while
>>> loading a transaction log file.
>>
>> Not exactly what you want (it won't allow concurrent user access while
>> loading the transaction log), but you may achieve something similar with
>> log archiving and roll-forward recovery, combined with some creative
>> scripts. I haven't tried it myself, but you may get some ideas here:
>> http://db.apache.org/derby/docs/dev/adminguide/cadminrollforward.html
>>
>> -- 
>> Knut Anders
> 
> Duncan Groenewald
> mobile: +61406291205
> email: dagroenewald@optusnet.com.au
> 
> 
> 
> 
> 


-- 
Jørgen Løland
