db-derby-dev mailing list archives

From "V.Narayanan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2872) Add Replication functionality to Derby
Date Mon, 16 Jul 2007 11:03:04 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512911 ]

V.Narayanan commented on DERBY-2872:
------------------------------------

>Great that you guys are running with this one! Some comments to the
>functional specification:

Thank you for the reviews and comments Dag.

>* Derby doesn't log all operations by default, e.g. bulk
>  import, deleting all records from a table, creating an index.  These
>  issues were addressed in the work on online backup (DERBY-239),
>  partially by denying online backup if non-logged operations are not
>  yet committed, and partly by making them do logging when online
>  backup is in effect (the reason for not logging some operations is
>  performance). I guess for replication you would need to make them
>  do logging for the duration.

I agree; you are correct.

I thought I could get an idea of how this could be done by going through
the patches attached to DERBY-239.

I read through DERBY-239 and found the patches
onlinebackup(3&7).diff to be of interest to us.

The primary motivation of patch 3 was the following:

"To make a consistent online backup in this scenario, this patch:

1) blocks online backup until all the transactions with unlogged operation are
    committed/aborted.
2) implicitly converts all unlogged operations to logged mode for the duration
    of the online backup, if they are started when backup is in progress. "

Patch 7 addressed comments on patch 3.
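
A rough sketch of how the two rules quoted above could be combined for
replication. This is not the code from the DERBY-239 patches; the class
UnloggedOperationGate and all of its method names are hypothetical, and the
sketch assumes one unlogged operation per transaction to keep it short.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical gate, for illustration only; not part of Derby. */
final class UnloggedOperationGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition noUnloggedWork = lock.newCondition();
    private int activeUnloggedTransactions = 0;
    private boolean loggingRequired = false;

    /**
     * Called before an operation that is normally unlogged (bulk import,
     * index creation, ...). Returns true if it must be logged instead.
     */
    boolean beginPossiblyUnloggedOperation() {
        lock.lock();
        try {
            if (loggingRequired) {
                return true; // rule 2: convert to logged mode
            }
            activeUnloggedTransactions++;
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Called when a transaction that ran unlogged commits or aborts. */
    void endUnloggedTransaction() {
        lock.lock();
        try {
            activeUnloggedTransactions--;
            if (activeUnloggedTransactions == 0) {
                noUnloggedWork.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Rule 1: block until every in-flight unlogged transaction has ended. */
    void enterLoggedMode() {
        lock.lock();
        try {
            loggingRequired = true; // operations started from now on are logged
            while (activeUnloggedTransactions > 0) {
                noUnloggedWork.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Restores the default (unlogged) behaviour, e.g. when replication stops. */
    void leaveLoggedMode() {
        lock.lock();
        try {
            loggingRequired = false;
        } finally {
            lock.unlock();
        }
    }
}
```

For replication, enterLoggedMode() would presumably be called when
replication starts and leaveLoggedMode() when it stops, so that unlogged
operations pay the logging cost only for the duration.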

>* Overview of characteristics: you mention the network line as a
>  single point of failure, that's fine in a first version. One could
>  imagine having the replication service support more network interfaces 
>  to alleviate this vulnerability.

I agree!

Some random thoughts on the modifications that would be required.

When the master or slave has multiple network interfaces, each interface
can be reached through the IP address to which it is bound.

* The start replication command on the master and slave 
  will have to be modified to accept the multiple IP addresses
  of the peer.

* The log sender should be capable of detecting a failure to send
  to one address and of switching to the other.

* The log receiver should be modified to be able to listen on both
  interfaces.
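
The log sender change in the list above could look roughly like the sketch
below. Everything in it is an assumption made for illustration: Derby has no
LogTransport interface or FailoverLogSender class, and the real sender would
ship log records over its own connection rather than through this
abstraction.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical transport abstraction; stands in for the real network code. */
interface LogTransport {
    void send(String peerAddress, byte[] logChunk) throws IOException;
}

/** Hypothetical sender that fails over between the peer's addresses. */
final class FailoverLogSender {
    private final List<String> peerAddresses;
    private final LogTransport transport;
    private int current = 0; // index of the address currently in use

    FailoverLogSender(List<String> peerAddresses, LogTransport transport) {
        if (peerAddresses.isEmpty()) {
            throw new IllegalArgumentException("at least one peer address is required");
        }
        this.peerAddresses = new ArrayList<>(peerAddresses);
        this.transport = transport;
    }

    /**
     * Tries the current address first, then each remaining address once.
     * Returns the address that accepted the chunk, or empty if none did.
     */
    Optional<String> send(byte[] logChunk) {
        for (int attempt = 0; attempt < peerAddresses.size(); attempt++) {
            String address = peerAddresses.get(current);
            try {
                transport.send(address, logChunk);
                return Optional.of(address);
            } catch (IOException e) {
                // This interface failed; switch to the next one and retry.
                current = (current + 1) % peerAddresses.size();
            }
        }
        return Optional.empty();
    }
}
```

The sender stays on whichever address last worked, so a failed interface is
not retried on every chunk. An empty result means that no address was
reachable.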


>* Fail-over: When fail-over is performed (with a command on the
>  slave), I assume will the master be told to stop its replication so
>  it can tidy up? Since you describe the semantics of the stop
>  replication command as shutting down the slave database, I assume
>  failover can be performed without requiring a prior stop replication
>  command on the master.

You are correct. We would not issue a stop on the master to complete a
fail-over.

>  If reaching master is not possible (lost connection), can the stop
>  replication command be used against the master to tidy up even when
>  connection has been lost?

Not being able to reach the master would mean that the replication process
should stop automatically: the sender should quit trying to send logs.
The case you describe would therefore only arise if the stop command is issued
in the window between the master trying to connect to the slave and the
master stopping replication on its own.

In that case, a stop command should close down all replication behaviour,
regardless of whether a connection is actually established. The only
difference is that if the connection is OK, the slave is shut down as well.
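
As a minimal sketch of that rule: the names SlaveLink and
MasterReplicationControl are hypothetical and only illustrate the intended
behaviour of the stop command.

```java
/** Hypothetical view of the master's connection to the slave. */
interface SlaveLink {
    boolean isConnected();
    void shutDownSlave();
}

/** Hypothetical master-side control object, for illustration only. */
final class MasterReplicationControl {
    private final SlaveLink link;
    private boolean replicating = true;

    MasterReplicationControl(SlaveLink link) {
        this.link = link;
    }

    void stopReplication() {
        // Always tidy up locally, whether or not the slave is reachable.
        replicating = false;
        // Only when the connection is OK is the slave shut down as well.
        if (link.isConnected()) {
            link.shutDownSlave();
        }
    }

    boolean isReplicating() {
        return replicating;
    }
}
```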

> Perhaps it would be good to include in your table of commands any
> preconditions for the commands. BTW, It seems a good idea to not
> impose an order on starting server or slave first.

I will add a preconditions column to the table of commands and fill in the
information for each command. I will submit a v4 of the func spec for this.

>* Presumably, the master will time out if it is unable to send logs to
>  the slave for some (configurable?) period. It could keep trying for
>  some time but eventually it would need to stop or overflow the
>  buffer mechanism you suggest in DERBY-2926. Will you require that
>  the user has LOG_ARCHIVE_MODE enabled? If you do, it would seem a
>  nice addition to later be able to resume replication even if the
>  buffer had to be abandoned (as you suggest). *Before* the buffer is
>  abandoned, if the network becomes available again, it would be
>  trivial to resume shipping of logs I expect?

I agree this could be a great addition: if the buffer did overflow,
we could resume sending logs from the backed-up logs.
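
One way to picture this is a bounded buffer that, when it has to be
abandoned, remembers the oldest log position it had not yet shipped, so that
shipping could later restart from the archived logs at that point. This is
only a sketch under that assumption: BoundedShippingBuffer is a hypothetical
class, log positions are simplified to plain long values, and nothing here is
taken from the DERBY-2926 buffer design.

```java
import java.util.ArrayDeque;
import java.util.OptionalLong;

/** Hypothetical bounded buffer of unshipped log positions; not Derby code. */
final class BoundedShippingBuffer {
    private final ArrayDeque<Long> pending = new ArrayDeque<>();
    private final int capacity;
    private OptionalLong resumePosition = OptionalLong.empty();

    BoundedShippingBuffer(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Queues a log position; returns false once the buffer is abandoned. */
    boolean offer(long logPosition) {
        if (resumePosition.isPresent()) {
            return false; // already abandoned
        }
        if (pending.size() == capacity) {
            // Overflow: remember where shipping stopped, then drop the buffer.
            resumePosition = OptionalLong.of(pending.peekFirst());
            pending.clear();
            return false;
        }
        pending.addLast(logPosition);
        return true;
    }

    /** Removes the oldest queued position once it has been shipped. */
    OptionalLong poll() {
        Long next = pending.pollFirst();
        return next == null ? OptionalLong.empty() : OptionalLong.of(next);
    }

    /** Position from which archived logs would have to be re-read, if any. */
    OptionalLong resumePosition() {
        return resumePosition;
    }
}
```

As long as the network comes back before offer() first returns false, the
queued positions are still in the buffer and shipping can simply continue.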


>* Given that the system privileges work of DERBY-2109 provides us with
>  necessary security, I would hope we can lift the restriction that
>  administration commands can only be run from the same machine as the
>  server is started on, but for the time being the restriction makes
>  sense.

I agree.

>* You describe the replication commands as CLI commands against
>  NetworkServerControl; will you be making the commands available in
>  API form as well, so replication can be embedded in an application?

The slave is never booted while it is receiving logs, so we would not be
able to use a stored procedure here. This was the reason we had earlier
decided on a CLI command against NetworkServerControl.

However, if by public API you mean public methods in NetworkServerControl that
can be called from outside by anyone who wants to code an admin program at a
later stage, I think this is a great idea and can easily be done.

>* typos: "enclypted", "it's local log"

  Will fix this in the next version.

> Add Replication functionality to Derby
> --------------------------------------
>
>                 Key: DERBY-2872
>                 URL: https://issues.apache.org/jira/browse/DERBY-2872
>             Project: Derby
>          Issue Type: New Feature
>          Components: Miscellaneous
>    Affects Versions: 10.4.0.0
>            Reporter: Jørgen Løland
>            Assignee: Jørgen Løland
>         Attachments: proof_of_concept_master.diff, proof_of_concept_master.stat, proof_of_concept_slave.diff, proof_of_concept_slave.stat, replication_funcspec.html, replication_funcspec_v2.html, replication_funcspec_v3.html, replication_script.txt
>
>
> It would be nice to have replication functionality to Derby; many potential Derby users
> seem to want this. The attached functional specification lists some initial thoughts for how
> this feature may work.
> Dag Wanvik had a look at this functionality some months ago. He wrote a proof of concept
> patch that enables replication by copying (using file system copy) and redoing the existing
> Derby transaction log to the slave (unfortunately, I can not find the mail thread now).
> DERBY-2852 contains a patch that enables replication by sending dedicated logical log
> records to the slave through a network connection and redoing these.
> Replication has been requested and discussed previously in multiple threads, including
> these:
> http://mail-archives.apache.org/mod_mbox/db-derby-user/200504.mbox/%3c426E04C1.1070904@yahoo.de%3e
> http://www.nabble.com/Does-Derby-support-Transaction-Logging---t2626667.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

