db-derby-dev mailing list archives

From Jørgen Løland (JIRA) <j...@apache.org>
Subject [jira] Commented: (DERBY-2872) Add Replication functionality to Derby
Date Tue, 06 Nov 2007 10:39:51 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540400 ]

Jørgen Løland commented on DERBY-2872:
--------------------------------------

Dan,

Thanks for showing interest in replication. I'll answer your questions inline, and will update
the func spec with the results of the discussion later.

> The startslave command's syntax does not include a -slavehost option, but the comments
seem to indicate one is available.

You are right; will fix.

> How do startmaster and stopmaster connect to the master database?

In the current prototype implementation, all commands are processed in NetworkServerCommandImpl
by calling Monitor.startPersistentService(dbname, ...) and Monitor.findService(dbname, ...).
The plan is to change this to connection URL attributes later, e.g. 'jdbc:derby://host/db;startMaster=true'.
Note that since startslave is blocking, a connection attempt with 'jdbc:...;startslave=true' will
hang.
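
For illustration, this is roughly how the planned URL attribute might look from a client once
it is implemented (host, port and database name are placeholders, and the final attribute
syntax may differ):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    // Illustrative only: issuing the planned startMaster command through a
    // connection URL attribute. The exact syntax is not final.
    public class StartMasterSketch {
        public static void main(String[] args) throws SQLException {
            Connection conn = DriverManager.getConnection(
                    "jdbc:derby://masterhost:1527/db;startMaster=true");
            conn.close();
        }
    }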

> Do stopslave and startfailover need options to define the slavehost and port, otherwise
how do they communicate with the slave?

Since "startslave" blocks during LogToFile.recover, Monitor.startPersistentService does
not complete for this command. Calling Monitor.findService on the slave database therefore
does not work.

A way around this is to let the thread that receives log from the master and writes it to
the log file check for a flag value every X seconds. A Hashtable could, e.g., be added to Monitor
with setFlag(dbname, flagvalue) and getFlag(dbname) methods. The stopslave/failover commands
would then call Monitor.setFlag(slaveDBName, "failover"/"stopslave").
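
A minimal sketch of that flag mechanism (for illustration the flag table lives in its own
class rather than in Monitor, and none of these names exist in Derby):

    import java.util.Hashtable;

    // Illustrative only: a flag table that the stopslave/failover commands
    // write to and that the slave's log-receiving thread polls periodically.
    public class ReplicationFlags {
        private static final Hashtable<String, String> flags =
                new Hashtable<String, String>();

        public static void setFlag(String dbname, String flagvalue) {
            flags.put(dbname, flagvalue);
        }

        public static String getFlag(String dbname) {
            return flags.get(dbname);
        }
    }

    // In the slave's log-receiving thread, roughly:
    //
    //     while (receivingLog) {
    //         writeChunkToLogFile(receiveChunkFromMaster());
    //         String flag = ReplicationFlags.getFlag(slaveDbName);
    //         if ("stopslave".equals(flag) || "failover".equals(flag)) {
    //             break;   // stop receiving log and act on the flag
    //         }
    //     }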

A potential problem with this is authenticating the caller of the command, since the AuthenticationService
of the slave database is not reachable. I think the best solution would be to accept
failover/stopslave flags only if the connection with the master is down. Otherwise, if the connection
is working, stop and failover commands should only be accepted from the master.


> It's unclear exactly what the startmaster and stopmaster do, especially wrt to the state
of the database. Can a database be booted and active when startmaster is called, or does startmaster
boot the database? Similar for stopmaster, does it shutdown the database?

The "startmaster" command can only be run against an existing database 'X'. If 'X' has already
been booted by the Derby instance that will have the master role, "startmaster" will connect
to it and:

1) copy the files of 'X' to the slave (other transactions will be blocked during this step
   in the first version of replication; this may be improved later by exploiting online backup)
2) create a replication log buffer and make sure all log records are added to this buffer
3) start a log shipment thread that sends the log asynchronously.

If 'X' has not already been booted, "startmaster" will boot it and then do the above.

The "stopmaster" command will

1) stop log records from being appended to the replication log buffer
2) stop the log shipper thread from sending more log to the slave
3) send a message to the slave that replication for database 'X' has been stopped.
4) close down all replication related functionality without shutting down 'X'
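
To summarize the two commands, their responsibilities could be captured roughly like this
(the interface and method names are mine, not Derby's):

    // Illustrative only: the startmaster/stopmaster responsibilities
    // described above, expressed as an interface. No such type exists in Derby.
    public interface ReplicationMasterControl {

        /**
         * Boot 'dbName' if it is not already booted, copy its files to the
         * slave, create the replication log buffer, make sure all log records
         * are added to it, and start the asynchronous log shipper thread.
         */
        void startMaster(String dbName, String slaveHost, int slavePort);

        /**
         * Stop appending log records to the replication log buffer, stop the
         * log shipper, tell the slave that replication has stopped, and close
         * down replication without shutting down the database itself.
         */
        void stopMaster(String dbName);
    }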

> Is there any reason to put these replications commands on the class/command used to control
the network server? They don't fit naturally there, why not a replication specific class/command?
From the functional spec I can't see any requirement that the master or slave are running
the network server, so I assume I can have replication working with embedded only systems.

Implementing this in the network server means that the blocking startslave command will run
in a thread in the same VM as the server.

> How big is this main-memory log buffer, can it be configured?

In the initial version we use 10 buffers of 32KB each. 32K was chosen because this is the
size of the LogAccessFileBuffer.buffer byte[], which is the unit copied to the replication
buffer. We need multiple buffers so that log can be appended both while the log shipper
is sleeping and while it is busy shipping an older chunk of log. The number of buffers will
probably be adjusted once we get a chance to actually test the functionality, and it will be configurable.
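
A minimal sketch of such a multi-buffer scheme (class and field names are mine, not Derby's;
the real buffer will differ in detail, e.g. it accumulates log within a buffer rather than
using one chunk per buffer):

    // Illustrative only: a ring of BUFFER_COUNT fixed-size buffers. Appending
    // transactions block when every buffer holds unshipped log; the log
    // shipper frees buffers as it sends them to the slave.
    public class ReplicationLogBufferSketch {
        private static final int BUFFER_SIZE = 32 * 1024; // matches LogAccessFileBuffer.buffer
        private static final int BUFFER_COUNT = 10;       // initial default; will be configurable

        private final byte[][] buffers = new byte[BUFFER_COUNT][BUFFER_SIZE];
        private int appendIndex = 0;  // next buffer to fill
        private int shipIndex = 0;    // next full buffer to ship
        private int fullBuffers = 0;  // buffers holding unshipped log

        /** Called when a chunk of log (at most BUFFER_SIZE bytes) is written
         *  to disk; blocks the writing transaction if all buffers are full. */
        public synchronized void append(byte[] logChunk) throws InterruptedException {
            while (fullBuffers == BUFFER_COUNT) {
                wait();                    // buffer overflow: the transaction waits here
            }
            System.arraycopy(logChunk, 0, buffers[appendIndex], 0, logChunk.length);
            appendIndex = (appendIndex + 1) % BUFFER_COUNT;
            fullBuffers++;
            notifyAll();                   // wake the log shipper
        }

        /** Called by the log shipper thread; blocks until there is log to ship. */
        public synchronized byte[] nextChunkToShip() throws InterruptedException {
            while (fullBuffers == 0) {
                wait();
            }
            byte[] chunk = buffers[shipIndex].clone();  // copy so the slot can be reused
            shipIndex = (shipIndex + 1) % BUFFER_COUNT;
            fullBuffers--;
            notifyAll();                   // a slot was freed for appending transactions
            return chunk;
        }
    }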

> extract - "the response time of transactions may increase for as long as log shipment
has trouble keeping up with the amount of generated log records."
> Could you explain this more, I don't see the connection between the log buffer filling
up and response times of other transactions. The spec says the replication is asynchronous,
so won't user transactions still be only limited by the speed at which the transaction log
is written to disk?

In the current design, log records that need to be shipped to the slave are appended to the
replication log buffer at the same time they are written to disk. If the replication log buffer
is full, the transaction requesting the disk write has to wait for a chunk of log to be shipped
before its log records can be added to the buffer. I realize that it is possible to read the log
from disk if the buffer overflows. This is a planned improvement, but it is delayed for now due
to limited developer resources.

> The spec seems to imply that the slave can connect with the master, but the startmaster
command doesn't specify its own hostname or portnumber so how is this connection made?

The connection between the master and the slave will be set up as follows:

1) the slave sets up a ServerSocket
2) the master connects to the socket at the specified slave location (i.e. host:port)
3) the socket connection can then be used to send messages in both directions.

Thus, the slave does not contact the master - it only sends a message using the existing connection.
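
A minimal sketch of that connection setup (host and port values are placeholders; the real
code will of course layer Derby's replication messages on top of the socket):

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Illustrative only: the slave listens, the master connects, and the
    // resulting socket carries messages in both directions.
    public class ReplicationConnectionSketch {

        // On the slave (as part of the blocking startslave command):
        static Socket waitForMaster(int slavePort) throws IOException {
            try (ServerSocket server = new ServerSocket(slavePort)) {
                return server.accept();   // blocks until the master connects
            }
        }

        // On the master (as part of startmaster):
        static Socket connectToSlave(String slaveHost, int slavePort) throws IOException {
            return new Socket(slaveHost, slavePort);
        }
    }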


> Why if the master loses its connection to the slave will the replication stop, while
if the slave loses its connection to the master it keeps retrying? It seems that any temporary
glitch in the network connectivity has a huge chance of rendering the replication useless.
I can't see the logic behind this, what's stopping the master from keeping retrying. The log
buffer being full shouldn't matter should it, the log records are still on disk, or is it
that this scheme never reads the transaction log from disk, only from memory as log records
are created?

See the answer about response times above. Reading log from disk in case of replication buffer
overflow is definitely an improvement, but we are delaying it for now. It will be high priority
on the improvement todo-list.

> From reading between the lines, I think this scheme requires that the master database
stay booted while replicating, if so I think that's a key piece of information that should
be clearly stated in the functional spec. If not, then I think that the how to shutdown a
master database and restart replication(without the initial copy) should be documented.

Again, a correct observation. The master database has to stay booted while it is being replicated.

> Add Replication functionality to Derby
> --------------------------------------
>
>                 Key: DERBY-2872
>                 URL: https://issues.apache.org/jira/browse/DERBY-2872
>             Project: Derby
>          Issue Type: New Feature
>          Components: Miscellaneous
>    Affects Versions: 10.4.0.0
>            Reporter: Jørgen Løland
>            Assignee: Jørgen Løland
>         Attachments: master_classes_1.pdf, poc_master_v2.diff, poc_master_v2.stat, poc_master_v2b.diff,
poc_slave_v2.diff, poc_slave_v2.stat, poc_slave_v2b.diff, poc_slave_v2c.diff, proof-of-concept_v2b-howto.txt,
proof_of_concept_master.diff, proof_of_concept_master.stat, proof_of_concept_slave.diff, proof_of_concept_slave.stat,
replication_funcspec.html, replication_funcspec_v2.html, replication_funcspec_v3.html, replication_funcspec_v4.html,
replication_funcspec_v5.html, replication_funcspec_v6.html, replication_script.txt, slave_classes_1.pdf
>
>
> It would be nice to have replication functionality to Derby; many potential Derby users
seem to want this. The attached functional specification lists some initial thoughts for how
this feature may work.
> Dag Wanvik had a look at this functionality some months ago. He wrote a proof of concept
patch that enables replication by copying (using file system copy) and redoing the existing
Derby transaction log to the slave (unfortunately, I can not find the mail thread now).
> DERBY-2852 contains a patch that enables replication by sending dedicated logical log
records to the slave through a network connection and redoing these.
> Replication has been requested and discussed previously in multiple threads, including
these:
> http://mail-archives.apache.org/mod_mbox/db-derby-user/200504.mbox/%3c426E04C1.1070904@yahoo.de%3e
> http://www.nabble.com/Does-Derby-support-Transaction-Logging---t2626667.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

