From: Jørgen Løland (JIRA)
To: derby-dev@db.apache.org
Date: Fri, 28 Mar 2008 02:16:25 -0700 (PDT)
Subject: [jira] Commented: (DERBY-3562) Number of log files (and log dir size) on the slave increases continuously
Message-ID: <82270663.1206695785044.JavaMail.jira@brutus>
In-Reply-To: <372561205.1206042085162.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/DERBY-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582949#action_12582949 ]

Jørgen Løland commented on
DERBY-3562:
--------------------------------------

Mike Matrigali writes:
> I didn't quite follow all of this, and admit I am not up on replication.
> It would be nice if this process was the exact code as the normal
> checkpoint processing. So a checkpoint would be triggered and then
> after it had done its work it would do the appropriate cleanup. If you
> do the cleanup too soon then redo recovery of the slave won't work - is
> that expected to work, or at that point do you just restart from scratch
> from master?
> The existing code that replays multiple checkpoints may be weird as it
> may assume that this is recovery of a backed up database that is meant
> to keep all of its log files. Make sure to not break that.
> Is there a concept of a "fully" recoverable slave, i.e. one that is
> supposed to keep all of its log files so that it is recoverable in
> case of a data crash? As I said, may not be necessary as there is
> always the master. Just good to know what is expected.

Mike,

Thank you for expressing your concerns. I'll do my best to explain why I think the proposed solution will work.

The patch adds functionality to the checkpoint processing used during recovery (LogToFile#checkpointInRFR). During recovery, the dirty data pages are flushed to disk, and the log.ctrl file is updated to point to the new checkpoint currently being processed.

With the patch [1], the log files that are older than the currently processed checkpoint's Undo Low Water Mark (undo LWM) are then deleted. The undo LWM points to the earliest log record that may be required to do recovery [2]. Since the log files are processed sequentially and the data pages have been flushed, the undo LWM in the checkpoint is equally valid during recovery (aka slave replication mode) as during normal transaction processing.

Once replication has successfully started, the slave database will always be recoverable [3], but not in case of corrupted data blocks [4].
You may at any time crash the Derby instance serving the slave database and then reboot it. The former slave database will then recover to a transaction-consistent state that includes the modifications from all transactions whose commit log record was written to disk on the slave before the crash.

Please follow up if you think I may have misunderstood anything or did not answer your questions well enough.

[1] The patch only applies to slave replication mode. Backup is not affected, so as not to break the "fully" recoverable feature for backups.
[2] The first log record of the oldest transaction in the checkpoint's transaction table.
[3] If "fully" recoverable means recovering in the presence of corrupted data blocks, this is currently not supported for replication.
[4] Not including jar files, as explained in DERBY-3552.

> Number of log files (and log dir size) on the slave increases continuously
> --------------------------------------------------------------------------
>
>                 Key: DERBY-3562
>                 URL: https://issues.apache.org/jira/browse/DERBY-3562
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 10.4.0.0, 10.5.0.0
>         Environment: -
>            Reporter: Ole Solberg
>            Assignee: Jørgen Løland
>         Attachments: derby-3562-1a.diff, derby-3562-1a.stat, master_slave-db_size-6.jpg
>
>
> I did a simple test inserting tuples in a table during replication:
> The attached file 'master_slave-db_size-6.jpg' shows that
> the size of the log directory (and number of files in the log directory)
> increases continuously during replication, while on master the size
> (and number of files) never exceeds ~12 MB (12 files?) in this scenario.
> The seg0 directory on the slave stays at the same size as the master
> seg0 directory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
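
For reference, the cleanup rule described in the comment above (delete the log files that are older than the one holding the processed checkpoint's undo LWM) can be illustrated with a minimal Java sketch. This is not the code from the attached patch and not Derby's API: the class name, the method name, and the representation of log files as plain file numbers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: not Derby code. All names here are hypothetical.
class SlaveLogCleanupSketch {

    /**
     * Returns the log file numbers that may be deleted once a checkpoint
     * has been processed: every file strictly older than the file that
     * contains the checkpoint's undo low water mark (undo LWM).
     */
    static List<Long> deletableLogFiles(List<Long> logFileNumbers,
                                        long undoLwmFileNumber) {
        List<Long> deletable = new ArrayList<>();
        for (long fileNumber : logFileNumbers) {
            // The file holding the undo LWM itself must be kept, because
            // recovery may need to start reading from that record.
            if (fileNumber < undoLwmFileNumber) {
                deletable.add(fileNumber);
            }
        }
        return deletable;
    }

    public static void main(String[] args) {
        List<Long> onDisk = Arrays.asList(7L, 8L, 9L, 10L);
        // Assume the processed checkpoint's undo LWM lies in log file 9.
        System.out.println(deletableLogFiles(onDisk, 9L));
    }
}
```

The comparison is strict on purpose: a file is only a candidate for deletion if every record in it precedes the undo LWM, which matches the statement that the undo LWM is the earliest log record recovery may still need.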