From: Robert Newson
To: dev@couchdb.apache.org
Subject: Re: ensuring an update_seq is used at most once
Date: Mon, 12 Apr 2010 09:51:52 +0100

Would it be safer to have a low- and high-watermark for the update_seq in memory? What I mean is that the db writer will never write out an update_seq that is more than N higher than the last committed one; if it is forced to do so, to permit a write, it then fsyncs and resets high_seq to last_committed_seq.

This way you can genuinely ensure that you don't reuse an update_seq. In practice we could allow a large delta, one that is larger than the number of fsyncs we expect to manage in the commit interval.

Your idea to just bump the update_seq "significantly" mostly pans out (I know a system that does precisely this), but it would be a data loss scenario when it doesn't pan out.

B.

On Mon, Apr 12, 2010 at 3:54 AM, Adam Kocoloski wrote:
> Currently a DB update_seq can be reused if there's a power failure before the header is sync'ed to disk. This adds some extra complexity and overhead to the replicator, which must confirm before saving a checkpoint that the source update_seq it is recording will not be reused later. It does this by issuing an ensure_full_commit call to the source DB, which may be a pretty expensive operation if the source has a constant write load.
>
> Should we try to fix that? One way to do so would be to start at a significantly higher update_seq than the committed one whenever the DB is opened after an "unclean" shutdown; that is, one where the DB header is not the last term stored in the file. Although, I suppose that's not an ironclad test for data loss -- it might be the case that none of the lost updates were written to the file. I suppose we could "bump" the update_seq on every startup.
>
> Adam
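
For illustration only, here is a rough Python sketch of how the low/high watermark idea from the reply might combine with the bump-on-unclean-open idea from the quoted message. It is a toy in-memory model, not CouchDB code: the class WatermarkedSeq, its max_delta parameter (the "N" above) and the _fsync stand-in are all hypothetical names. It also assumes one reading of the scheme: the high watermark is always last_committed_seq + N, and after an unclean shutdown the writer skips past the old high watermark and commits that new starting point before handing out any further sequence numbers.

```python
class WatermarkedSeq:
    """Toy model of a bounded update_seq allocator (hypothetical, not CouchDB)."""

    def __init__(self, committed_seq, max_delta, clean_shutdown=True):
        if max_delta < 1:
            raise ValueError("max_delta must be at least 1")
        self.max_delta = max_delta          # the "N" from the email
        self.fsyncs = 0                     # how many forced commits happened
        self.committed_seq = committed_seq  # last seq known to be durable
        self.current_seq = committed_seq    # last seq handed out
        if not clean_shutdown:
            # Assumption: any seq up to committed_seq + max_delta may have
            # been handed out and then lost, so skip the whole window and
            # make the new starting point durable before continuing.
            self.current_seq = committed_seq + max_delta
            self._fsync()

    @property
    def high_seq(self):
        # High watermark: the largest seq that may be used without a commit.
        return self.committed_seq + self.max_delta

    def _fsync(self):
        # Stand-in for writing the DB header and fsync'ing the file.
        self.committed_seq = self.current_seq
        self.fsyncs += 1

    def next_seq(self):
        # Commit first if the next seq would cross the high watermark.
        if self.current_seq + 1 > self.high_seq:
            self._fsync()
        self.current_seq += 1
        return self.current_seq


if __name__ == "__main__":
    writer = WatermarkedSeq(committed_seq=100, max_delta=3)
    issued = [writer.next_seq() for _ in range(5)]
    durable = writer.committed_seq          # what survives a power failure
    reopened = WatermarkedSeq(durable, max_delta=3, clean_shutdown=False)
    print(issued, durable, reopened.next_seq())
```

Under these assumptions, every seq handed out before a crash is at most committed_seq + N, and every seq handed out after the unclean reopen is strictly greater than that, so no value is used twice. The cost is one forced commit every N updates under sustained load, which is why a large N is attractive.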