From: Mathias Leppich
To: user@couchdb.apache.org, Andreas Kemkes
Subject: Re: Replication and checkpoints - what to expect?
Date: Fri, 13 Jul 2012 08:55:41 +0200
Message-Id: <4DB46AA2-91CC-4B61-84FC-D1480EB071F4@muhqu.de>

Oh, now I'm getting your point!

The issue is that a filtered replication doesn't periodically checkpoint if no changes match the filter. This looks like a gap in the continuous replication "protocol". The filtered _changes feed that is used by the continuous replication should emit "empty" changes (only seq) when used with the heartbeat parameter.

Your idea of a "synchronization" document that passes all filters seems like a pretty straightforward workaround.

- mathias

On Jul 13, 2012, at 2:33, Andreas Kemkes wrote:

> Mathias:
>
> I had planned to not allow new entries into the source to let the continuous replications catch up, but I don't see how your approach changes the conundrum that "source_seq" - "checkpointed_source_seq" will only be 0 for the exceptional case that the last entry into the source gets through the filter.
>
> Given full coverage, there will be at least one replication globally where this value is indeed zero. Maybe a document that doesn't get filtered by any of the replications is the workaround (do-it-yourself boundary synchronization).
>
> I like your idea of graphing the "lag of changes". That may come in handy in other replication patterns.
>
> Thanks,
>
> Andreas
>
> From: Mathias Leppich
> To: user@couchdb.apache.org; Andreas Kemkes
> Sent: Thursday, July 12, 2012 12:13 AM
> Subject: Re: Replication and checkpoints - what to expect?
>
> Hi Andreas,
>
> with continuous replications and an ever-changing dataset there is no point where you can tell your replication is "up-to-date" in terms of "100% replicated", as replication always happens after the data has been written to the source database (which is a good thing).
>
> You need to change the way you measure the up-to-date-ness. Instead of measuring the percentage of completion, you should measure the lag of changes, e.g. targetDB is N changes behind sourceDB.
>
> With CouchDB 1.2 you get this number with a single request to /_active_tasks by calculating a replication's "source_seq" - "checkpointed_source_seq".
> Prior to 1.2 you can get this number too, but it's more difficult because you have to know the replication's _local ID and check the "source_last_seq" field in the replication's session document...
>
> Once you have the "lag of changes" for your continuous replications, it's a good idea to graph it with some monitoring tool to get the big picture of how replication performance develops through the day.
>
> - mathias
>
> On Jul 12, 2012, at 1:31, Andreas Kemkes wrote:
>
> > I wanted to follow up on this thread as I'm still experiencing difficulties using the feature and would like some advice on how best to deal with the situation.
> >
> > The goal is to break up a monolithic database into multiple databases, which was achieved after a lot of trial and error. Now the quest is to keep them in sync for a while by using filtered, continuous replications. Yet the replication gets stuck on the last sequence number that passes the filter. In the Futon UI, I see:
> >
> > Checkpointed source sequence 165850, current source sequence 166253, progress 99%
> >
> > If I start a non-continuous replication with the exact same parameters, it returns:
> >
> > {
> >   "ok": true,
> >   "no_changes": true,
> >   "session_id": ...,
> >   "source_last_seq": 165850,
> >   "replication_id_version": 2,
> >   ...
> > }
> >
> > It apparently knows that there are no changes and it knows the current source sequence. Why could it not move the checkpointed source sequence forward to match the current source sequence? What am I missing?
> >
> > Unless there is an exact match between checkpointed and current source sequence, how would one ever know if a replication is up-to-date?
> >
> > -- Andreas
> >
> >
> > ________________________________
> > From: Filipe David Manana
> > To: user@couchdb.apache.org; Andreas Kemkes
> > Sent: Thursday, June 21, 2012 12:40 PM
> > Subject: Re: Replication and checkpoints - what to expect?
> >
> >> The same should be true for filtered replications if there is no applicable document between the current source sequence and the last checkpoint. Otherwise you would always be wondering whether it has been replicated entirely.
> >
> > That's harder. With filtered replication, we only know about sequence numbers of changes that pass the filter.
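
For reference, the "lag of changes" calculation discussed in this thread ("source_seq" - "checkpointed_source_seq" from /_active_tasks) could look roughly like the sketch below. It is only an illustration under a few assumptions: sequence numbers are plain integers as in CouchDB 1.x, the server address http://localhost:5984 is a placeholder, and the helper names replication_lag and fetch_active_tasks as well as the sample replication ID "example-rep" are made up for this example. The sample numbers are the ones Andreas reported from Futon.

```python
import json
from urllib.request import urlopen


def replication_lag(tasks):
    """Return {replication key: source_seq - checkpointed_source_seq}.

    `tasks` is the parsed JSON list returned by /_active_tasks.
    Assumes integer sequence numbers (CouchDB 1.x style).
    """
    lags = {}
    for task in tasks:
        # /_active_tasks also lists indexers, compactions, etc.; skip those.
        if task.get("type") != "replication":
            continue
        # Fall back to "source -> target" if no replication_id is present.
        key = task.get("replication_id") or "{} -> {}".format(
            task.get("source"), task.get("target")
        )
        lags[key] = task["source_seq"] - task["checkpointed_source_seq"]
    return lags


def fetch_active_tasks(base_url="http://localhost:5984"):
    """Fetch and parse /_active_tasks. The default base_url is a placeholder."""
    with urlopen(base_url + "/_active_tasks") as resp:
        return json.loads(resp.read().decode("utf-8"))


# Sample data using the sequence numbers quoted in this thread.
# "example-rep" is a hypothetical replication ID.
sample = [
    {
        "type": "replication",
        "replication_id": "example-rep",
        "source_seq": 166253,
        "checkpointed_source_seq": 165850,
    },
    {"type": "indexer", "progress": 42},
]

print(replication_lag(sample))
```

As the thread points out, for a filtered continuous replication this number only drops to zero when the most recent change on the source passes the filter, so a non-zero lag does not by itself mean that documents are missing on the target.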