From: Paul Okstad
Subject: Re: The state of filtered replication
Date: Thu, 26 May 2016 07:04:19 -0700
To: user@couchdb.apache.org

I'll double-check my situation, since I have not thoroughly verified it.
This particular issue occurs across restarts of the server, where I make
no changes to the continuous replications in the _replicator DB, but it
may also be related to the issue of too many continuous replications
causing replications to stall out from lack of resources. It's possible
that I assumed they were starting over from seq 1 when in fact they were
never able to complete a full replication in the first place.

--
Paul Okstad

> On May 26, 2016, at 2:51 AM, Robert Newson wrote:
>
> There must be something else wrong. Filtered replications definitely
> make and resume from checkpoints, same as unfiltered.
>
> We mix the filter code and parameters into the replication checkpoint
> id to ensure we start from 0 for a potentially different filter.
> Perhaps you are changing those? Or maybe supplying since_seq as well
> (which overrides the checkpoint)?
>
> Sent from my iPhone
>
>> On 25 May 2016, at 16:39, Paul Okstad wrote:
>>
>> This isn't just a problem of filtered replication; it's a major issue
>> in the database-per-user strategy (at least in the v1.6.1 I'm using).
>> I'm also using a database-per-user design with thousands of users and
>> a single global database. If a small fraction of the users (hundreds)
>> has continuously ongoing replications from the user DB to the global
>> DB, it will cause extremely high CPU utilization. This is without any
>> JavaScript replication filter function at all.
>>
>> Another huge issue with filtered replications is that they lose their
>> place when replications are restarted. In other words, they don't keep
>> track of the sequence ID across server restarts or when the same
>> replication is stopped and started. So, for example, if I want to
>> perform a filtered replication of public documents from the global DB
>> to the public DB, and I have a ton of documents in the global DB, then
>> each time I restart the filtered replication it will begin from
>> sequence #1. I'm guessing this is because CouchDB does not know
>> whether the filter function has been modified between replications,
>> but this behavior is still very disappointing.
>>
>> --
>> Paul Okstad
>> http://pokstad.com
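For concreteness, here is a minimal sketch of the two pieces Robert
mentions, assuming CouchDB 1.6.x; the design document name ("app"), the
filter name ("public"), the database names, and the "visibility" field
are all illustrative, not taken from the thread. The filter source and
the query parameters are what get mixed into the checkpoint id.

    {
      "_id": "_design/app",
      "filters": {
        "public": "function(doc, req) { return doc.visibility === req.query.visibility; }"
      }
    }

A replication document in the _replicator database that uses this filter
could then look like:

    {
      "_id": "global_to_public",
      "source": "http://localhost:5984/global",
      "target": "http://localhost:5984/public",
      "continuous": true,
      "filter": "app/public",
      "query_params": { "visibility": "public" }
    }

If the filter source and query_params stay identical, restarting this
replication should resume from its stored checkpoint; changing either of
them gives the replication a new checkpoint id and it starts over from 0,
and supplying a "since_seq" field makes it start from that sequence
instead of the stored checkpoint, per Robert's explanation above.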
>>> On May 25, 2016, at 4:25 AM, Stefan Klein wrote:
>>>
>>> 2016-05-25 12:48 GMT+02:00 Stefan du Fresne:
>>>
>>>> So to be clear, this is effectively replacing replication (where the
>>>> client negotiates with the server for a collection of changes to
>>>> download) with a daemon that builds up a collection of documents that
>>>> each client should get (and also presumably delete), which clients
>>>> can then query for when they're able?
>>>
>>> Sorry, I didn't describe it well enough.
>>>
>>> On the server side we have one big database containing all documents
>>> and one DB for each user.
>>> The clients always replicate to and from their individual user DB,
>>> unfiltered. So the DB for a user is a 1:1 copy of their pouchdb/... on
>>> their client.
>>>
>>> Initially we set up a filtered replication for each user from the
>>> server's main database to the server-side copy of the user's database.
>>> With this we ran into performance problems, and sooner or later we
>>> probably would have run into issues with open file descriptors.
>>>
>>> So what we do instead is listen to the changes feed of the main
>>> database and distribute the documents to the server-side user DBs,
>>> which are then synced with the clients.
>>>
>>> Note: this is only for documents the users actually work with (as in,
>>> possibly modify); for queries on the data we query views on the main
>>> database.
>>>
>>> For the way back, we listen to _dbchanges, so we get an event for
>>> changes on the user DBs, fetch that change from the user's DB and
>>> determine what to do with it.
>>> We do not replicate users' changes back to the main database, but
>>> rather have an internal API to evaluate all kinds of constraints on
>>> user input.
>>> If you do not have to check user input, you could certainly listen to
>>> _dbchanges and "blindly" one-shot replicate from the changed DB to
>>> your main DB.
>>>
>>> --
>>> Stefan
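To make the fan-out approach Stefan describes concrete, here is a minimal
sketch, assuming CouchDB at http://localhost:5984, a main database named
"main", per-user databases named "userdb-<name>", and Node.js 18+ for its
global fetch; routeDoc() and the "owners" field it inspects are
placeholders for whatever routing rules your documents actually carry.
Note that this copies only the current winning revision of each document
(it is not a real replication, so revision history and conflicts are not
preserved), and deletions are skipped for brevity.

    const COUCH = 'http://localhost:5984';

    // Decide which user databases should receive a given document.
    // Purely illustrative -- real routing depends on your document schema.
    function routeDoc(doc) {
      return (doc.owners || []).map(function (name) { return 'userdb-' + name; });
    }

    async function fanOut(since) {
      since = since || 'now';
      for (;;) {
        // Long-poll the main database's _changes feed from the last seq seen.
        const res = await fetch(COUCH + '/main/_changes?feed=longpoll&include_docs=true&since='
          + encodeURIComponent(since));
        const body = await res.json();
        for (const change of body.results) {
          if (!change.doc || change.deleted) continue; // deletions not handled here
          for (const db of routeDoc(change.doc)) {
            const url = COUCH + '/' + db + '/' + encodeURIComponent(change.doc._id);
            // Carry over the target's current _rev (if the doc already exists
            // there) so the PUT does not fail with a 409 conflict.
            const existing = await fetch(url);
            const doc = Object.assign({}, change.doc);
            delete doc._rev;
            if (existing.ok) doc._rev = (await existing.json())._rev;
            await fetch(url, {
              method: 'PUT',
              headers: { 'Content-Type': 'application/json' },
              body: JSON.stringify(doc),
            });
          }
        }
        since = body.last_seq; // checkpoint for the next long poll
      }
    }

    fanOut().catch(console.error);

The reverse direction could follow the same loop shape against each user
DB's _changes feed (or a global updates feed such as /_db_updates), either
feeding an internal validation API as Stefan describes or triggering a
one-shot POST to /_replicate.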