From: Frank Wunderlich
Subject: Selective Replication
Date: Wed, 26 Sep 2012 17:34:50 +0200
To: user@couchdb.apache.org

Hi *,

I am currently trying to figure out how one could realize something like "selective replication" in CouchDB.

In our scenario we have around 10 physically distributed CouchDB instances running. There will probably be more than 1 million documents in our "master" instance. Only a subset of those documents shall be replicated to each of the "slave" instances, and users shall be able to explicitly control which documents are synchronized to which destination.

So far I have stumbled over the following two concepts:

1. Filtered Replication
2. Named Document Replication

At first glance, replication filters seemed to be the way to go.
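Just to make this concrete, a filter along the following lines is what we had in mind. The "targets" field as well as the database and instance names are made up for this sketch:

    // Filter function, stored under "filters.by_target" in a design
    // document (e.g. _design/replication) on the master.
    // "doc.targets" is a hypothetical field listing the slave instances
    // a document should go to; "req.query.target" is filled from the
    // query_params of the replication request below.
    function(doc, req) {
      if (!doc.targets) {
        return false; // no routing info -> do not replicate
      }
      return doc.targets.indexOf(req.query.target) !== -1;
    }

Replication would then be triggered once per slave with something like:

    POST /_replicate
    {
      "source": "assets",
      "target": "http://slave-1.example.com:5984/assets",
      "filter": "replication/by_target",
      "query_params": { "target": "slave-1" },
      "continuous": true
    }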
But unfortunately we have a rather "relational" document model: one logical "asset" consists of several CouchDB documents that reference each other, while a filter function can only access data that is part of the single document passed in as a parameter. Because of this limitation, each partial document would have to contain all the information necessary to determine whether it should be replicated or not. This leads to redundancy and to potential inconsistencies if a "transaction" fails: inconsistent asset aggregates might get "partially" transferred to other CouchDB instances, and in my eyes it will be hard to recognize and track down the cause of such inconsistencies. Furthermore, our content documents get "polluted" by purely technical attributes.

That's why we took a look at the second option: Named Document Replication. It seemed to be a good idea to separate the two concerns of persistence and synchronization:

First we would persist each "logical asset" in our local CouchDB. Once we know that this step succeeded and all partial documents are stored in the database, we would "register" the "logical asset" for synchronization. This step would happen in the application layer that is built on top of our CouchDB. The registration process would look up all partial documents that make up the "logical asset". Then any running replication job would get canceled (assuming we are using continuous replication). Finally, we would restart those replication jobs, adding the identified document ids to the JSON that gets POSTed to the _replicate URL (see the P.S. below for a sketch of the payloads).

The first attempts seemed promising. But when experimenting with larger sets of documents, we noticed a significant performance degradation during replication:

With 100,000 documents to be replicated, the "Named Document Replication" was 4 times slower than the complete and unconditional replication of the whole database.
With 200,000 documents, the selective approach was even 7 times slower.
With 1,000,000 documents, the factor was greater than 20.

So this approach does not scale well...

What are your thoughts about this? Is there anyone who has faced similar architectural questions? Any hint will be appreciated.

Best regards,
Frank

--
kreuzwerker GmbH - we touch running systems
fon +49 177 8780280 | fax +49 30 6098388-99
Ritterstraße 12-14, 10969 Berlin | frank.wunderlich@kreuzwerker.de
HR B 129427 | Amtsgericht Charlottenburg | Geschäftsführer: Tilmann Eing
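P.S. For completeness, here is roughly what we POST to the _replicate URL in the "Named Document Replication" approach (database and document ids are made up). First the running continuous job gets canceled by reposting its definition with "cancel" set:

    POST /_replicate
    {
      "cancel": true,
      "source": "assets",
      "target": "http://slave-1.example.com:5984/assets",
      "continuous": true
    }

Then it gets restarted with the identified document ids added:

    POST /_replicate
    {
      "source": "assets",
      "target": "http://slave-1.example.com:5984/assets",
      "doc_ids": ["asset-4711", "asset-4711-rights", "asset-4711-metadata"]
    }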