From: Paul Davis
Date: Sat, 18 Dec 2010 20:27:29 -0500
Subject: Re: How fast do CouchDB propagate changes to other nodes?
To: user@couchdb.apache.org

On Sat, Dec 18, 2010 at 8:23 PM, Randall Leeds wrote:
> On Sat, Dec 18, 2010 at 16:41, Paul Davis wrote:
>> Right, I probably jumped a couple steps there:
>>
>> The unique datums we have to work with here are the _id/_rev pairs.
>> Theoretically (if we ignore rev_stemming) the ordering with which
>> these come to us is roughly unimportant.
>>
>> So the issue with our history (append only) is that there's no real
>> way to order it such that we can efficiently seek through it to see
>> what we have in common (that I can think of). I.e., replication still
>> needs a way to say "I only need to send these bits". Right now it's the
>> src/dst/seq triple that lets us zip through only new edits.
>>
>> Well, theoretically, we could keep a merkle tree of all edits we've
>> ever seen and go that way, but that'd require keeping a history of
>> every edit ever seen which could never be removed.
>>
>> Granted this is just quick thinking. I could definitely be missing
>> something clever.
>>
>
> We're on the same page. I don't have anything clever yet either.
> The only other thing that's crossed my mind is some way to exchange
> information about checkpoints each participant has with a third party.
> You'd have to somehow verify that the checkpoint being presented to
> you is actually one created by the third party, which involves trust
> or verification. I like the verification route because I'd still love
> to decouple the endpoint from its hostname, the idea that I was
> stabbing quite horribly at when I prematurely proposed a couple
> patches to give databases uuids. But back to the point, something like
> "you got a bunch of edits since last we spoke, but I got these edits
> from this other endpoint, are they the same ones?" Even then, I'm not
> sure how this works without the merkle.
>

Your last bit there was exactly my idea about the bloom filter. Just
populate the filter with hashes of the uniquely identifying bits of an
edit and then send the filter around.
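
To make the bloom filter idea a bit more concrete, here is a rough
Python sketch. It is only an illustration under assumptions, not how
CouchDB replication works: the BloomFilter class, the edit_key()
encoding of an _id/_rev pair, the filter size, and the hash count are
all made up for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; size and hash count are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        # True means "possibly seen"; False means "definitely not seen".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def edit_key(doc_id, rev):
    # Hypothetical encoding of the uniquely identifying bits of an edit.
    return f"{doc_id}/{rev}"


def edits_to_send(local_edits, remote_filter):
    """Return the local (_id, _rev) pairs the remote has definitely not seen."""
    return [e for e in local_edits if edit_key(*e) not in remote_filter]


# The remote side populates a filter with the edits it already has ...
remote = BloomFilter()
for edit in [("doc1", "1-abc"), ("doc2", "1-def")]:
    remote.add(edit_key(*edit))

# ... and the local side checks its own edits against that filter.
local = [("doc1", "1-abc"), ("doc2", "2-123"), ("doc3", "1-999")]
missing = edits_to_send(local, remote)
```

One caveat with this approach: a Bloom filter has no false negatives
but can report false positives, so an edit the remote has never seen
could occasionally look "already seen" and get skipped. Something else
would have to catch those cases.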