From: Chris Anderson
To: dev@couchdb.apache.org
Date: Fri, 5 Mar 2010 14:01:10 -0800
Subject: Re: Preserving seq order through replication

On Fri, Mar 5, 2010 at 11:27 AM, Adam Kocoloski wrote:
> On Mar 5, 2010, at 2:13 PM, Randall Leeds wrote:
>
>> I believe replication right now sorts the incoming documents by seq
>> (comment says smth like 'just in case'), but then they are fetched
>> with some amount of concurrency and inserted as they arrive. Adam,
>> please chime in if I'm reading it wrong, as I think some of those
>> comments are yours.
>
> Yep, that's basically it.
>
>> On Fri, Mar 5, 2010 at 09:10, Adam Kocoloski wrote:
>>> With that said, making replication preserve the update order is probably not very difficult or expensive to do. Best,
>>>
>>> Adam
>>
>> It could be a performance loss if couch_rep_writer had to buffer
>> writes to preserve insertion order. Alternatively, to prevent the
>> write queue from growing unboundedly if one document repeatedly fails,
>> couch_rep_reader could wait for a chunk of contiguous documents before
>> handing them over to the writer.
>
> If we were to do this, I'd implement it that 2nd way, where the reader only hands over contiguous blocks to the writer, and doesn't "get too far ahead of itself"

I still think the whole notion is unnecessary complexity that creates guarantees we'd rather not have.
But I'm not gonna say don't write it. It's just that if someone relies
on this, we'll have to do extra work to explain to them why their code
broke when they scaled up. _local_seq should be considered a smell
(but sometimes a necessary one for realtime apps...)

>
>> Concurrently fetching documents rather than just pipelining them on
>> one http connection might not seem beneficial at first glance, but a
>> source which has disks that can service concurrent reads stands to
>> benefit. When the source can't actually do this it's up to the
>> OS/FS/Disk to order reads in a way that is as optimal as possible.
>
> +1
>
> Adam

-- 
Chris Anderson
http://jchrisa.net
http://couch.io
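
A minimal sketch of the second approach discussed in the thread, where the
reader hands only contiguous blocks to the writer and limits how far ahead
it fetches. This is illustrative Python, not CouchDB's Erlang code:
ContiguousBuffer, max_ahead, can_fetch and add are hypothetical names, and
"position" is assumed to be a document's index in the seq-sorted list of
changes (so gaps in the seq numbers themselves do not matter).

```python
from typing import Any, Dict, List


class ContiguousBuffer:
    """Hypothetical helper: releases fetched documents only in source order."""

    def __init__(self, max_ahead: int) -> None:
        # How many positions past the next unwritten one may be fetched.
        self.max_ahead = max_ahead
        # Position of the next document the writer is waiting for.
        self.next_position = 0
        # Documents that arrived early, keyed by position.
        self.pending: Dict[int, Any] = {}

    def can_fetch(self, position: int) -> bool:
        """True if fetching this position keeps the reader within its window."""
        return position < self.next_position + self.max_ahead

    def add(self, position: int, doc: Any) -> List[Any]:
        """Store a fetched document and return the contiguous block, if any,
        that is now ready to hand to the writer."""
        self.pending[position] = doc
        block: List[Any] = []
        while self.next_position in self.pending:
            block.append(self.pending.pop(self.next_position))
            self.next_position += 1
        return block


# Documents arrive out of order because they are fetched concurrently.
buf = ContiguousBuffer(max_ahead=4)
written: List[str] = []
for position, doc in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:
    written.extend(buf.add(position, doc))
```

The bounded window is what keeps the pending map from growing without limit
when one document repeatedly fails: once can_fetch returns False, the reader
stops issuing new fetches until the missing document arrives.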