Subject: Re: NPM, CouchDB and big attachments
From: Robert Newson
To: dev@couchdb.apache.org
Date: Wed, 27 Nov 2013 12:46:16 +0000

Alex,

That's basically the right approach, but I'll restate it in my own words
with some background.

A document with attachments is replicated as follows:

1) All the bytes of the attachments are written; the offset of the start
   of each chunk of each attachment is remembered in memory.
2) The document is transferred, and the chunk offsets are recorded
   atomically with the document write.

A crash before the end of step 1 forces a full restart for that
document. We certainly cannot show a document through the HTTP interface
in a partially replicated state (all attachments and the updated
document body must appear atomically at the target). Instead, we could
update the database (not the document) with the offsets and the _id/_rev
they belong to, to allow resumption. We'd need to clean that up
automatically, though; something like the way we remember the last purge
in the db header. As you say, we could then use Range headers to fetch
the parts we're missing from the source.

B.

On 27 November 2013 12:26, Alexander Shorin wrote:
> On Wed, Nov 27, 2013 at 3:59 PM, Robert Newson wrote:
>> Particularly, we could make
>> attachment replication resumable. Currently, if we replicate 99.9% of
>> a large attachment, lose our connection, and resume, we'll start over
>> from byte 0. This is why, elsewhere, there's a suggestion of 'one
>> attachment per document'. That is a horrible and artificial constraint
>> just to work around replicator deficiencies. We should encourage sane
>> design (related attachments together in the same document) and fix the
>> bugs that prevent heavy users from following it.
>
> I think the key issue there is that we're missing a semi-persistent
> buffer on the other side that could hold already-received data. In
> that case the replicator could use a Range header to send only the
> missing attachment chunks to the target (since the doc and the other
> bits are already there in the buffer). Once every bit has been sent
> successfully, the doc and its attachments move from this buffer into
> the target database (or are deleted after some timeout). But this
> isn't a good solution, right?
>
> --
> ,,,^..^,,,
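The resumption scheme sketched above (record the _id/_rev and byte offset of a partial transfer at the database level, then use a Range header to fetch only the missing bytes) can be illustrated with a minimal sketch. This is not CouchDB's actual replicator code; the names (AttachmentCheckpoint, resume_or_restart) are hypothetical, and persistence is stood in for by plain JSON serialisation:

```python
# Hypothetical sketch of resumable attachment replication. A checkpoint
# records which _id/_rev a partial transfer belongs to and how many bytes
# have landed, analogous to remembering the last purge in the db header.
import json
from dataclasses import dataclass, asdict


@dataclass
class AttachmentCheckpoint:
    doc_id: str          # _id the partial bytes belong to
    rev: str             # _rev the partial bytes belong to
    name: str            # attachment name
    bytes_received: int  # how far the transfer got before interruption

    def range_header(self) -> dict:
        # Ask the source only for the bytes we are still missing.
        return {"Range": "bytes=%d-" % self.bytes_received}

    def to_json(self) -> str:
        # Stand-in for persisting the checkpoint in the database.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, s: str) -> "AttachmentCheckpoint":
        return cls(**json.loads(s))


def resume_or_restart(checkpoint, doc_id, rev):
    """Return the byte offset to resume from. If the saved checkpoint
    matches the _id/_rev being replicated, resume there; otherwise the
    partial bytes are stale and we start over from byte 0."""
    if (checkpoint is not None
            and checkpoint.doc_id == doc_id
            and checkpoint.rev == rev):
        return checkpoint.bytes_received
    return 0
```

A replicator following this sketch would, on reconnect, load the checkpoint, call resume_or_restart, and issue a GET for the attachment with the Range header; the stale-checkpoint check matters because a new _rev may carry entirely different attachment bytes. Automatic cleanup (the timeout Alexander mentions) is deliberately left out here.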