Subject: Re: NPM, CouchDB and big attachments
From: Robert Newson
To: dev@couchdb.apache.org
Date: Wed, 27 Nov 2013 12:46:16 +0000

Alex,

That's basically the right approach, but I'll restate it in my own words
with some background.

A document with attachments is replicated as follows:

1) All the bytes of the attachments are written; the offset of the start
   of each chunk of each attachment is remembered in memory.
2) The document is transferred, and the chunk offsets are recorded
   atomically with the document write.

A crash before the end of step 1 forces a full restart for that
document. We certainly cannot show a document through the HTTP interface
in a partially replicated state (all attachments and the updated
document body must appear atomically at the target). Instead, we could
update the database (not the document) with the offsets and the _id/_rev
they belong to, to allow resumption. We'd need to clean that up
automatically, though; something like the way we remember the last purge
in the db header. As you say, we could then use Range headers to fetch
the parts we're missing from the source.

B.

On 27 November 2013 12:26, Alexander Shorin wrote:
> On Wed, Nov 27, 2013 at 3:59 PM, Robert Newson wrote:
>> Particularly, we could make
>> attachment replication resumable. Currently, if we replicate 99.9% of
>> a large attachment, lose our connection, and resume, we'll start over
>> from byte 0. This is why, elsewhere, there's a suggestion of 'one
>> attachment per document'. That is a horrible and artificial constraint
>> just to work around replicator deficiencies. We should encourage sane
>> design (related attachments together in the same document) and fix the
>> bugs that prevent heavy users from following it.
>
> I think the key issue there is that we're missing a semi-persistent
> buffer on the other side that could hold already-received data. In
> that case the replicator could use a Range header to send only the
> missing attachment chunks to the target (since the doc and the other
> bits are already there in the buffer). Once every bit has been sent
> successfully, the doc and its attachments move from this buffer into
> the target database (or are deleted after some timeout). But this
> isn't a good solution, right?
>
> --
> ,,,^..^,,,
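The resumption scheme sketched above (record the _id/_rev and byte offset of a partial transfer at the database level, then use a Range header to fetch only the missing bytes) can be illustrated with a minimal sketch. This is not CouchDB's actual replicator code; the names (AttachmentCheckpoint, resume_or_restart) are hypothetical, and persistence is stood in for by plain JSON serialisation:

```python
# Hypothetical sketch of resumable attachment replication. A checkpoint
# records which _id/_rev a partial transfer belongs to and how many bytes
# have landed, analogous to remembering the last purge in the db header.
import json
from dataclasses import dataclass, asdict


@dataclass
class AttachmentCheckpoint:
    doc_id: str          # _id the partial bytes belong to
    rev: str             # _rev the partial bytes belong to
    name: str            # attachment name
    bytes_received: int  # how far the transfer got before interruption

    def range_header(self) -> dict:
        # Ask the source only for the bytes we are still missing.
        return {"Range": "bytes=%d-" % self.bytes_received}

    def to_json(self) -> str:
        # Stand-in for persisting the checkpoint in the database.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, s: str) -> "AttachmentCheckpoint":
        return cls(**json.loads(s))


def resume_or_restart(checkpoint, doc_id, rev):
    """Return the byte offset to resume from. If the saved checkpoint
    matches the _id/_rev being replicated, resume there; otherwise the
    partial bytes are stale and we start over from byte 0."""
    if (checkpoint is not None
            and checkpoint.doc_id == doc_id
            and checkpoint.rev == rev):
        return checkpoint.bytes_received
    return 0
```

A replicator following this sketch would, on reconnect, load the checkpoint, call resume_or_restart, and issue a GET for the attachment with the Range header; the stale-checkpoint check matters because a new _rev may carry entirely different attachment bytes. Automatic cleanup (the timeout Alexander mentions) is deliberately left out here.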