couchdb-dev mailing list archives

From: Chris Anderson <jch...@apache.org>
Subject: Re: Attachment Replication Problem
Date: Sat, 16 May 2009 00:27:39 GMT
On Fri, May 15, 2009 at 5:16 PM, Antony Blakey <antony.blakey@gmail.com> wrote:
>
> On 15/05/2009, at 2:44 PM, Antony Blakey wrote:
>
>> I have a 3.5G CouchDB database consisting of 1000 small documents, each
>> with many attachments (0-30 per document), each attachment varying wildly
>> in size (1K..10M).
>>
>> To test replication I am running one server on my MBPro and another under
>> Ubuntu in VMware on the same machine. I'm testing against unmodified trunk.
>>
>> Doing a pull-replicate from OS X to Linux fails to complete, and the point
>> at which it fails is constant. I've added some debug logging to
>> couch_rep/attachment_loop like this: http://gist.github.com/112070 and made
>> the suggested "couch_util:should_flush(1000)" mod to try to guarantee
>> progress (but to no avail). The debug output shows this:
>> http://gist.github.com/112069 and the document it seems to fail on is this:
>> http://gist.github.com/112074 . I'm only just starting to look at this - any
>> pointers would be appreciated.
>
> I put some more logging in attachment_loop, specifically this:
>
>        {ibrowse_async_response, ReqId, Data} ->
>            ?LOG_DEBUG("ATTACHMENT_LOOP: ibrowse_async_response Data A ~p",
>                [Url]),
>            receive {From, gimme_data} -> From ! {self(), Data} end,
>            ?LOG_DEBUG("ATTACHMENT_LOOP: ibrowse_async_response Data B ~p",
>                [Url]),
>            attachment_loop(ReqId);
>
> The result is an enormous number of 'Data A' logs without the
> corresponding 'Data B'. This happens because make_attachment_stub_receiver
> uses a promise to read the data, created like this:
>
>        ResponseCode >= 200, ResponseCode < 300 ->
>            % the normal case
>            Pid ! {self(), continue},
>            %% this function goes into the streaming attachment code.
>            %% It gets executed by the replication gen_server, so it can't
>            %% be the one to actually receive the ibrowse data.
>            {ok, fun() ->
>                Pid ! {self(), gimme_data},
>                receive {Pid, Data} -> Data end
>            end};
>
> It seems that the promise is forced (i.e. the data actually read) only when
> the documents are checkpointed. If, as in my case, you have lots of small
> documents with many attachments, this results in massive numbers of open
> connections to download the attachments, each blocked reading the first bit
> of data from the first chunk, because the checkpointing occurs by default
> only after 10MB of document data has been read, excluding attachments. In
> any case, purely using size as a trigger won't work if you have lots of
> small documents with lots of small attachments. It would seem that the
> checkpointing, and hence the forcing of the http-reading promises, needs to
> also account for the number of outstanding promises.
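>
> Roughly what I have in mind - just an untested sketch, where
> should_checkpoint/3, PendingPromises and MaxPromises are made-up names for
> illustration, not anything that exists in couch_rep:
>
>        should_checkpoint(SizeThreshold, PendingPromises, MaxPromises) ->
>            % flush on accumulated doc size as now, or once too many
>            % attachment promises have been created but not yet forced
>            couch_util:should_flush(SizeThreshold) orelse
>                length(PendingPromises) >= MaxPromises.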
>
> To overcome this problem I used couch_util:should_flush(1) to ensure that
> each document would be checkpointed, but that simply demonstrated that this
> isn't the cause of the 100% repeatable replication hang that I'm seeing. Now
> I get a log trace like this: http://gist.github.com/112512 (ignoring the
> crap at the end of each log statement, which is my incomplete attempt to
> link each log line to the associated URL).
>
> Anyone with any thoughts?
>

Thanks for reporting this. I'm not sure I can see the issue in the
last logfile you posted (I haven't gone through the diffs to see where
you added log statements...). It seems the attachment size isn't the
issue; it's the fact that there are many, many attachments on each
doc. This means it should be fairly easy to make a reproducible
JavaScript test case that causes a never-finishing replication. Once
we have that, I'd be happy to run it and bang on the code till I get
it to pass.

I think the big problem is the architecture where attachments don't
start streaming until the doc itself is written to disk. There's no
reason it has to be this way: we could set up a queue of attachments
(and of the docs waiting on them), make its width configurable, and
begin the attachment transfers right away. I've written code like this
a few times, and it should be totally doable in this context.
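
Off the top of my head, the queue could be as simple as a fixed-width
pool - this is just an untested sketch with made-up names
(queue_attachments/2, FetchFuns, Width), not anything that's in
couch_rep today:

    %% start attachment downloads immediately instead of waiting for the
    %% doc write; Width caps the number of concurrent connections
    queue_attachments(FetchFuns, Width) ->
        pool(FetchFuns, Width, 0).

    pool([], _Width, 0) ->
        done;
    pool(Funs, Width, Running) when Funs =:= [] orelse Running >= Width ->
        %% at capacity (or out of new work): wait for a download to finish
        receive
            {'DOWN', _Ref, process, _Pid, _Reason} ->
                pool(Funs, Width, Running - 1)
        end;
    pool([Fun | Rest], Width, Running) ->
        spawn_monitor(Fun),
        pool(Rest, Width, Running + 1).

The docs waiting on those attachments would then be written out as
their downloads complete, instead of thousands of connections piling
up blocked on their first chunk.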

If you create a JS test case, that'll kick us into gear looking for the best fix.

-- 
Chris Anderson
http://jchrisa.net
http://couch.io
