couchdb-dev mailing list archives

From Antony Blakey
Subject Re: Attachment Replication Problem - Bug Found
Date Sat, 16 May 2009 04:59:22 GMT

On 16/05/2009, at 9:15 AM, Antony Blakey wrote:

> On 16/05/2009, at 8:27 AM, Chris Anderson wrote:
>> Thanks for reporting this. I'm not sure I can see the issue in the
>> last logfile you posted (I haven't gone through the diffs to see where
>> you added log statements...) It seems that the attachment size is not
>> an issue, it's the fact that there are many many attachments on each
>> doc. This means it should be fairly easy to make a reproducible
>> JavaScript test case, that causes a never-finishing replication. Once
>> we have that, I'd be happy to run it and bang on the code till I get
>> it to pass.
> I've created a test case with many documents, but it doesn't cause  
> the problem, so it must be somewhat more subtle than it looks.  
> Specifically, it may have something to do with the replication state  
> to that point.

To deal with the problem of outstanding promises I set  
couch_util:should_flush(1) - that's a separate issue. The bug that  
causes my replication to hang seems to be in ibrowse. The problem is  
that ibrowse is returning 1 more byte of data than it should, and so  
the following code in couch_db is failing because the case where  
LenLeft - size(Bin) < 0 isn't being caught. This blocks replication.  
When I wget the offending resource I get the correct length. The  
problem is with the second attachment (Perceive.png) in 

   write_streamed_attachment(_Stream, _F, 0, SpAcc) ->
       {ok, SpAcc};
   write_streamed_attachment(Stream, F, LenLeft, nil) ->
       Bin = F(),
       {ok, StreamPointer} = couch_stream:write(Stream, Bin),
       write_streamed_attachment(Stream, F, LenLeft - size(Bin), StreamPointer);
   write_streamed_attachment(Stream, F, LenLeft, SpAcc) ->
       Bin = F(),
       {ok, _} = couch_stream:write(Stream, Bin),
       write_streamed_attachment(Stream, F, LenLeft - size(Bin), SpAcc).
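
To make the arithmetic concrete, here is a rough standalone sketch of how LenLeft can step past zero when the last chunk is one byte too long. The module name lenleft_trace and the function trace/2 are made up for illustration and are not part of couch_db; the chunk sizes are invented, not taken from my logs.

```erlang
-module(lenleft_trace).
-export([trace/2]).

%% Hypothetical helper, not CouchDB code: returns the successive LenLeft
%% values seen as chunks of the given sizes are consumed. It stops at an
%% exact 0, at the first negative value, or when the size list runs out.
trace(LenLeft, _Sizes) when LenLeft =< 0 ->
    [LenLeft];
trace(LenLeft, []) ->
    [LenLeft];
trace(LenLeft, [Size | Rest]) ->
    [LenLeft | trace(LenLeft - Size, Rest)].
```

With a 10-byte attachment, lenleft_trace:trace(10, [4, 4, 2]) ends on 0, which the first clause above would match. With one extra byte, lenleft_trace:trace(10, [4, 4, 3]) ends on -1, so a clause that matches only a literal 0 is never reached and the function asks F() for more data.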

To enable replication to continue, a temporary fix is to replace the  
first case with this:

   write_streamed_attachment(_Stream, _F, LenLeft, SpAcc) when 1 > LenLeft ->
       {ok, SpAcc};

although maybe a better option is to *add* this case:

   write_streamed_attachment(_Stream, _F, LenLeft, SpAcc) when 0 > LenLeft ->
       ?LOG_ERROR("write_streamed_attachment has written too much data", []),
       {ok, SpAcc};

and truncate the binary to the expected length. I'm not familiar with  
ibrowse in terms of debugging this problem further.
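
The truncation idea could look roughly like the following. This is only a sketch under my own assumptions, not a patch: the module attach_sketch and write_streamed/2 are hypothetical names, and the chunks are accumulated into a binary instead of going through couch_stream:write/2 so that it can be run on its own.

```erlang
-module(attach_sketch).
-export([write_streamed/2]).

%% Sketch only: reads chunks from the zero-arity fun F until Len bytes have
%% been collected. A final chunk that overshoots is cut to the expected
%% length instead of driving the remaining count below zero.
write_streamed(F, Len) ->
    write_streamed(F, Len, <<>>).

write_streamed(_F, 0, Acc) ->
    {ok, Acc};
write_streamed(F, LenLeft, Acc) when LenLeft > 0 ->
    Bin = F(),
    case byte_size(Bin) of
        Size when Size > LenLeft ->
            %% The source delivered too much: keep only the expected bytes.
            <<Keep:LenLeft/binary, _Extra/binary>> = Bin,
            {ok, <<Acc/binary, Keep/binary>>};
        Size ->
            write_streamed(F, LenLeft - Size, <<Acc/binary, Bin/binary>>)
    end.
```

For example, if F hands back <<"abc">> and then <<"def">> while only 5 bytes are expected, the result is {ok, <<"abcde">>} and the trailing byte is dropped. Truncating hides the symptom in the same way the guard does, so the ibrowse side still needs to be looked at.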

Antony Blakey
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

I contend that we are both atheists. I just believe in one fewer god  
than you do. When you understand why you dismiss all the other  
possible gods, you will understand why I dismiss yours.
   --Stephen F Roberts
