couchdb-dev mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: replication using _changes API
Date Fri, 12 Jun 2009 12:59:42 GMT
Hi Damien, I'm not sure I follow.  My worry was that, if I built a  
replicator which only queried _changes to get the list of updates, I'd  
have to be prepared to process a very large response.  I thought one  
smart way to process this response was to throttle the download at the  
TCP level by putting the socket into passive mode.
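
The throttling idea in the paragraph above can be illustrated outside Erlang. What follows is a minimal sketch in Python, not the replicator's code: a local socket pair stands in for the HTTP connection, and the names `fill_until_blocked` and `demo` are hypothetical. It assumes only what passive mode implies, namely that the receiver gets data when it explicitly asks for it, so an idle receiver lets the kernel buffers fill and the sender stalls.

```python
import socket

LIMIT = 64 * 1024 * 1024  # safety cap so the loop cannot run forever


def fill_until_blocked(sock, chunk=b"x" * 4096, limit=LIMIT):
    """Send chunks until the kernel refuses more; return bytes accepted."""
    sock.setblocking(False)
    sent = 0
    while sent < limit:
        try:
            sent += sock.send(chunk)
        except BlockingIOError:
            # Kernel buffers are full: the sender is being throttled.
            break
    return sent


def demo():
    # socketpair() gives two connected stream sockets (a stand-in for
    # the server side and the replicator side of one connection).
    producer, consumer = socket.socketpair()
    try:
        # The consumer never reads, so the producer stalls once the
        # buffers between the two ends are full.
        first = fill_until_blocked(producer)

        # The consumer now pulls explicitly, as a passive-mode socket
        # owner would, until everything buffered has been read.
        drained = 0
        while drained < first:
            drained += len(consumer.recv(65536))

        # With the buffers empty again, the producer can make progress.
        second = fill_until_blocked(producer)
        return first, drained, second
    finally:
        producer.close()
        consumer.close()


if __name__ == "__main__":
    first, drained, second = demo()
    print(f"stalled after {first} bytes; read {drained}; then sent {second} more")
```

The sender is never told to slow down; it stalls only because the receiver stops reading. A client library that reads from the socket eagerly and forwards every chunk to the calling process removes that effect, whatever options it exposes.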

I agree that the HTTP client seems to be at fault, because the option  
that it exposes to switch to passive mode seems to be a no-op.  What  
exactly did you mean by "streams the data while not buffering the  
data"?

Best,

Adam

On Jun 12, 2009, at 8:03 AM, Damien Katz wrote:

> I don't think this is TCP's fault; it's the HTTP client. We need an  
> HTTP client that streams data while not buffering the data (low  
> level TCP already buffers some), instead of sending all the data  
> that comes in to the waiting process, essentially buffering  
> everything.
>
> -Damien
>
>
> On Jun 11, 2009, at 4:14 PM, Adam Kocoloski wrote:
>
>> I had some time to work on a replicator that queries _changes  
>> instead of _all_docs_by_seq today.  The first question that came to  
>> my mind was how to put a spigot on the firehose.  If I call  
>> _changes without a "since" query-string parameter on a 10M  
>> document DB I'm going to get 10M chunks of output back.
>>
>> I thought I might be able to control the flow at the TCP socket  
>> level using the inets HTTP client's {stream,{self,once}} option.  I  
>> still think this would be an elegant option if I can get it to  
>> work, but my early tests show that all the chunks still show up  
>> immediately in the calling process regardless of whether I stream  
>> to self or {self,once}.
>>
>> All for now, Adam
>

