From: Adam Kocoloski
To: dev@couchdb.apache.org
Subject: Re: replication using _changes API
Date: Fri, 12 Jun 2009 08:59:42 -0400
Message-Id: <5E34D6F6-51D1-43A3-96DA-8CAC00883A56@apache.org>
References: <56A7CF26-8B1D-4D98-A122-B5A77A55F337@apache.org>

Hi Damien,

I'm not sure I follow. My worry was that, if I built a replicator
which only queried _changes to get the list of updates, I'd have to be
prepared to process a very large response. I thought one smart way to
process this response was to throttle the download at the TCP level by
putting the socket into passive mode. I agree that the HTTP client
seems to be at fault, because the option that it exposes to switch to
passive mode seems to be a no-op.

What exactly did you mean by "streams the data while not buffering the
data"?

Best, Adam

On Jun 12, 2009, at 8:03 AM, Damien Katz wrote:

> I don't think this is TCPs fault, it's the HTTP client. We need a
> HTTP client that streams data while not buffering the data (low
> level TCP already buffers some), instead of sending all the data
> that comes in to the waiting process, essentially buffering
> everything.
>
> -Damien
>
>
> On Jun 11, 2009, at 4:14 PM, Adam Kocoloski wrote:
>
>> I had some time to work on a replicator that queries _changes
>> instead of _all_docs_by_seq today. The first question that came to
>> my mind was how to put a spigot on the firehose. If I call
>> _changes without a "since" qs parameter on a 10M document DB I'm
>> going to get 10M chunks of output back.
>>
>> I thought I might be able to control the flow at the TCP socket
>> level using the inets HTTP client's {stream,{self,once}} option. I
>> still think this would be an elegant option if I can get it to
>> work, but my early tests show that all the chunks still show up
>> immediately in the calling process regardless of whether I stream
>> to self or {self,once}.
>>
>> All for now, Adam
>
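
For illustration only, here is one way the "spigot" could be approximated above the socket level: request _changes in bounded batches and advance "since" only once a batch has been consumed. This is a sketch in Python, not the Erlang replicator discussed in the thread. It assumes that _changes accepts a "limit" query parameter alongside "since" and that each row in "results" carries a "seq" field; neither is confirmed by the thread. The names changes_url, iter_changes and fetch are hypothetical.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode


def changes_url(db_url: str, since, limit: int) -> str:
    """Build a _changes URL for one bounded batch.

    Assumption: the server honours both `since` and `limit`.
    """
    query = urlencode({"since": since, "limit": limit})
    return f"{db_url.rstrip('/')}/_changes?{query}"


def iter_changes(
    fetch: Callable[[str], str],
    db_url: str,
    since=0,
    batch_size: int = 1000,
) -> Iterator[dict]:
    """Yield change rows lazily, one batch per HTTP request.

    `fetch` is a hypothetical callable that takes a URL and returns the
    response body as text. The next batch is requested only after the
    caller has consumed the current one, so the consumer sets the pace.
    """
    while True:
        body = json.loads(fetch(changes_url(db_url, since, batch_size)))
        rows = body.get("results", [])
        if not rows:
            return  # caught up: nothing newer than `since`
        for row in rows:
            yield row
        # Resume after the last row seen (assumed row shape: {"seq": ...}).
        since = rows[-1]["seq"]
```

Because iter_changes is a generator, memory use is bounded by batch_size rather than by the size of the database. The cost is one request per batch, where a throttled stream would need only a single long response.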