From: Robert Newson
To: user@couchdb.apache.org
Date: Wed, 12 Oct 2011 16:32:45 +0100
Subject: Re: Performance issue with changes API

The 3.5M row response is not formed in memory. :) It's done line by line.

That said, that's almost 2000 rows per second, which doesn't sound that
bad to me.

B.

On 12 October 2011 16:26, Matt Goodall wrote:
> On 12 October 2011 14:22, Arnaud Bailly wrote:
>> Hello,
>> We have started experimenting with CouchDB as our backend, being especially
>> interested in the changes API, and we ran into performance issues.
>> We have a DB containing around 3.5M docs, each about 10K in size. Running
>> the following query on the database:
>>
>> http://192.168.1.166:5984/infowarehouse/_changes?since=0
>>
>> takes about 30 minutes on a 4-core, Windows 7 box, which seems rather high.
>>
>> Is this expected? Are there any benchmarks available for this API?
>
> I'm not too surprised - CouchDB is probably building a massive JSON
> changes response containing 3.5M items ;-). Instead you should use the
> since= and limit= args together to get the items in
> sensibly-sized batches, ending when you see no more items in the
> response.
>
> Alternatively, you might be able to use feed=continuous with timeout=0
> to stream the changes as fast as possible. The timeout=0 arg is just
> there to shut down the changes feed as soon as you've seen everything.
> My laptop takes about 50s to stream about 1M changes using this
> technique (sending the output to /dev/null).
>
> - Matt
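
A minimal sketch of the batched since=/limit= approach Matt describes,
in Python. The database URL is the one from the original message; the
batch size and the use of the requests library are illustrative
assumptions, not part of the thread.

    # Walk the _changes feed in fixed-size batches, stopping when a
    # batch comes back empty.
    import requests

    BASE = "http://192.168.1.166:5984/infowarehouse"
    BATCH = 10000  # assumed batch size; tune to taste

    since = 0
    while True:
        resp = requests.get("%s/_changes" % BASE,
                            params={"since": since, "limit": BATCH})
        resp.raise_for_status()
        body = resp.json()
        results = body.get("results", [])
        if not results:
            break                    # no more changes to fetch
        for change in results:
            pass                     # process change["id"], change["seq"], ...
        since = body["last_seq"]     # resume from the last sequence seen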
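
And a rough sketch of the feed=continuous plus timeout=0 variant. The
continuous feed emits one JSON object per line and, once the server has
sent everything it currently has, a final line containing last_seq;
timeout=0 closes the feed at that point instead of waiting for new
changes. The requests library is again an assumption.

    import json
    import requests

    resp = requests.get(
        "http://192.168.1.166:5984/infowarehouse/_changes",
        params={"feed": "continuous", "timeout": 0},
        stream=True,
    )
    for line in resp.iter_lines():
        if not line:
            continue                 # skip heartbeat / blank lines
        change = json.loads(line)
        if "last_seq" in change:
            break                    # server has sent everything it had
        # process change["id"], change["seq"], change["changes"] here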