Message-ID: <4CD28DD8.4060908@free.fr>
Date: Thu, 04 Nov 2010 11:41:28 +0100
From: cdr53x
To: user@couchdb.apache.org
Subject: Re: How to speedup view generation?

On 10/30/2010 03:52 PM, Anand Chitipothu wrote:
> I'm trying to setup a couchdb database with 14M documents. The view
> generation is taking too long. It is running at the rate of 22
> docs/sec right now. At this rate it will take 7days to build the view,
> which is too slow and I expect the speed to go down further as the
> view file size increase.

Hi,

What is the size of the design document's view files on disk? I noticed
that large views use quite large files ;).

I also noticed that the view group indexers take a large amount of time
to complete the last 30% of the task: at least twice as long as the
first 70%.

In my case I have a 'small' database containing 400K docs. I also have a
design doc that indexes 80% of the docs with 8 views. The map functions
only emit a single property per doc and a null value, so the views
should be compact. Yet the overall size of this design doc's .view file
on disk is 17G ;). I don't know how CouchDB handles updates to such
large files, but maybe the problem lies there...

Concerning performance, I use the standard JavaScript interpreter and
get a rate of ~60 changes/sec at the beginning of the process. It then
drops to 15 c/s after 70%, and to about 6 c/s after 85%. The first 70%
took 52 minutes and the whole process ran for 3h21m on a small
standalone dedicated server.

So I get the feeling that this is not an issue with the view
"calculation" algorithm, but probably something related to disk I/O.

I have no Erlang knowledge, and I might be quite wrong about this
feeling, but if you guys know this part of the CouchDB code a little,
maybe there is something that could be checked and would improve the
overall design doc refresh performance?

Regards,
cdrx
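
As a rough check on the figures quoted in this thread, the arithmetic can be sketched in Python. The function names are illustrative only (they are not part of CouchDB), and the sketch assumes a constant indexing rate within each phase, which the observed slowdown suggests is optimistic.

```python
SECONDS_PER_DAY = 86_400


def eta_days(total_docs: int, docs_per_sec: float) -> float:
    """Days needed to index total_docs at a constant rate (an assumption)."""
    return total_docs / docs_per_sec / SECONDS_PER_DAY


def tail_slowdown(total_min: float, head_min: float) -> float:
    """How many times longer the final phase took than the first phase."""
    return (total_min - head_min) / head_min


def minutes_per_percent(phase_min: float, phase_percent: float) -> float:
    """Wall-clock minutes spent per percentage point of progress."""
    return phase_min / phase_percent


# Anand's figures: 14M documents at 22 docs/sec.
print(f"{eta_days(14_000_000, 22):.1f} days")

# cdrx's figures: 3h21m in total, 52 minutes for the first 70%.
total_min = 3 * 60 + 21
head_min = 52
print(f"last 30% took {tail_slowdown(total_min, head_min):.1f}x the first 70%")

# Cost per percentage point of progress in each phase.
head_cost = minutes_per_percent(head_min, 70)
tail_cost = minutes_per_percent(total_min - head_min, 30)
print(f"{head_cost:.2f} vs {tail_cost:.2f} min per percent")
```

On these numbers the last 30% took close to three times as long as the first 70%, which is consistent with the "at least twice" estimate above.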