From: mike iannacone
Date: Mon, 13 Feb 2012 11:03:42 -0500
Subject: Re: views fail with "OS Process timed out."
To: Steve Foulkes
Cc: user@couchdb.apache.org
In-Reply-To: <4F391FDC.9020000@fnal.gov>

Thanks for the response. Looking around a bit more, it does seem like
our documents are larger than most people are using. Is there any
general guideline or rule of thumb as to how large documents should
be?

For some background, this is full of public health metrics and related
data, which we're compiling from several different sources. Each
document basically corresponds to one metric from one source. Many of
these were imported from csv files, so mapping one csv file to one
document made some sense for us. The documents each contain various
metadata (the source, the years, possibly some statistical info, etc),
and then a list of individual data objects.

It might make sense to split this up, so that each document only
contains the metadata, and has an attachment with the actual data.
Does that sound like a good approach, or am I on the wrong track with
that?

Mike

On Mon, Feb 13, 2012 at 9:36 AM, Steve Foulkes wrote:
> Hi,
>
>
> On 2/10/12 8:58 PM, mike iannacone wrote:
>>
>> Hi, I've been running into some rather strange errors when running my
>> view code in certain cases.  It seems to run fine until the size of
>> the database grows beyond a certain point, at which point I get
>> timeouts.  The confusing part is that this size where it starts
>> failing is quite low, around 1773 documents, totaling 402MB.
>>
>> environment:
>> This is my development server, running couchDB 1.1.1, built using the
>> build-couchdb tool as the wiki recommended, on a completely new Ubuntu
>> install.  (I reinstalled it a few hours ago, thinking it might be some
>> kind of environment problem.)
>>
>> overall process shown in the logs:
>>
>> *load a subset of documents, and confirm the views work
>>
>> *load most of the remaining documents, views work
>>  (this was done from the futon client, running on another machine.
>> It sees the connection time out, but view index builds ok anyway, and
>> completes a few minutes after the client has given up.  When the
>> client requests the view afterwards, it works fine, and fast now that
>> the index is done.)
>>
>> *upload another 18 documents, (the largest ones, ranging from 10M to
>> 22M,) view failed with "OS Process timed out."
>> The log of everything described up to this point is included.
>
>
> The large documents are the problem.  The view process is taking too long to
> process them and is timing out.  You can increase the timeout in the
> configuration which is accessible from futon, it's under "couchdb" and
> called "os_process_timeout".
>
> Steve
>
>>
>> This seems strange as it gave this error only now, when it took so
>> long previously.  At any rate, I increased the os_process_timeout
>> value to 10 minutes, and attempted it again, and it still timed out
>> after only a few seconds.  (this is shown in the second log file,
>> although it is essentially the same as the first.)
>>
>>
>> the actual view functions are shown in the log, but for convenience they
>> are:
>> "indicator_summary": {
>>             "map": "function(doc) {\n  if(doc.Data){\n    var temp =
>> {};\n    temp.Name = doc.Name;\n    temp.Description =
>> doc.Description;\n    temp.Sources = doc.Sources;\n    temp.SourceURL
>> = doc.SourceURL;\n    temp.Years = doc.Years;\n    temp.National =
>> doc.National;\n    temp.LocaleLevels = doc.LocaleLevels;\n
>> temp.Demographics = doc.Demographics;\n    temp.Unit = doc.Unit;\n
>> temp.UnitLabel = doc.UnitLabel;\n    temp.DataType = doc.DataType;\n
>>  temp.Category = doc.Category;\n    temp.TopCorrelated =
>> doc.TopCorrelated;\n    emit(doc.Name, temp);\n  }\n}"
>>         },
>>         "indicator_detail": {
>>             "map": "function(doc) {\n  if(doc.Data&&  doc.Years){\n
>>
>> for(var i=0; i> j> temp.Name = doc.Name;\n        temp.Description = doc.Description;\n
>>       temp.Sources = doc.Sources;\n        temp.SourceURL =
>> doc.SourceURL;\n        /*for(var k=0; k>          if(doc.National[k][doc.Years[i]]){\n            temp.National
>> = doc.National[k][doc.Years[i]];\n          }\n        }*/\n
>> temp.Demographics = doc.Demographics;\n        temp.Unit = doc.Unit;\n
>>         temp.UnitLabel = doc.UnitLabel;\n        temp.DataType =
>> doc.DataType;\n        temp.Category = doc.Category;\n
>> temp.Data = doc.Data;\n        temp.TopCorrelated =
>> doc.TopCorrelated;\n        emit([doc.Name, doc.Years[i]], temp);\n
>>    }\n    }\n  }\n}"
>>         }
>>
>>
>> Besides this, I've tried replicating to a second machine, and on that
>> one adjusting several values, with no real progress: increased erlang
>> heartbeat timeout, increased erlang heap size, increased spidermonkey
>> stack size.  These all either made no difference, or caused other
>> errors.  I admit I was kind of guessing when changing those, so it's
>> entirely possible that I was completely on the wrong track with those.
>>  At any rate, the logs I included (and the current state of that dev
>> machine) are with everything set to its default values, except for that
>> 10 minute os_process_timeout value I mentioned above.
>>
>> Any help would be fantastic, as I'm completely out of ideas at this
>> point.  I'd of course be glad to provide any additional info that
>> might be useful to you.
>>
>> Thanks!
>>     Mike
>
>
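
For illustration, the split proposed at the top of this message (metadata in the document, the bulky data list in an attachment) might look roughly like the sketch below. It is only a sketch under assumptions: the helper names splitIndicator and indicatorSummaryMap, the HasData flag, and the attachment name "data.json" are all hypothetical and do not appear anywhere in the thread, and the summary view emits an abbreviated field list for brevity. It relies on CouchDB accepting inline attachments as base64 in the "data" field of "_attachments", and it uses Node's Buffer for the encoding.

```javascript
// Sketch only: turn one large indicator document into a small metadata
// document that carries the Data list as an inline CouchDB attachment.
// "HasData" and "data.json" are hypothetical names chosen for this example.
function splitIndicator(doc) {
  var meta = {};
  for (var key in doc) {
    // Copy every own field except the bulky Data list.
    if (Object.prototype.hasOwnProperty.call(doc, key) && key !== "Data") {
      meta[key] = doc[key];
    }
  }
  if (doc.Data) {
    meta.HasData = true; // hypothetical flag so views can skip non-data docs
    meta._attachments = {
      "data.json": {
        content_type: "application/json",
        // Inline attachment body, base64-encoded JSON of the original list.
        data: Buffer.from(JSON.stringify(doc.Data), "utf8").toString("base64")
      }
    };
  }
  return meta;
}

// Map function for a summary view written against the slimmer document.
// It emits only a few summary fields and never touches the Data list.
function indicatorSummaryMap(doc) {
  if (doc.HasData) {
    emit(doc.Name, {
      Name: doc.Name,
      Description: doc.Description,
      Years: doc.Years,
      Unit: doc.Unit
    });
  }
}
```

With this layout the summary view only ever handles the small metadata documents. A per-year detail view would no longer be able to emit doc.Data, so clients would fetch the attachment separately when they need the raw values; whether that trade-off suits the application is a separate question.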