incubator-couchdb-user mailing list archives

From Simon Metson <simonmet...@googlemail.com>
Subject Re: views fail with "OS Process timed out."
Date Tue, 14 Feb 2012 00:55:09 GMT
Hi,
If your documents contain many "rows", it's probably better to store each "row" as a separate
document and collate them with views. If you use attachments, you can't (currently) build an index
on the data in the attachment, IIRC. You'll want to test with a subset of the data and get
a reasonable expectation of how it'll behave as the data grows before making a final decision.
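
A minimal sketch of the one-row-per-document layout, not a tested recipe for your data. The field names "type", "parent", "index" and "row" are hypothetical; "Name", "Years" and "Data" are taken from your view code. The emit() defined here is only a stand-in so the sketch runs outside CouchDB; inside a design document CouchDB supplies its own.

```javascript
// Stand-in for CouchDB's emit() so this sketch runs under Node.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Split one large metric document into a small metadata document
// plus one small document per entry of its Data array.
// "type", "parent", "index" and "row" are hypothetical field names.
function splitDocument(doc) {
  var meta = {};
  for (var key in doc) {
    if (key !== "Data" && key !== "_rev") {
      meta[key] = doc[key];
    }
  }
  meta.type = "metric";

  var rows = doc.Data.map(function (row, i) {
    return {
      _id: doc._id + ":row:" + i,
      type: "row",
      parent: doc._id,
      index: i,
      row: row
    };
  });
  return [meta].concat(rows);
}

// Map function that collates a metric with its rows: the metadata
// sorts first under [id, 0], its rows follow under [id, 1, index].
function collateMap(doc) {
  if (doc.type === "metric") {
    emit([doc._id, 0], doc.Name);
  } else if (doc.type === "row") {
    emit([doc.parent, 1, doc.index], doc.row);
  }
}

// Usage with a made-up document.
var docs = splitDocument({
  _id: "m1",
  Name: "Example metric",
  Years: [2010],
  Data: [{ v: 1 }, { v: 2 }]
});
docs.forEach(collateMap);
```

With keys shaped like this, a view query with startkey=["m1"] and endkey=["m1",{}] should return the metadata row followed by that metric's data rows, and each document the view server has to process stays small.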
Cheers
Simon


On Monday, 13 February 2012 at 16:03, mike iannacone wrote:

> Thanks for the response. Looking around a bit more, it does seem like
> our documents are larger than most people are using. Is there any
> general guideline or rule of thumb as to how large documents should
> be?
> 
> For some background, this is full of public health metrics and related
> data, which we're compiling from several different sources. Each
> document basically corresponds to one metric from one source. Many of
> these were imported from csv files, so mapping one csv file to one
> document made some sense for us. The documents each contain various
> metadata (the source, the years, possibly some statistical info, etc),
> and then a list of individual data objects. It might make sense to
> split this up, so that each document only contains the metadata, and
> has an attachment with the actual data. Does that sound like a good
> approach, or am I on the wrong track with that?
> 
> Mike
> 
> On Mon, Feb 13, 2012 at 9:36 AM, Steve Foulkes <sfoulkes@fnal.gov> wrote:
> > Hi,
> > 
> > 
> > On 2/10/12 8:58 PM, mike iannacone wrote:
> > > 
> > > Hi, I've been running into some rather strange errors when running my
> > > view code in certain cases.  It seems to run fine until the size of
> > > the database grows beyond a certain point, at which point I get
> > > timeouts.  The confusing part is that this size where it starts
> > > failing is quite low, around 1773 documents, totaling 402MB.
> > > 
> > > environment:
> > > This is my development server, running CouchDB 1.1.1, built using the
> > > build-couchdb tool as the wiki recommended, on a completely new Ubuntu
> > > install.  (I reinstalled it a few hours ago, thinking it might be some
> > > kind of environment problem.)
> > > 
> > > overall process shown in the logs:
> > > 
> > > *load a subset of documents, and confirm the views work
> > > 
> > > *load most of the remaining documents, views work
> > >  (This was done from the Futon client, running on another machine.
> > > It sees the connection time out, but the view index builds OK anyway and
> > > completes a few minutes after the client has given up.  When the
> > > client requests the view afterwards, it works fine, and fast now that
> > > the index is done.)
> > > 
> > > *upload another 18 documents (the largest ones, ranging from 10 MB to
> > > 22 MB); the view failed with "OS Process timed out."
> > > The log of everything described up to this point is included.
> > > 
> > 
> > 
> > 
> > The large documents are the problem.  The view process is taking too long to
> > process them and is timing out.  You can increase the timeout in the
> > configuration, which is accessible from Futon; it's under "couchdb" and
> > called "os_process_timeout".
> > 
> > Steve
> > 
> > > 
> > > This seems strange as it gave this error only now, when it took so
> > > long previously.  At any rate, I increased the os_process_timeout
> > > value to 10 minutes, and attempted it again, and it still timed out
> > > after only a few seconds.  (this is shown in the second log file,
> > > although it is essentially the same as the first.)
> > > 
> > > 
> > > the actual view functions are shown in the log, but for convenience they
> > > are:
> > > "indicator_summary": {
> > >            "map": "function(doc) {\n  if(doc.Data){\n    var temp =
> > > {};\n    temp.Name = doc.Name;\n    temp.Description =
> > > doc.Description;\n    temp.Sources = doc.Sources;\n    temp.SourceURL
> > > = doc.SourceURL;\n    temp.Years = doc.Years;\n    temp.National =
> > > doc.National;\n    temp.LocaleLevels = doc.LocaleLevels;\n
> > > temp.Demographics = doc.Demographics;\n    temp.Unit = doc.Unit;\n
> > > temp.UnitLabel = doc.UnitLabel;\n    temp.DataType = doc.DataType;\n
> > >  temp.Category = doc.Category;\n    temp.TopCorrelated =
> > > doc.TopCorrelated;\n    emit(doc.Name, temp);\n  }\n}"
> > >        },
> > >        "indicator_detail": {
> > >            "map": "function(doc) {\n  if(doc.Data && doc.Years){\n
> > > for(var i=0; i<doc.Years.length; i++){\n      for(var j=0;
> > > j<doc.LocaleLevels.length; j++){\n        var temp = {};\n
> > > temp.Name = doc.Name;\n        temp.Description = doc.Description;\n
> > >      temp.Sources = doc.Sources;\n        temp.SourceURL =
> > > doc.SourceURL;\n        /*for(var k=0; k<doc.National.length; k++){\n
> > >         if(doc.National[k][doc.Years[i]]){\n            temp.National
> > > = doc.National[k][doc.Years[i]];\n          }\n        }*/\n
> > > temp.Demographics = doc.Demographics;\n        temp.Unit = doc.Unit;\n
> > >        temp.UnitLabel = doc.UnitLabel;\n        temp.DataType =
> > > doc.DataType;\n        temp.Category = doc.Category;\n
> > > temp.Data = doc.Data;\n        temp.TopCorrelated =
> > > doc.TopCorrelated;\n        emit([doc.Name, doc.Years[i]], temp);\n
> > >   }\n    }\n  }\n}"
> > >        }
> > > 
> > > 
> > > Besides this, I've tried replicating to a second machine, and on that
> > > one adjusting several values, with no real progress: increased Erlang
> > > heartbeat timeout, increased Erlang heap size, increased SpiderMonkey
> > > stack size.  These all either made no difference, or caused other
> > > errors.  I admit I was kind of guessing when changing those, so it's
> > > entirely possible that I was completely on the wrong track with those.
> > >  At any rate, the logs I included (and the current state of that dev
> > > machine) are with everything set to its default values, except for that
> > > 10 minute os_process_timeout value I mentioned above.
> > > 
> > > Any help would be fantastic, as I'm completely out of ideas at this
> > > point.  I'd of course be glad to provide any additional info that
> > > might be useful to you.
> > > 
> > > Thanks!
> > >     Mike
> > > 
> > 
> > 
> 
> 
> 


