From: Jonathan Johnson
Date: Wed, 1 Dec 2010 14:10:40 -0600
Subject: Re: Too many open files
To: user@couchdb.apache.org

Thanks for the explanation :)

Well, all of this is now making sense a little
better now. With Paul's tip on what sort of requirements of open files I
need, it's clear that now the time has come to refactor a bit of my code
to make it more infrequent that we open up all of the databases.

For now, I can change the init script to properly set the limits, and
that will be a bandaid until I can properly test that refactoring :)

Thanks for your help, everyone! I went off to lunch, and then thought to
myself, "I wonder if I have any responses." Lo and behold, I did :)

-Jon

On Wed, Dec 1, 2010 at 2:00 PM, Robert Newson wrote:
> You aren't launching couchdb with anything that supports PAM.
>
> look in /etc/pam.d for a list of services that will honor limits.conf.
>
> On my system (Debian), /etc/pam.d/su does not honor limits.conf by
> default. Even if you enable it, the couchdb startup script doesn't use
> su anyway, so it still doesn't help.
>
> shorter version: PAM and limits.conf is for interactive users, not daemons.
>
> B.
>
> On Wed, Dec 1, 2010 at 7:55 PM, Jonathan Johnson wrote:
>> Ah, you're absolutely right -- it didn't work. I'm still at 1024
>> files. Well, that answers part of the question. If all else fails, I
>> could use your method by updating the init.d script a little.
>>
>> Does anyone have any ideas as to why the limits.conf doesn't work? I
>> know my way around setting up a system, but this level of
>> configuration is currently a little above my head :)
>>
>> -Jon
>>
>>
>> On Wed, Dec 1, 2010 at 12:29 PM, Robert Newson wrote:
>>> look in /proc/
>>> doubt it does.
>>>
>>> The way I increase fd limits from the miserly Linux default of 1024 is
>>> with this run script, where couchdb is launched by runit;
>>>
>>> #!/bin/bash
>>> exec 2>&1
>>> export HOME=
>>> ulimit -n 10000
>>> exec chpst -u couchdb -f
>>>
>>> B.
>>>
>>>
>>>
>>>
>>> On Wed, Dec 1, 2010 at 6:21 PM, Jonathan Johnson wrote:
>>>> Our couch setup has around 100 databases with a significant number of
>>>> views in each database.
>>>> Every once in a while, couch takes a dive. I
>>>> happened to be around this time, and saw this in the logs:
>>>>
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.102.0>] {error_report,<0.31.0>,
>>>>    {<0.102.0>,std_error,
>>>>     {mochiweb_socket_server,225,{acceptor_error,{error,accept_failed}}}}}
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.10711.1125>] {error_report,<0.31.0>,
>>>>              {<0.10711.1125>,std_error,
>>>>               [{application,mochiweb},
>>>>                "Accept failed error","{error,emfile}"]}}
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.10711.1125>] {error_report,<0.31.0>,
>>>>    {<0.10711.1125>,crash_report,
>>>>     [[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
>>>>       {pid,<0.10711.1125>},
>>>>       {registered_name,[]},
>>>>       {error_info,
>>>>           {exit,
>>>>               {error,accept_failed},
>>>>               [{mochiweb_socket_server,acceptor_loop,1},
>>>>                {proc_lib,init_p_do_apply,3}]}},
>>>>       {ancestors,
>>>>           [couch_httpd,couch_secondary_services,couch_server_sup,<0.32.0>]},
>>>>       {messages,[]},
>>>>       {links,[<0.102.0>]},
>>>>       {dictionary,[]},
>>>>       {trap_exit,false},
>>>>       {status,running},
>>>>       {heap_size,233},
>>>>       {stack_size,24},
>>>>       {reductions,202}],
>>>>      []]}}
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.102.0>] {error_report,<0.31.0>,
>>>>    {<0.102.0>,std_error,
>>>>     {mochiweb_socket_server,225,{acceptor_error,{error,accept_failed}}}}}
>>>>
>>>> I had run into an open files limit before, and had adjusted a few
>>>> settings.
>>>> Here are some of the config values I think are relevant:
>>>>
>>>> max_dbs_open = 100
>>>> max_connections = 2048
>>>>
>>>> From /etc/security/limits.conf
>>>> couchdb         hard    nofile  4096
>>>> couchdb         soft    nofile  4096
>>>>
>>>> The installed version is 1.0.1.
>>>>
>>>> I'm not sure how to debug this issue further. It only happens after
>>>> several days of usage, and once it happens, I can't even ask for the
>>>> stats page to see what the current numbers are :)
>>>>
>>>> Thanks in advance for any help!
>>>> -Jon
>>>>
>>>
>>
>
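
P.S. For anyone finding this thread later: a minimal sketch of the "set the limit in the init script" stopgap discussed above, assuming bash on Linux. The helper name raise_nofile is hypothetical and the value 10000 is simply the one from Robert's run script. It is not the actual CouchDB init script. It raises the soft open-files limit for the current shell, capped at the hard limit, so anything launched from that shell afterwards inherits it.

```shell
#!/bin/bash
# Hypothetical helper (not part of CouchDB): raise the soft open-files
# limit of this shell as far as the hard limit allows.
raise_nofile() {
    local want=$1
    local hard
    hard=$(ulimit -Hn)
    # An unprivileged process may only raise its soft limit up to the hard limit.
    if [ "$hard" != "unlimited" ] && [ "$want" -gt "$hard" ]; then
        want=$hard
    fi
    ulimit -Sn "$want"
}

raise_nofile 10000
echo "soft open-files limit: $(ulimit -Sn)"

# On Linux, the kernel's view of the limits is also readable under /proc.
# Child processes (grep here) inherit the limit that was just set.
if [ -r /proc/self/limits ]; then
    grep 'Max open files' /proc/self/limits
fi
```

In an init or run script, the call would go before the line that starts the daemon, since limits are inherited by child processes. If it runs as root, the hard limit can be raised as well with ulimit -Hn.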