In-Reply-To: <46aeb24f0910051048g3c3f273aod1cd6bd838a0d644@mail.gmail.com>
Date: Tue, 6 Oct 2009 05:56:49 +1100
Message-ID: <55047b710910051156i4efcb060o7d9f84a05576355f@mail.gmail.com>
Subject: Re: couchdb and monit
From: Nicholas Orr
To: user@couchdb.apache.org
Content-Type: multipart/alternative; boundary=000e0cd56c321e17ee047534ad74
X-Virus-Checked: Checked by ClamAV on apache.org

--000e0cd56c321e17ee047534ad74
Content-Type: text/plain; charset=ISO-8859-1

I've changed mine to use -r 5 and to send an alert if it is not running.
As long as -r 5 does what it is supposed to do, everything will be OK.
If it fails, at least I'll know about it. This is where monit is useful:
no matter how smart/capable an Erlang app is "supposed" to be, I'd like to
know if it goes down :)

Nick

On Tue, Oct 6, 2009 at 4:48 AM, Robert Newson wrote:
> Understood. All I'm saying is that Erlang applications should already
> have rich support for process restarting, heartbeat/keep-alive.
>
> monit is a generic wrapper to add those things when they are absent. A
> correctly configured Erlang application shouldn't need monit, imo.
>
> B.
>
> On Mon, Oct 5, 2009 at 6:40 PM, Francisco Viramontes wrote:
> > I dunno, but I tried the respawn parameter for the couchdb command in Gentoo
> > and it did not work. Also, I have other services set up with monit, so it's
> > more convenient for me to have everything in one place.
> >
> > PAco
> > On Oct 5, 2009, at 12:22 PM, Robert Newson wrote:
> >
> >> Isn't couchdb (at least in the Debian package) monitored by heart?
> >>
> >> B.
> >>
> >> On Mon, Oct 5, 2009 at 6:05 PM, Nicholas Orr wrote:
> >>>
> >>> great!
> >>> i was wondering what to put for the "test" conditions.
> >>> Yours work well, so thanks to you as well ;)
> >>>
> >>> Nick
> >>>
> >>> On Tue, Oct 6, 2009 at 4:01 AM, Francisco Viramontes wrote:
> >>>
> >>>> Nicholas
> >>>>
> >>>> Thanks man, it worked. I had been banging my head for a week because of
> >>>> this.
> >>>>
> >>>> my final monit script is
> >>>>
> >>>> check process couchdb
> >>>> with pidfile /var/run/couchdb/couchdb.pid
> >>>> #start program = "/etc/init.d/couchdb start"
> >>>> #stop program = "/etc/init.d/couchdb stop"
> >>>> start program = "/usr/bin/sudo -u couchdb /usr/bin/couchdb -b -o /dev/null -e /dev/null -p /var/run/couchdb/couchdb.pid"
> >>>> stop program = "/usr/bin/sudo -u couchdb /usr/bin/couchdb -b -o /dev/null -e /dev/null -p /var/run/couchdb/couchdb.pid -d"
> >>>> if failed host 127.0.0.1 port 5984 then restart
> >>>> if failed url http://localhost:5984/ and content == '"couchdb"' then restart
> >>>> group couchdb
> >>>>
> >>>> PAco
> >>>>
> >>>>
> >>>> On Oct 5, 2009, at 2:45 AM, Nicholas Orr wrote:
> >>>>
> >>>>> My monit script is below, verbatim. monit runs as root and I want
> >>>>> couchdb to run as couchdb, so I do the following:
> >>>>>
> >>>>> check process couchdb with pidfile /var/run/couchdb/couchdb.pid
> >>>>> start program = "/usr/bin/sudo -u couchdb /usr/bin/couchdb -b -o /dev/null -e /dev/null -p /var/run/couchdb/couchdb.pid"
> >>>>> stop program = "/usr/bin/sudo -u couchdb /usr/bin/couchdb -b -o /dev/null -e /dev/null -p /var/run/couchdb/couchdb.pid -d"
> >>>>>
> >>>>> try that and see what happens...
> >>>>>
> >>>>> On Mon, Oct 5, 2009 at 7:49 AM, Francisco Viramontes <paco@freshout.us>
> >>>>> wrote:
> >>>>>
> >>>>>> Hey Guys
> >>>>>>
> >>>>>> has anyone tried to monitor couch with monit?
> >>>>>>
> >>>>>> I am using these settings and monit successfully monitors, but when
> >>>>>> couchdb dies it fails to restart the service and I can't find out why.
> >>>>>>
> >>>>>> here is my couchdb.monitrc file:
> >>>>>>
> >>>>>> check process couchdb
> >>>>>> with pidfile /var/run/couchdb/couchdb.pid
> >>>>>> start program = "/etc/init.d/couchdb start"
> >>>>>> stop program = "/etc/init.d/couchdb stop"
> >>>>>> if failed host 127.0.0.1 port 5984 then restart
> >>>>>> if failed url http://localhost:5984/ and content == '"couchdb"' then restart
> >>>>>> group couchdb
> >>>>>>
> >>>>>> BTW I am using couch 0.9.1 and about once a day it dies on me. The only
> >>>>>> thing I get from the log are strange Erlang error messages saying OS
> >>>>>> process timeout. Anyone know what that's about?
> >>>>>>
> >>>>>> PAco

--000e0cd56c321e17ee047534ad74--
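
For reference, the change described at the top of this message (adding -r 5 and alerting instead of restarting) might look roughly like the sketch below. It is not Nick's actual file. It assumes the same paths as the configs quoted in the thread, that -r 5 sits alongside the other couchdb options in the start command, and that alert delivery (for example a "set alert" address) is already configured elsewhere in monitrc.

```
check process couchdb with pidfile /var/run/couchdb/couchdb.pid
  # -r 5: ask couchdb to respawn itself 5 seconds after it dies (assumed placement)
  start program = "/usr/bin/sudo -u couchdb /usr/bin/couchdb -b -r 5 -o /dev/null -e /dev/null -p /var/run/couchdb/couchdb.pid"
  stop program = "/usr/bin/sudo -u couchdb /usr/bin/couchdb -b -o /dev/null -e /dev/null -p /var/run/couchdb/couchdb.pid -d"
  # alert only; the restart itself is left to couchdb's own respawn
  if failed host 127.0.0.1 port 5984 then alert
  group couchdb
```

One thing to keep in mind with a layout like this: as far as I know, monit still runs the start program by default when the pidfile's process is missing entirely, so the "then alert" line only changes what happens when the port check fails.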