couchdb-dev mailing list archives

From "Wendall Cada (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1449) Couchdb returns stopped status before process exits
Date Wed, 28 Mar 2012 09:43:29 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240323#comment-13240323 ]

Wendall Cada commented on COUCHDB-1449:
---------------------------------------

See: use-sname-rpc-not-kill.patch

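Here is what I figured out while testing. The whole concept of using a PID file and
kill -1 $PID with Erlang is just not going to work consistently.

Here is a way to reproduce what sometimes happens when a restart (stop/start) is issued and
beam hasn't stopped yet. In the output below, the PID file ends up holding 10229, which does
not match the beam.smp process that is actually running (10193).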

For example, try: couchdb -b && couchdb -d && couchdb -b
Apache CouchDB has started, time to relax.
Apache CouchDB is not running.
Apache CouchDB has started, time to relax.
$ echo `cat /var/run/couchdb/couchdb.pid`
10229
$ ps -A | grep beam.smp
10193 pts/2    00:00:00 beam.smp

However, adding -sname couchdb to the command options results in the second start failing
silently, but couchdb does stop. A stale PID is left in the PID file from the second start command.

I then modified start_couchdb so that it actually checks whether the process ID returned from
the erl command is running, then waits 2 seconds so the PID file can hit the disk. I also modified
stop_couchdb to eliminate the use of kill -1 and to wait for the process to actually exit. Now
everything works as intended, no matter what bizarre scenario is encountered.
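
A minimal sketch of the wait-for-exit idea described above, assuming a POSIX shell. The helper
names is_running and wait_for_exit and the default 10-second timeout are hypothetical and are
not taken from the patch. kill -0 sends no signal; it only checks that the PID exists.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the code from the attached patch.

# Return 0 if a process with the given PID exists.
# kill -0 delivers no signal, it only tests whether the PID can be signalled.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Poll once per second until the PID is gone or the timeout (seconds) expires.
# Returns 0 once the process has exited, 1 if it is still running at timeout.
wait_for_exit() {
    pid=$1
    timeout=${2:-10}   # assumed default, chosen only for this sketch
    elapsed=0
    while is_running "$pid"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A stop routine could call something like wait_for_exit "$(cat /var/run/couchdb/couchdb.pid)" 30
and only report "stopped" when it returns 0, so that a following start never races a beam
process that is still shutting down.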

So for just pure stupid, I can do this: 
for i in {1..5} ; do couchdb -d; couchdb -b ; done
The last command is a start and sure enough, couchdb is running and has restarted completely
five times.
Same stupid in reverse:
for i in {1..5} ; do couchdb -b; couchdb -d ; done
CouchDB is stopped.

Now clearly there is going to be an issue with the use of sname when multiple couchdb instances
are up and running, but I think it will be worthwhile to fix. Every single resource I have read,
and my own experience with Erlang, says that using kill to shut down is just asking for problems.

I've temporarily appended the pid to start and stop messages for clarity on what's happening.

> Couchdb returns stopped status before process exits
> ---------------------------------------------------
>
>                 Key: COUCHDB-1449
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1449
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.0.3, 1.1.1, 1.2, 1.3
>         Environment: *NIX
>            Reporter: Wendall Cada
>              Labels: patch
>             Fix For: 1.0.4, 1.2.1, 1.1.2
>
>         Attachments: couchdb-0007-wait-for-couch-stop.patch, couchdb-0007-wait-for-couch-stop.patch
>
>
> When restarting couchdb via init script, couchdb returns success status before the process
> has exited. When a start is issued before the process ends, couchdb fails to start, but returns
> success.


        
