couchdb-dev mailing list archives

From "Couch User (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (COUCHDB-2545) couchdb fails intermittently on startup on fresh system (reproducible)
Date Fri, 16 Jan 2015 14:07:34 GMT

     [ https://issues.apache.org/jira/browse/COUCHDB-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Couch User updated COUCHDB-2545:
--------------------------------
    Description: 
Using CouchDB 1.5.0 on Ubuntu 14.04.
The issue can be reproduced fairly reliably with the following sequence.

Create a custom override configuration as follows (couchdb.ini):
{code}
[couchdb]
database_dir = .
view_index_dir = .
uri_file = couchdb.uri

[httpd]
port = 5984
bind_address = 127.0.0.1

[log]
level = debug
file = couchdb.log
{code}

Run the following script in a loop as a non-root (unprivileged) user. Note that this
requires permission to /etc/couchdb/local.d.
It normally takes fewer than 5 runs of the script to fail.

{code}
#!/bin/sh
# -- start script --
# clean files from previous run (if any)
rm -f _replicator.couch _users.couch couchdb.log couchdb.pid couchdb.stderr couchdb.stdout >/dev/null 2>&1 || true

# start couch in cwd (as per couchdb.ini)
couchdb -b -a couchdb.ini -p couchdb.pid

# wait for couchdb to produce & populate its pidfile
# usable implementation of wait_pidfile at the end of this issue
wait_pidfile couchdb.pid

# basic check that couchdb.pid is running
# this is where we'll fail because couchdb will frequently crash after writing the pidfile.
kill -0 "$(cat couchdb.pid)" || exit 1

# stop couchdb if it didn't fail this time.
couchdb -d -a couchdb.ini -p couchdb.pid

# --end script--
{code}
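
The "run it in a loop" step can be sketched roughly as follows. This is only an illustration: `run_until_fail` and the file name `./repro.sh` (the script above saved to disk) are hypothetical names, not part of the original report.

```shell
#!/bin/sh
# run_until_fail MAX CMD [ARGS...]
# Runs CMD up to MAX times and stops at the first non-zero exit status.
# Prints the number of the failing run and returns 1, or returns 0
# (printing nothing) if every run succeeded.
run_until_fail() {
    _max=$1
    shift
    _i=1
    while [ "$_i" -le "$_max" ]; do
        if ! "$@"; then
            echo "$_i"
            return 1
        fi
        _i=$((_i + 1))
    done
    return 0
}

# Hypothetical usage, assuming the reproduction script is ./repro.sh:
# run_until_fail 10 ./repro.sh || echo "failed on the run number printed above"
```

Stopping at the first failure keeps couchdb.log from the failing run intact, since the next iteration would otherwise delete it.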

Here's the Erlang backtrace I end up with when it does terminate.  If I had to speculate, I'd
guess that compaction was running on the database while it was still being initialized.  There
doesn't seem to be a real need to run compaction during initialization, since the database
should in theory be empty anyway...

Further, if I disable the compaction daemon in /etc/couchdb/default.ini by removing its
entry from [daemons], couchdb no longer fails.
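
For anyone who wants to try the same workaround without hand-editing, here is a minimal sketch that comments out one key in an ini file. It assumes the daemon is registered on a line of the form `compaction_daemon = ...`; that key name is an assumption, so check your own default.ini first. `comment_out_key` is a hypothetical helper, and it writes the result to stdout rather than touching the original file.

```shell
# comment_out_key FILE KEY
# Prints FILE with every "KEY=..." or "KEY = ..." line prefixed by ";"
# (the ini comment character). The original file is not modified.
# Note: this matches KEY in any section, not only the daemons section.
comment_out_key() {
    sed "s/^\($2[[:space:]]*=\)/;\1/" "$1"
}

# Hypothetical usage (the key name is an assumption):
# comment_out_key /etc/couchdb/default.ini compaction_daemon > default.ini.nocompact
```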

{code}
[Thu, 15 Jan 2015 22:36:56 GMT] [error] [<0.151.0>] ** Generic server couch_compaction_daemon terminating
** Last message in was {'EXIT',<0.152.0>,
                           {function_clause,
                               [{filename,join,
                                    [[]],
                                    [{file,"filename.erl"},{line,392}]},
                                {couch_server,all_databases,2,
                                    [{file,"couch_server.erl"},{line,203}]},
                                {couch_compaction_daemon,compact_loop,1,
                                    [{file,"couch_compaction_daemon.erl"},
                                     {line,101}]}]}}
** When Server state == {state,<0.152.0>}
** Reason for termination ==
** {compaction_loop_died,
       {function_clause,
           [{filename,join,[[]],[{file,"filename.erl"},{line,392}]},
            {couch_server,all_databases,2,
                [{file,"couch_server.erl"},{line,203}]},
            {couch_compaction_daemon,compact_loop,1,
                [{file,"couch_compaction_daemon.erl"},{line,101}]}]}}

[Thu, 15 Jan 2015 22:36:56 GMT] [error] [<0.151.0>] {error_report,<0.31.0>,
                     {<0.151.0>,crash_report,
                      [[{initial_call,
                         {couch_compaction_daemon,init,['Argument__1']}},
                        {pid,<0.151.0>},
                        {registered_name,couch_compaction_daemon},
                        {error_info,
                         {exit,
                          {compaction_loop_died,
                           {function_clause,
                            [{filename,join,
                              [[]],
                              [{file,"filename.erl"},{line,392}]},
                             {couch_server,all_databases,2,
                              [{file,"couch_server.erl"},{line,203}]},
                             {couch_compaction_daemon,compact_loop,1,
                              [{file,"couch_compaction_daemon.erl"},
                               {line,101}]}]}},
                          [{gen_server,terminate,6,
                            [{file,"gen_server.erl"},{line,744}]},
                           {proc_lib,init_p_do_apply,3,
                            [{file,"proc_lib.erl"},{line,239}]}]}},
                        {ancestors,
                         [couch_secondary_services,couch_server_sup,<0.32.0>]},
                        {messages,[]},
                        {links,[<0.94.0>]},
                        {dictionary,[]},
                        {trap_exit,true},
                        {status,running},
                        {heap_size,610},
                        {stack_size,27},
                        {reductions,513}],
                       []]}}
{code}
-- Truncated. If the entire backtrace is required, let me know and I'll send the couchdb.log by
mail or other means.
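
To tell this particular crash apart from other startup failures, the reproduction script could also look for the `compaction_loop_died` marker that appears in the backtrace. A small sketch, where `log_has_compaction_crash` is a hypothetical helper name:

```shell
# log_has_compaction_crash LOGFILE
# Succeeds if LOGFILE contains the compaction daemon crash signature
# seen in the backtrace; fails if it does not or the file is unreadable.
log_has_compaction_crash() {
    grep -q 'compaction_loop_died' "$1" 2>/dev/null
}

# Hypothetical usage after a failed run:
# log_has_compaction_crash couchdb.log && echo "compaction daemon crash"
```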

# As a side note, couchdb -s doesn't seem to pay attention to -p, so don't try to use it.

{code}
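wait_pidfile() {
        # poll up to 5 times, one second apart, until the pidfile has content
        _interval=5
        pid=
        while [ "$_interval" -gt 0 ]; do
                pid=$(cat "$1" 2>/dev/null)
                if [ -z "$pid" ]; then
                        _interval=$((_interval - 1))
                        sleep 1
                else
                        _interval=0
                fi
        done
        # quoted so that an empty pid fails the test instead of passing it
        [ -n "$pid" ] && printf '%s' "$pid"
}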
{code}
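
Because the crash happens shortly after the pidfile is written, a single `kill -0` right after `wait_pidfile` can race with it and miss the failure. One possible way to narrow that window is to poll the process for a few seconds. This is a sketch only: `stays_alive` is a hypothetical helper and the 3-second window in the usage line is an arbitrary choice.

```shell
# stays_alive PID SECONDS
# Checks PID once per second for SECONDS seconds.
# Succeeds only if the process is still running at every check.
stays_alive() {
    _pid=$1
    _left=$2
    while [ "$_left" -gt 0 ]; do
        kill -0 "$_pid" 2>/dev/null || return 1
        sleep 1
        _left=$((_left - 1))
    done
    kill -0 "$_pid" 2>/dev/null
}

# Hypothetical usage in place of the single kill -0 check:
# stays_alive "$(cat couchdb.pid)" 3 || exit 1
```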

thanks for looking into this



> couchdb fails intermittently on startup on fresh system (reproducible) 
> -----------------------------------------------------------------------
>
>                 Key: COUCHDB-2545
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2545
>             Project: CouchDB
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>            Reporter: Couch User



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
