couchdb-dev mailing list archives

From "James Marca (JIRA)" <j...@apache.org>
Subject [jira] Created: (COUCHDB-575) CouchDB crashes and restarts when multiple databases are being compacted at once
Date Mon, 23 Nov 2009 23:16:39 GMT
CouchDB crashes and restarts when multiple databases are being compacted at once
--------------------------------------------------------------------------------

                 Key: COUCHDB-575
                 URL: https://issues.apache.org/jira/browse/COUCHDB-575
             Project: CouchDB
          Issue Type: Bug
    Affects Versions: 0.11
         Environment: {"couchdb":"Welcome","version":"0.11.0b1e2a54d1-git"}
Gentoo 10.1, rebuilt as of Nov 12, 64-bit, linux kernel version 2.6.30
Erlang built from source file otp_src_R13B02-1.tar.gz (Gentoo calls it 13.2.2)
            Reporter: James Marca
            Priority: Minor


When I run compaction on multiple databases at once, CouchDB crashes and restarts.

This happens with both view compaction and database compaction.

When compacting DBs, I had definite crashes when compacting just two databases.  When compacting
views, I haven't yet seen a crash with just two views running, but I have with as few as 5 views
being compacted.
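
For anyone trying to reproduce this, the following is a minimal sketch of starting several view compactions at the same moment. The URL shape (POST to /<db>/_compact/<design doc name>) and the database and design document names come from the log below. The base address, the helper names (compaction_url, start_compaction, compact_all), and the Content-Type header are assumptions made for illustration and are not part of the original report.

```python
import concurrent.futures
import urllib.request

# Assumption: CouchDB is listening on its default local address.
BASE_URL = "http://127.0.0.1:5984"

# (database, design doc name) pairs taken from the task names in the log.
TARGETS = [
    ("d12_2007_02morehash", "summary2"),
    ("d12_2007_02morehash", "summary3"),
    ("d12_2007_04morehash", "summary3"),
    ("d12_2007_05morehash", "summary3"),
    ("d12_2007_06morehash", "summary"),
]


def compaction_url(base_url, db, design_doc=None):
    """Build the compaction URL for a database, or for one design doc's views."""
    url = f"{base_url.rstrip('/')}/{db}/_compact"
    if design_doc is not None:
        url += f"/{design_doc}"
    return url


def start_compaction(url):
    """POST to a compaction URL and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=b"",
        method="POST",
        # Assumption: harmless on old servers, required by newer ones.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def compact_all(targets, base_url=BASE_URL):
    """Fire all compaction requests concurrently; return {url: status}."""
    urls = [compaction_url(base_url, db, ddoc) for db, ddoc in targets]
    with concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        return dict(zip(urls, pool.map(start_compaction, urls)))
```

Calling compact_all(TARGETS) against a running server should return a 202 for each URL if the requests are accepted, as in the log entry for d12_2007_06morehash. Passing None as the design doc name targets the database itself rather than a view group.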

My DBs and views are large, but not unreasonable.  The databases run around 23G each (after
compaction).  The views are similarly sized:

james@lysithia ~ $ ls -lrth /var/lib/couchdb/.d12_2007_02morehash_design/ 
total 74G
-rw-r--r-- 1 couchdb couchdb  25G 2009-11-18 11:29 172235a8f385d7dc0e0818e5d003aad2.view
-rw-r--r-- 1 couchdb couchdb  13G 2009-11-19 19:30 2657851bc558aef6b89d05361193fc5e.view
-rw-r--r-- 1 couchdb couchdb  44M 2009-11-23 14:18 172235a8f385d7dc0e0818e5d003aad2.compact.view
-rw-r--r-- 1 couchdb couchdb 3.9M 2009-11-23 14:18 2657851bc558aef6b89d05361193fc5e.compact.view
-rw-r--r-- 1 couchdb couchdb  37G 2009-11-23 14:34 433c5bd5313e5509b96b67a6eb3d1145.view
-rw-r--r-- 1 couchdb couchdb  57M 2009-11-23 14:43 433c5bd5313e5509b96b67a6eb3d1145.compact.view

[Mon, 23 Nov 2009 23:07:46 GMT] [debug] [<0.92.0>] Spawning new group server for view
group _design/summary in database d12_2007_06morehash.

[Mon, 23 Nov 2009 23:07:46 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3:
Copied 20000 of 909614 Ids (2%)

[Mon, 23 Nov 2009 23:07:48 GMT] [info] [<0.479.0>] 127.0.0.1 - - 'POST' /d12_2007_06morehash/_compact/summary
202

[Mon, 23 Nov 2009 23:07:48 GMT] [info] [<0.520.0>] View index compaction starting for
d12_2007_06morehash _design/summary

[Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.520.0>] Resetting group index "_design/summary"
in db d12_2007_06morehash

[Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash
_design/summary: Processed 0 of 23 changes (0%)

[Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash/summary:
Copied 0 of 901450 Ids (0%)

[Mon, 23 Nov 2009 23:07:49 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash
_design/summary: Processed 12 of 23 changes (52%)

[Mon, 23 Nov 2009 23:07:50 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash
_design/summary: Processed 18 of 23 changes (78%)

[Mon, 23 Nov 2009 23:07:50 GMT] [debug] [<0.80.0>] New task status for d12_2007_06morehash
_design/summary: Finishing.

[Mon, 23 Nov 2009 23:07:50 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3:
Copied 30000 of 909614 Ids (3%)

[Mon, 23 Nov 2009 23:07:52 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3:
Copied 40000 of 909614 Ids (4%)

[Mon, 23 Nov 2009 23:07:53 GMT] [debug] [<0.80.0>] New task status for d12_2007_02morehash/summary3:
Copied 110000 of 874175 Ids (12%)

[Mon, 23 Nov 2009 23:07:53 GMT] [debug] [<0.80.0>] New task status for d12_2007_05morehash/summary3:
Copied 50000 of 909614 Ids (5%)

[Mon, 23 Nov 2009 23:07:54 GMT] [info] [<0.520.0>] checkpointing view update at seq
1156739 for d12_2007_06morehash _design/summary

[Mon, 23 Nov 2009 23:08:05 GMT] [debug] [<0.80.0>] New task status for d12_2007_04morehash/summary3:
Copied 90000 of 983989 Ids (9%)

[Mon, 23 Nov 2009 23:08:05 GMT] [debug] [<0.80.0>] New task status for d12_2007_02morehash/summary2:
Copied 50000 of 874175 Ids (5%)

[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.80.0>] ** Generic server couch_task_status
terminating 
** Last message in was {#Ref<0.0.3.46775>,1}
** When Server state == nil
** Reason for termination == 
** {function_clause,
       [{couch_task_status,handle_info,[{#Ref<0.0.3.46775>,1},nil]},
        {gen_server,handle_msg,5},
        {proc_lib,init_p_do_apply,3}]}

[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.80.0>] {error_report,<0.29.0>,
    {<0.80.0>,crash_report,
     [[{initial_call,{couch_task_status,init,['Argument__1']}},
       {pid,<0.80.0>},
       {registered_name,couch_task_status},
       {error_info,
           {exit,
               {function_clause,
                   [{couch_task_status,handle_info,
                        [{#Ref<0.0.3.46775>,1},nil]},
                    {gen_server,handle_msg,5},
                    {proc_lib,init_p_do_apply,3}]},
               [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
       {ancestors,[couch_primary_services,couch_server_sup,<0.30.0>]},
       {messages,[]},
       {links,[<0.76.0>]},
       {dictionary,[]},
       {trap_exit,false},
       {status,running},
       {heap_size,1597},
       {stack_size,24},
       {reductions,3059}],
      []]}}

[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.76.0>] {error_report,<0.29.0>,
    {<0.76.0>,supervisor_report,
     [{supervisor,{local,couch_primary_services}},
      {errorContext,child_terminated},
      {reason,
          {function_clause,
              [{couch_task_status,handle_info,[{#Ref<0.0.3.46775>,1},nil]},
               {gen_server,handle_msg,5},
               {proc_lib,init_p_do_apply,3}]}},
      {offender,
          [{pid,<0.80.0>},
           {name,couch_task_status},
           {mfa,{couch_task_status,start_link,[]}},
           {restart_type,permanent},
           {shutdown,brutal_kill},
           {child_type,worker}]}]}}

[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.19.0>] {error_report,<0.7.0>,
              {<0.19.0>,std_error,
               "File operation error: eacces. Target: ./lib.beam. Function: get_file. Process: code_server."}}

[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.19.0>] {error_report,<0.7.0>,
              {<0.19.0>,std_error,
               "File operation error: eacces. Target: ./erl_internal.beam. Function: get_file. Process: code_server."}}

[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.553.0>] ** Generic server couch_task_status
terminating 
** Last message in was {'$gen_cast',
                           {update_status,<0.494.0>,
                               <<"Copied 60000 of 909614 Ids (6%)">>}}
** When Server state == nil
** Reason for termination == 
** {{badmatch,[]},
    [{couch_task_status,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}


[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.553.0>] {error_report,<0.29.0>,
              {<0.553.0>,crash_report,
               [[{initial_call,{couch_task_status,init,['Argument__1']}},
                 {pid,<0.553.0>},
                 {registered_name,couch_task_status},
                 {error_info,{exit,{{badmatch,[]},
                                    [{couch_task_status,handle_cast,2},
                                     {gen_server,handle_msg,5},
                                     {proc_lib,init_p_do_apply,3}]},
                                   [{gen_server,terminate,6},
                                    {proc_lib,init_p_do_apply,3}]}},
                 {ancestors,[couch_primary_services,couch_server_sup,
                             <0.30.0>]},
                 {messages,[]},
                 {links,[<0.76.0>]},
                 {dictionary,[]},
                 {trap_exit,false},
                 {status,running},
                 {heap_size,377},
                 {stack_size,24},
                 {reductions,127}],
                []]}}

[Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.76.0>] {error_report,<0.29.0>,
              {<0.76.0>,supervisor_report,
               [{supervisor,{local,couch_primary_services}},
                {errorContext,child_terminated},
                {reason,{{badmatch,[]},
                         [{couch_task_status,handle_cast,2},
                          {gen_server,handle_msg,5},
                          {proc_lib,init_p_do_apply,3}]}},
                {offender,[{pid,<0.553.0>},
                           {name,couch_task_status},
                           {mfa,{couch_task_status,start_link,[]}},
                           {restart_type,permanent},
                           {shutdown,brutal_kill},
                           {child_type,worker}]}]}}

[Mon, 23 Nov 2009 23:08:07 GMT] [error] [<0.555.0>] ** Generic server couch_task_status
terminating 
** Last message in was {'$gen_cast',
                           {update_status,<0.458.0>,
                               <<"Copied 100000 of 983989 Ids (10%)">>}}
** When Server state == nil
** Reason for termination == 
** {{badmatch,[]},
    [{couch_task_status,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}

And so on.  I can post more, but I don't know what I'm looking at or what would be helpful.
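
One way to condense a long log like the one above is to count which gen_server processes report termination. The sketch below assumes the log line layout seen in this report ([timestamp] [level] [pid] message). The function name crashed_servers and both regular expressions are illustrative and are not part of CouchDB.

```python
import re
from collections import Counter

# Matches lines such as:
# [Mon, 23 Nov 2009 23:08:05 GMT] [error] [<0.80.0>] ** Generic server ...
ENTRY_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[(?P<level>\w+)\] "
    r"\[(?P<pid><[\d.]+>)\] (?P<message>.*)$"
)
# Extracts the registered name from a gen_server termination report.
CRASH_RE = re.compile(r"\*\* Generic server (?P<server>\S+)")


def crashed_servers(lines):
    """Count gen_server termination reports per server name."""
    counts = Counter()
    for line in lines:
        entry = ENTRY_RE.match(line.rstrip("\n"))
        if entry is None or entry.group("level") != "error":
            continue  # continuation lines and non-error entries are skipped
        crash = CRASH_RE.search(entry.group("message"))
        if crash:
            counts[crash.group("server")] += 1
    return counts
```

Run over the excerpt above, this would report three terminations, all for couch_task_status, which narrows the report to a single process that fails and is then restarted by its supervisor.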





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

