incubator-couchdb-dev mailing list archives

From "Robert Newson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-597) Replication tasks crash.
Date Tue, 15 Dec 2009 14:37:18 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790760#action_12790760 ]

Robert Newson commented on COUCHDB-597:
---------------------------------------

Replication tasks fail even when executed serially, as long as the databases are large enough
(1.3 GB in this case). The fourth replication task has crashed.

Stack traces from the end of my log while a replication task is hung/crashed:

[Tue, 15 Dec 2009 07:08:44 GMT] [error] [<0.49.0>] ** Generic server couch_task_status terminating
** Last message in was {#Ref<0.0.1832.61391>,3}
** When Server state == nil
** Reason for termination == 
** {function_clause,
       [{couch_task_status,handle_info,[{#Ref<0.0.1832.61391>,3},nil]},
        {gen_server,handle_msg,5},
        {proc_lib,init_p_do_apply,3}]}

[Tue, 15 Dec 2009 07:08:44 GMT] [error] [<0.45.0>] {error_report,<0.23.0>,
    {<0.45.0>,supervisor_report,
     [{supervisor,{local,couch_primary_services}},
      {errorContext,child_terminated},
      {reason,
          {function_clause,
              [{couch_task_status,handle_info,[{#Ref<0.0.1832.61391>,3},nil]},
               {gen_server,handle_msg,5},
               {proc_lib,init_p_do_apply,3}]}},
      {offender,
          [{pid,<0.49.0>},
           {name,couch_task_status},
           {mfa,{couch_task_status,start_link,[]}},
           {restart_type,permanent},
           {shutdown,brutal_kill},
           {child_type,worker}]}]}}

[Tue, 15 Dec 2009 07:08:51 GMT] [error] [<0.2720.204>] {error_report,<0.23.0>,
              {<0.2720.204>,crash_report,
               [[{initial_call,{couch_task_status,init,['Argument__1']}},
                 {pid,<0.2720.204>},
                 {registered_name,couch_task_status},
                 {error_info,{exit,{{badmatch,[]},
                                    [{couch_task_status,handle_cast,2},
                                     {gen_server,handle_msg,5},
                                     {proc_lib,init_p_do_apply,3}]},
                                   [{gen_server,terminate,6},
                                    {proc_lib,init_p_do_apply,3}]}},
                 {ancestors,[couch_primary_services,couch_server_sup,<0.1.0>]},
                 {messages,[]},
                 {links,[<0.45.0>]},
                 {dictionary,[]},
                 {trap_exit,false},
                 {status,running},
                 {heap_size,377},
                 {stack_size,24},
                 {reductions,127}],
                []]}}

[Tue, 15 Dec 2009 07:08:51 GMT] [error] [<0.45.0>] {error_report,<0.23.0>,
              {<0.45.0>,supervisor_report,
               [{supervisor,{local,couch_primary_services}},
                {errorContext,child_terminated},
                {reason,{{badmatch,[]},
                         [{couch_task_status,handle_cast,2},
                          {gen_server,handle_msg,5},
                          {proc_lib,init_p_do_apply,3}]}},
                {offender,[{pid,<0.2720.204>},
                           {name,couch_task_status},
                           {mfa,{couch_task_status,start_link,[]}},
                           {restart_type,permanent},
                           {shutdown,brutal_kill},
                           {child_type,worker}]}]}}

[Tue, 15 Dec 2009 07:08:57 GMT] [error] [<0.4889.204>] ** Generic server couch_task_status terminating
** Last message in was {'$gen_cast',
                           {update_status,<0.9558.169>,
                               <<"Copied 146001 of 271595 changes (53%)">>}}
** When Server state == nil
** Reason for termination == 
** {{badmatch,[]},
    [{couch_task_status,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}


[Tue, 15 Dec 2009 07:08:57 GMT] [error] [<0.4889.204>] {error_report,<0.23.0>,
              {<0.4889.204>,crash_report,
               [[{initial_call,{couch_task_status,init,['Argument__1']}},
                 {pid,<0.4889.204>},
                 {registered_name,couch_task_status},
                 {error_info,{exit,{{badmatch,[]},
                                    [{couch_task_status,handle_cast,2},
                                     {gen_server,handle_msg,5},
                                     {proc_lib,init_p_do_apply,3}]},
                                   [{gen_server,terminate,6},
                                    {proc_lib,init_p_do_apply,3}]}},
                 {ancestors,[couch_primary_services,couch_server_sup,<0.1.0>]},
                 {messages,[]},
                 {links,[<0.45.0>]},
                 {dictionary,[]},
                 {trap_exit,false},
                 {status,running},
                 {heap_size,377},
                 {stack_size,24},
                 {reductions,127}],
                []]}}

[Tue, 15 Dec 2009 07:08:57 GMT] [error] [<0.45.0>] {error_report,<0.23.0>,
              {<0.45.0>,supervisor_report,
               [{supervisor,{local,couch_primary_services}},
                {errorContext,child_terminated},
                {reason,{{badmatch,[]},
                         [{couch_task_status,handle_cast,2},
                          {gen_server,handle_msg,5},
                          {proc_lib,init_p_do_apply,3}]}},
                {offender,[{pid,<0.4889.204>},
                           {name,couch_task_status},
                           {mfa,{couch_task_status,start_link,[]}},
                           {restart_type,permanent},
                           {shutdown,brutal_kill},
                           {child_type,worker}]}]}}

[Tue, 15 Dec 2009 07:09:02 GMT] [error] [<0.45.0>] {error_report,<0.23.0>,
              {<0.45.0>,supervisor_report,
               [{supervisor,{local,couch_primary_services}},
                {errorContext,shutdown},
                {reason,reached_max_restart_intensity},
                {offender,[{pid,<0.6117.204>},
                           {name,couch_task_status},
                           {mfa,{couch_task_status,start_link,[]}},
                           {restart_type,permanent},
                           {shutdown,brutal_kill},
                           {child_type,worker}]}]}}

[Tue, 15 Dec 2009 07:09:02 GMT] [error] [<0.60.0>] Exit on non-updater process: killed

[Tue, 15 Dec 2009 07:09:02 GMT] [error] [<0.60.0>] ** Generic server couch_view terminating
** Last message in was {'EXIT',<0.61.0>,killed}
** When Server state == {server,"/var/lib/couchdb/0.10.0"}
** Reason for termination == 
** killed


[Tue, 15 Dec 2009 07:09:02 GMT] [error] [<0.60.0>] {error_report,<0.23.0>,
              {<0.60.0>,crash_report,
               [[{initial_call,{couch_view,init,['Argument__1']}},
                 {pid,<0.60.0>},
                 {registered_name,couch_view},
                 {error_info,{exit,killed,
                                   [{gen_server,terminate,6},
                                    {proc_lib,init_p_do_apply,3}]}},
                 {ancestors,[couch_secondary_services,couch_server_sup,
                             <0.1.0>]},
                 {messages,[]},
                 {links,[<0.52.0>]},
                 {dictionary,[]},
                 {trap_exit,true},
                 {status,running},
                 {heap_size,2584},
                 {stack_size,24},
                 {reductions,5320}],
                []]}}

[Tue, 15 Dec 2009 07:09:02 GMT] [error] [<0.52.0>] {error_report,<0.23.0>,
              {<0.52.0>,supervisor_report,
               [{supervisor,{local,couch_secondary_services}},
                {errorContext,child_terminated},
                {reason,killed},
                {offender,[{pid,<0.60.0>},
                           {name,view_manager},
                           {mfa,{couch_view,start_link,[]}},
                           {restart_type,permanent},
                           {shutdown,brutal_kill},
                           {child_type,worker}]}]}}

[Tue, 15 Dec 2009 07:08:44 GMT] [error] [<0.49.0>] {error_report,<0.23.0>,
    {<0.49.0>,crash_report,
     [[{initial_call,{couch_task_status,init,['Argument__1']}},
       {pid,<0.49.0>},
       {registered_name,couch_task_status},
       {error_info,
           {exit,
               {function_clause,
                   [{couch_task_status,handle_info,
                        [{#Ref<0.0.1832.61391>,3},nil]},
                    {gen_server,handle_msg,5},
                    {proc_lib,init_p_do_apply,3}]},
               [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
       {ancestors,[couch_primary_services,couch_server_sup,<0.1.0>]},
       {messages,[]},
       {links,[<0.45.0>]},
       {dictionary,[]},
       {trap_exit,false},
       {status,running},
       {heap_size,2584},
       {stack_size,24},
       {reductions,191624}],
      []]}}



> Replication tasks crash.
> ------------------------
>
>                 Key: COUCHDB-597
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-597
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.11
>            Reporter: Robert Newson
>
> If I kick off 10 replication tasks in quick succession, occasionally one or two of the
> replication tasks will die and not be resumed. It seems that the stat tracking is a little
> buggy, and under stress can eventually cause a permanent failure of the supervised replication
> task:
> [Fri, 11 Dec 2009 19:00:08 GMT] [error] [<0.80.0>] {error_report,<0.30.0>,
>     {<0.80.0>,supervisor_report,
>      [{supervisor,{local,couch_rep_sup}},
>       {errorContext,shutdown_error},
>       {reason,killed},
>       {offender,
>           [{pid,<0.6700.11>},
>            {name,"fcbb13200a1618cf983b347f4d2c9835+create_target"},
>            {mfa,
>                {gen_server,start_link,
>                    [couch_rep,
>                     ["fcbb13200a1618cf983b347f4d2c9835",
>                      {[{<<"create_target">>,true},
>                        {<<"source">>,<<"http://node:5984/perf-p2">>},
>                        {<<"target">>,<<"perf-p2">>}]},
>                      {user_ctx,null,[<<"_admin">>]}],
>                     []]}},
>            {restart_type,temporary},
>            {shutdown,1},
>            {child_type,worker}]}]}}
> [Fri, 11 Dec 2009 19:00:08 GMT] [error] [emulator] Error in process <0.6705.11> with exit value: {badarg,[{ets,insert,[stats_hit_table,{{couchdb,open_os_files},-1}]},{couch_stats_collector,decrement,1}]}
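
For anyone trying to reproduce this, the scenario in the quoted report (several replications started in quick succession) could be scripted roughly as below. This is only a sketch under assumptions: it posts to CouchDB's `_replicate` endpoint with the same `source`, `target` and `create_target` fields that appear in the supervisor report above, but the function names, the use of one thread per request, and the absence of authentication are illustrative choices, not something taken from the report.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_replication_request(base_url, source, target, create_target=True):
    """Build (but do not send) a POST to the CouchDB _replicate endpoint."""
    body = {"source": source, "target": target, "create_target": create_target}
    return urllib.request.Request(
        base_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def _default_send(request):
    # Blocks until the server answers the replication request.
    with urllib.request.urlopen(request) as response:
        return response.status


def start_all(requests, send=_default_send):
    """Send every request from its own thread so they start close together.

    Returns one (result, error) pair per request, in input order.
    `send` is injectable so the function can be exercised without a server.
    """
    def attempt(request):
        try:
            return send(request), None
        except Exception as exc:  # keep going; report the failure to the caller
            return None, exc

    if not requests:
        return []
    with ThreadPoolExecutor(max_workers=len(requests)) as pool:
        return list(pool.map(attempt, requests))
```

A run against a real node might look like `start_all([build_replication_request("http://localhost:5984", "http://node:5984/perf-p2", "perf-p2")] * 10)`, where `http://localhost:5984` is a placeholder for the target node. Only the `perf-p2` database name comes from the log. Any entry whose error slot is not `None` marks a replication that failed on the client side.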

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

