couchdb-dev mailing list archives

From "Fredrik Widlund (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-722) Continuous replication tasks fail
Date Thu, 01 Apr 2010 19:55:27 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852523#action_12852523
] 

Fredrik Widlund commented on COUCHDB-722:
-----------------------------------------



The service-metrics database is also replicated, to the same target. The CouchDB instances
communicate directly with each other, without any proxy, rewriting or address translation.

I'm afraid the entries in my last mail were probably from a crash on the opposite instance. The
log below should be from the same crash as the first one. This crash did not actually have the
completion of the service-metrics compaction directly before it.

[info] [<0.20666.1>] 127.0.0.1 - - 'POST' /node-metrics/_ensure_full_commit?seq=288432 201
[info] [<0.274.0>] rebooting http://127.0.0.1:5984/node-metrics/ -> http://1.2.3.5:5984/node-metrics/ from last known replication checkpoint
[error] [<0.274.0>] ** Generic server <0.274.0> terminating
** Last message in was {'$gen_cast',do_checkpoint}
** When Server state == {state,<0.20538.1>,<0.20542.1>,<0.20545.1>,
                            <0.20547.1>,
                            {http_db,"http://127.0.0.1:5984/node-metrics/",
                                [],[],
                                [{"User-Agent","CouchDB/0.11.0"},
                                 {"Accept","application/json"},
                                 {"Accept-Encoding","gzip"}],
                                [],get,nil,
                                [{response_format,binary},
                                 {inactivity_timeout,30000}],
                                10,500,nil},
                            {http_db,
                                "http://1.2.3.5:5984/node-metrics/",[],
                                [],
                                [{"User-Agent","CouchDB/0.11.0"},
                                 {"Accept","application/json"},
                                 {"Accept-Encoding","gzip"}],
                                [],get,nil,
                                [{response_format,binary},
                                 {inactivity_timeout,30000}],
                                10,500,nil},
                            true,false,
                            ["f3e3081db5a215dbaf9b2984f0552090",
                             {[{<<"target">>,
                                <<"http://1.2.3.5:5984/node-metrics">>},
                               {<<"source">>,
                                <<"http://127.0.0.1:5984/node-metrics">>},
                               {<<"continuous">>,true}]},
                             {user_ctx,null,
                                 [<<"_admin">>],
                                 <<"{couch_httpd_auth, default_authentication_handler}">>}],
                            {1270124726131655,#Ref<0.0.11.78165>},
                            288246,
[...many, many session id entries]
                            [],false,288432,1163577,nil}
** Reason for termination ==
** {{badmatch,{stop,{db_not_found,<<"http://127.0.0.1:5984/node-metrics/">>}}},
    [{couch_rep,do_checkpoint,1},
     {couch_rep,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 1-Apr-2010::14:25:26 ===
** Generic server <0.274.0> terminating
** Last message in was {'$gen_cast',do_checkpoint}
** When Server state == {state,<0.20538.1>,<0.20542.1>,<0.20545.1>,
                            <0.20547.1>,
                            {http_db,"http://127.0.0.1:5984/node-metrics/",
                                [],[],
                                [{"User-Agent","CouchDB/0.11.0"},
                                 {"Accept","application/json"},
                                 {"Accept-Encoding","gzip"}],
                                [],get,nil,
                                [{response_format,binary},
                                 {inactivity_timeout,30000}],
                                10,500,nil},
                            {http_db,
                                "http://1.2.3.5:5984/node-metrics/",[],
                                [],
                                [{"User-Agent","CouchDB/0.11.0"},
                                 {"Accept","application/json"},
                                 {"Accept-Encoding","gzip"}],
                                [],get,nil,
                                [{response_format,binary},
                                 {inactivity_timeout,30000}],
                                10,500,nil},
                            true,false,
                            ["f3e3081db5a215dbaf9b2984f0552090",
                             {[{<<"target">>,
                                <<"http://1.2.3.5:5984/node-metrics">>},
                               {<<"source">>,
                                <<"http://127.0.0.1:5984/node-metrics">>},
                               {<<"continuous">>,true}]},
                             {user_ctx,null,
                                 [<<"_admin">>],
                                 <<"{couch_httpd_auth, default_authentication_handler}">>}],
                            {1270124726131655,#Ref<0.0.11.78165>},
                            288246,
[...many, many session id entries]
                            [],false,288432,1163577,nil}
** Reason for termination ==
** {{badmatch,{stop,{db_not_found,<<"http://127.0.0.1:5984/node-metrics/">>}}},
    [{couch_rep,do_checkpoint,1},
     {couch_rep,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}
[error] [<0.274.0>] {error_report,<0.31.0>,
    {<0.274.0>,crash_report,
     [[{initial_call,{couch_rep,init,['Argument__1']}},
       {pid,<0.274.0>},
       {registered_name,[]},
       {error_info,
           {exit,
               {{badmatch,
                    {stop,
                        {db_not_found,
                            <<"http://127.0.0.1:5984/node-metrics/">>}}},
                [{couch_rep,do_checkpoint,1},
                 {couch_rep,handle_cast,2},
                 {gen_server,handle_msg,5},
                 {proc_lib,init_p_do_apply,3}]},
               [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
       {ancestors,
           [couch_rep_sup,couch_primary_services,couch_server_sup,<0.32.0>]},
       {messages,[{'EXIT',<0.21084.1>,normal}]},
       {links,[<0.81.0>]},
       {dictionary,[{task_status_update,{{1270,124726,124009},0}}]},
       {trap_exit,true},
       {status,running},
       {heap_size,10946},
       {stack_size,24},
       {reductions,29173458}],
      []]}}

=CRASH REPORT==== 1-Apr-2010::14:25:26 ===
[...follows below...]
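
To make it easier to see which databases the failures refer to, a log like the one above can be summarised with a small script. This is only a sketch: the function name and the regular expression are my own, and it assumes the termination reason always has the shape {badmatch,{stop,{Error,<<"Url">>}}} seen in this log.

```python
import re
from collections import Counter

# Matches the error tuple inside {badmatch,{stop,{Error,<<"Url">>}}},
# tolerating the line breaks and indentation added by the Erlang
# pretty-printer in crash and supervisor reports.
REASON_RE = re.compile(r'\{badmatch,\s*\{stop,\s*\{(\w+),\s*<<"([^"]*)">>')


def count_badmatch_reasons(log_text):
    """Count (error, database URL) pairs found in badmatch termination reasons."""
    return Counter(REASON_RE.findall(log_text))


# A fragment copied from the log above, used here as sample input.
sample = '''** Reason for termination ==
** {{badmatch,{stop,{db_not_found,<<"http://127.0.0.1:5984/node-metrics/">>}}},
    [{couch_rep,do_checkpoint,1},
     {couch_rep,handle_cast,2},
'''

for (error, url), count in count_badmatch_reasons(sample).items():
    print(error, url, count)
```

Note that a single crash is reported several times (error log entry, error report, crash report, supervisor report), so the counts overstate the number of distinct crashes. They are only useful for seeing which database URLs appear in the failures.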

Fredrik Widlund, CSO / Chief Architect, Qbrick
Direct: +46 8 459 90 32 | Mobile: +46 76 899 96 66

Södra Hamnvägen 22 | 115 41 STOCKHOLM
Web and mobile: www.qbrick.com

-----Original message-----
From: Randall Leeds (JIRA) [mailto:jira@apache.org]
Sent: 1 April 2010 21:12
To: Fredrik Widlund
Subject: [jira] Commented: (COUCHDB-722) Continuous replication tasks fail


    [ https://issues.apache.org/jira/browse/COUCHDB-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852510#action_12852510
]

Randall Leeds commented on COUCHDB-722:
---------------------------------------

I'm rather confused.

The compaction seems to be on the service-metrics database, but the replication is between
databases named node-metrics.
However, there's a POST to /service-metrics/_missing_revs on the target database right around
the time compaction completes. Replication performs this operation. Are you using vhosts or
some kind of proxy layer that's rewriting any of your requests? Could you include a little
bit more context at the end where you put the ...? In particular I want to know if the replication
was using the service-metrics database at all.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.




> Continuous replication tasks fail
> ---------------------------------
>
>                 Key: COUCHDB-722
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-722
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.11
>         Environment: Arch Linux, CouchDB 0.11
>            Reporter: Fredrik Widlund
>
> CouchDB 0.11.0 replication tasks fail with the error below after working for anywhere from
> a few minutes to an hour. The replication is of the type {"source":"http://127.0.0.1:5984/node-metrics",
> "target":"http://1.2.3.4:5984/node-metrics", "continuous":true} and the node-metrics database
> exists on both machines.
> The database is periodically compacted, which (and I'm speculating here) could be a contributing
> factor to the crash.
> Kind regards,
> Fredrik Widlund
> =CRASH REPORT==== 1-Apr-2010::14:25:26 ===
>   crasher:
>     initial call: couch_rep:init/1
>     pid: <0.274.0>
>     registered_name: []
>     exception exit: {{badmatch,
>                          {stop,
>                              {db_not_found,
>                                  <<"http://127.0.0.1:5984/node-metrics/">>}}},
>                      [{couch_rep,do_checkpoint,1},
>                       {couch_rep,handle_cast,2},
>                       {gen_server,handle_msg,5},
>                       {proc_lib,init_p_do_apply,3}]}
>       in function  gen_server:terminate/6
>     ancestors: [couch_rep_sup,couch_primary_services,couch_server_sup,
>                   <0.32.0>]
>     messages: [{'EXIT',<0.21084.1>,normal}]
>     links: [<0.81.0>]
>     dictionary: [{task_status_update,{{1270,124726,124009},0}}]
>     trap_exit: true
>     status: running
>     heap_size: 10946
>     stack_size: 24
>     reductions: 29173458
>   neighbours:
> [error] [<0.81.0>] {error_report,<0.31.0>,
>     {<0.81.0>,supervisor_report,
>      [{supervisor,{local,couch_rep_sup}},
>       {errorContext,child_terminated},
>       {reason,
>           {{badmatch,
>                {stop,
>                    {db_not_found,<<"http://127.0.0.1:5984/node-metrics/">>}}},
>            [{couch_rep,do_checkpoint,1},
>             {couch_rep,handle_cast,2},
>             {gen_server,handle_msg,5},
>             {proc_lib,init_p_do_apply,3}]}},
>       {offender,
>           [{pid,<0.274.0>},
>            {name,"f3e3081db5a215dbaf9b2984f0552090+continuous"},
>            {mfa,
>                {gen_server,start_link,
>                    [couch_rep,
>                     ["f3e3081db5a215dbaf9b2984f0552090",
>                      {[{<<"target">>,
>                         <<"http://1.2.3.4:5984/node-metrics">>},
>                        {<<"source">>,<<"http://127.0.0.1:5984/node-metrics">>},
>                        {<<"continuous">>,true}]},
>                      {user_ctx,null,
>                          [<<"_admin">>],
>                          <<"{couch_httpd_auth, default_authentication_handler}">>}],
>                     []]}},
>            {restart_type,temporary},
>            {shutdown,1},
>            {child_type,worker}]}]}}
> =SUPERVISOR REPORT==== 1-Apr-2010::14:25:26 ===
>      Supervisor: {local,couch_rep_sup}
>      Context:    child_terminated
>      Reason:     {{badmatch,
>                       {stop,
>                           {db_not_found,
>                               <<"http://127.0.0.1:5984/node-metrics/">>}}},
>                   [{couch_rep,do_checkpoint,1},
>                    {couch_rep,handle_cast,2},
>                    {gen_server,handle_msg,5},
>                    {proc_lib,init_p_do_apply,3}]}
>      Offender:   [{pid,<0.274.0>},
>                   {name,"f3e3081db5a215dbaf9b2984f0552090+continuous"},
>                   {mfa,
>                       {gen_server,start_link,
>                           [couch_rep,
>                            ["f3e3081db5a215dbaf9b2984f0552090",
>                             {[{<<"target">>,
>                                <<"http://1.2.3.4:5984/node-metrics">>},
>                               {<<"source">>,
>                                <<"http://127.0.0.1:5984/node-metrics">>},
>                               {<<"continuous">>,true}]},
>                             {user_ctx,null,
>                                 [<<"_admin">>],
>                                 <<"{couch_httpd_auth, default_authentication_handler}">>}],
>                            []]}},
>                   {restart_type,temporary},
>                   {shutdown,1},
>                   {child_type,worker}]
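
For anyone trying to reproduce this, a continuous replication of the type quoted in the issue can be started by POSTing the same JSON body to the /_replicate endpoint of the source instance. The following is a minimal sketch under that assumption. The helper names are made up for illustration, the addresses are the anonymised ones from the report, and nothing is sent until start_replication is called.

```python
import json
import urllib.request


def build_replication_request(couch_url, source, target, continuous=True):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source, "target": target, "continuous": continuous}
    return urllib.request.Request(
        couch_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_replication(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same source/target pair as in the issue description.
req = build_replication_request(
    "http://127.0.0.1:5984",
    "http://127.0.0.1:5984/node-metrics",
    "http://1.2.3.4:5984/node-metrics",
)
# result = start_replication(req)  # requires a running CouchDB instance
```

Running this alongside a periodic compaction of the same database would be one way to test the speculation above that compaction contributes to the crash.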
