Message-ID: <5185554.641331270151727511.JavaMail.jira@brutus.apache.org>
Date: Thu, 1 Apr 2010 19:55:27 +0000 (UTC)
From: "Fredrik Widlund (JIRA)"
To: dev@couchdb.apache.org
Subject: [jira] Commented: (COUCHDB-722) Continuous replication tasks fail
In-Reply-To: <1062064667.635281270131507230.JavaMail.jira@brutus.apache.org>

    [ https://issues.apache.org/jira/browse/COUCHDB-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852523#action_12852523 ]

Fredrik Widlund commented on COUCHDB-722:
-----------------------------------------

The service-metrics database is also replicated, to the same target. The couchdb instances communicate directly with each other, without any proxy, rewriting or address translation.

I'm afraid the entries from the last mail were probably from a crash on the opposite instance. The below should be from the same crash as the first one. This crash actually didn’t have the completion of the service-metrics compact directly before it.

[info] [<0.20666.1>] 127.0.0.1 - - 'POST' /node-metrics/_ensure_full_commit?seq=288432 201

[info] [<0.274.0>] rebooting http://127.0.0.1:5984/node-metrics/ -> http://1.2.3.5:5984/node-metrics/ from last known replication checkpoint

[error] [<0.274.0>] ** Generic server <0.274.0> terminating
** Last message in was {'$gen_cast',do_checkpoint}
** When Server state == {state,<0.20538.1>,<0.20542.1>,<0.20545.1>,
    <0.20547.1>,
    {http_db,"http://127.0.0.1:5984/node-metrics/",
        [],[],
        [{"User-Agent","CouchDB/0.11.0"},
         {"Accept","application/json"},
         {"Accept-Encoding","gzip"}],
        [],get,nil,
        [{response_format,binary},
         {inactivity_timeout,30000}],
        10,500,nil},
    {http_db,
        "http://1.2.3.5:5984/node-metrics/",[],
        [],
        [{"User-Agent","CouchDB/0.11.0"},
         {"Accept","application/json"},
         {"Accept-Encoding","gzip"}],
        [],get,nil,
        [{response_format,binary},
         {inactivity_timeout,30000}],
        10,500,nil},
    true,false,
    ["f3e3081db5a215dbaf9b2984f0552090",
     {[{<<"target">>,
        <<"http://1.2.3.5:5984/node-metrics">>},
       {<<"source">>,
        <<"http://127.0.0.1:5984/node-metrics">>},
       {<<"continuous">>,true}]},
     {user_ctx,null,
         [<<"_admin">>],
         <<"{couch_httpd_auth, default_authentication_handler}">>}],
    {1270124726131655,#Ref<0.0.11.78165>},
    288246,
    [...many, many session id entries]
    [],false,288432,1163577,nil}
** Reason for termination ==
** {{badmatch,{stop,{db_not_found,<<"http://127.0.0.1:5984/node-metrics/">>}}},
    [{couch_rep,do_checkpoint,1},
     {couch_rep,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}

=ERROR REPORT==== 1-Apr-2010::14:25:26 ===
** Generic server <0.274.0> terminating
** Last message in was {'$gen_cast',do_checkpoint}
** When Server state == {state,<0.20538.1>,<0.20542.1>,<0.20545.1>,
    <0.20547.1>,
    {http_db,"http://127.0.0.1:5984/node-metrics/",
        [],[],
        [{"User-Agent","CouchDB/0.11.0"},
         {"Accept","application/json"},
         {"Accept-Encoding","gzip"}],
        [],get,nil,
        [{response_format,binary},
         {inactivity_timeout,30000}],
        10,500,nil},
    {http_db,
        "http://1.2.3.5:5984/node-metrics/",[],
        [],
        [{"User-Agent","CouchDB/0.11.0"},
         {"Accept","application/json"},
         {"Accept-Encoding","gzip"}],
        [],get,nil,
        [{response_format,binary},
         {inactivity_timeout,30000}],
        10,500,nil},
    true,false,
    ["f3e3081db5a215dbaf9b2984f0552090",
     {[{<<"target">>,
        <<"http://1.2.3.5:5984/node-metrics">>},
       {<<"source">>,
        <<"http://127.0.0.1:5984/node-metrics">>},
       {<<"continuous">>,true}]},
     {user_ctx,null,
         [<<"_admin">>],
         <<"{couch_httpd_auth, default_authentication_handler}">>}],
    {1270124726131655,#Ref<0.0.11.78165>},
    288246,
    [...many, many session id entries]
    [],false,288432,1163577,nil}
** Reason for termination ==
** {{badmatch,{stop,{db_not_found,<<"http://127.0.0.1:5984/node-metrics/">>}}},
    [{couch_rep,do_checkpoint,1},
     {couch_rep,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}

[error] [<0.274.0>] {error_report,<0.31.0>,
    {<0.274.0>,crash_report,
     [[{initial_call,{couch_rep,init,['Argument__1']}},
       {pid,<0.274.0>},
       {registered_name,[]},
       {error_info,
           {exit,
               {{badmatch,
                    {stop,
                        {db_not_found,
                            <<"http://127.0.0.1:5984/node-metrics/">>}}},
                [{couch_rep,do_checkpoint,1},
                 {couch_rep,handle_cast,2},
                 {gen_server,handle_msg,5},
                 {proc_lib,init_p_do_apply,3}]},
               [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
       {ancestors,
           [couch_rep_sup,couch_primary_services,couch_server_sup,<0.32.0>]},
       {messages,[{'EXIT',<0.21084.1>,normal}]},
       {links,[<0.81.0>]},
       {dictionary,[{task_status_update,{{1270,124726,124009},0}}]},
       {trap_exit,true},
       {status,running},
       {heap_size,10946},
       {stack_size,24},
       {reductions,29173458}],
      []]}}

=CRASH REPORT==== 1-Apr-2010::14:25:26 ===
[...follows below...]

Fredrik Widlund, CSO / Chief Architect, Qbrick
Direct: +46 8 459 90 32 | Mobile: +46 76 899 96 66
Södra Hamnvägen 22 | 115 41 STOCKHOLM
Web and mobile: www.qbrick.com

-----Original Message-----
From: Randall Leeds (JIRA) [mailto:jira@apache.org]
Sent: 1 April 2010 21:12
To: Fredrik Widlund
Subject: [jira] Commented: (COUCHDB-722) Continuous replication tasks fail

    [ https://issues.apache.org/jira/browse/COUCHDB-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852510#action_12852510 ]

Randall Leeds commented on COUCHDB-722:
---------------------------------------

I'm rather confused.
The compaction seems to be on the service-metrics database, but the replication is between databases named node-metrics.
However, there's a POST to /service-metrics/_missing_revs on the target database right around the time compaction completes. Replication performs this operation. Are you using vhosts or some kind of proxy layer that's rewriting any of your requests? Could you include a little bit more context at the end where you put the ...? In particular I want to know if the replication was using the service-metrics database at all.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
> Continuous replication tasks fail
> ---------------------------------
>
>                 Key: COUCHDB-722
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-722
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.11
>         Environment: Arch Linux, CouchDB 0.11
>            Reporter: Fredrik Widlund
>
> Couchdb 0.11.0 replication tasks fail with the below after working for anywhere from a few minutes to an hour. The below replication is of the type {"source":"http://127.0.0.1:5984/node-metrics", "target":"http://1.2.3.4:5984/node-metrics", "continuous":true} and the node-metrics database exists on both machines.
> The database is periodically compacted which, and I'm speculating here, could be a contributing factor to the crash.
> Kind regards,
> Fredrik Widlund
> =CRASH REPORT==== 1-Apr-2010::14:25:26 ===
>   crasher:
>     initial call: couch_rep:init/1
>     pid: <0.274.0>
>     registered_name: []
>     exception exit: {{badmatch,
>                          {stop,
>                              {db_not_found,
>                                  <<"http://127.0.0.1:5984/node-metrics/">>}}},
>                      [{couch_rep,do_checkpoint,1},
>                       {couch_rep,handle_cast,2},
>                       {gen_server,handle_msg,5},
>                       {proc_lib,init_p_do_apply,3}]}
>       in function gen_server:terminate/6
>     ancestors: [couch_rep_sup,couch_primary_services,couch_server_sup,
>                   <0.32.0>]
>     messages: [{'EXIT',<0.21084.1>,normal}]
>     links: [<0.81.0>]
>     dictionary: [{task_status_update,{{1270,124726,124009},0}}]
>     trap_exit: true
>     status: running
>     heap_size: 10946
>     stack_size: 24
>     reductions: 29173458
>   neighbours:
> [error] [<0.81.0>] {error_report,<0.31.0>,
>     {<0.81.0>,supervisor_report,
>      [{supervisor,{local,couch_rep_sup}},
>       {errorContext,child_terminated},
>       {reason,
>           {{badmatch,
>                {stop,
>                    {db_not_found,<<"http://127.0.0.1:5984/node-metrics/">>}}},
>            [{couch_rep,do_checkpoint,1},
>             {couch_rep,handle_cast,2},
>             {gen_server,handle_msg,5},
>             {proc_lib,init_p_do_apply,3}]}},
>       {offender,
>           [{pid,<0.274.0>},
>            {name,"f3e3081db5a215dbaf9b2984f0552090+continuous"},
>            {mfa,
>                {gen_server,start_link,
>                    [couch_rep,
>                     ["f3e3081db5a215dbaf9b2984f0552090",
>                      {[{<<"target">>,
>                         <<"http://1.2.3.4:5984/node-metrics">>},
>                        {<<"source">>,<<"http://127.0.0.1:5984/node-metrics">>},
>                        {<<"continuous">>,true}]},
>                      {user_ctx,null,
>                          [<<"_admin">>],
>                          <<"{couch_httpd_auth, default_authentication_handler}">>}],
>                     []]}},
>            {restart_type,temporary},
>            {shutdown,1},
>            {child_type,worker}]}]}}
> =SUPERVISOR REPORT==== 1-Apr-2010::14:25:26 ===
>      Supervisor: {local,couch_rep_sup}
>      Context:    child_terminated
>      Reason:     {{badmatch,
>                       {stop,
>                           {db_not_found,
>                               <<"http://127.0.0.1:5984/node-metrics/">>}}},
>                   [{couch_rep,do_checkpoint,1},
>                    {couch_rep,handle_cast,2},
>                    {gen_server,handle_msg,5},
>                    {proc_lib,init_p_do_apply,3}]}
>      Offender:   [{pid,<0.274.0>},
>                   {name,"f3e3081db5a215dbaf9b2984f0552090+continuous"},
>                   {mfa,
>                       {gen_server,start_link,
>                           [couch_rep,
>                            ["f3e3081db5a215dbaf9b2984f0552090",
>                             {[{<<"target">>,
>                                <<"http://1.2.3.4:5984/node-metrics">>},
>                               {<<"source">>,
>                                <<"http://127.0.0.1:5984/node-metrics">>},
>                               {<<"continuous">>,true}]},
>                             {user_ctx,null,
>                                 [<<"_admin">>],
>                                 <<"{couch_httpd_auth, default_authentication_handler}">>}],
>                            []]}},
>                   {restart_type,temporary},
>                   {shutdown,1},
>                   {child_type,worker}]
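
For anyone trying to reproduce this: the replication in the issue description is the kind that is started by POSTing its JSON body to CouchDB's /_replicate endpoint. The following is only a minimal Python sketch under that assumption. The helper name build_replication_request is made up for illustration and is not part of CouchDB, and the addresses are the anonymized ones quoted in the report. It builds the request without sending it, so it can be inspected first.

```python
import json
import urllib.request


def build_replication_request(couch_url, source, target, continuous=True):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint.

    Hypothetical helper for illustration; the body mirrors the JSON
    quoted in the issue description.
    """
    body = {"source": source, "target": target, "continuous": continuous}
    return urllib.request.Request(
        url=couch_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Addresses are the anonymized ones from the report, not real hosts.
request = build_replication_request(
    "http://127.0.0.1:5984",
    source="http://127.0.0.1:5984/node-metrics",
    target="http://1.2.3.4:5984/node-metrics",
)

# To actually start the replication against a running instance:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```

Because the crashed task is registered with restart_type temporary in the supervisor report, it would presumably have to be started again by repeating such a request once the source database is reachable.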