incubator-couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: replication problems
Date Wed, 10 Oct 2012 20:40:19 GMT
flagged.

On Oct 10, 2012, at 22:34 , Robert Newson <robert.newson@gmail.com> wrote:

> Jan,
> 
> Flag that as fix-for 1.3? I don't have my creds on my phone to do it.
> 
> I like the ini uuid idea best, modelled after the cookie with secret.
> If we have the uuid, we'd omit host name as well as port, right?
> 
> Sent from the ocean floor
> 
> On 10 Oct 2012, at 21:12, Jan Lehnardt <jan@apache.org> wrote:
> 
>> Filipe tells me this is https://issues.apache.org/jira/browse/COUCHDB-1259
>> 
>> Cheers
>> Jan
>> --
>> 
>> On Oct 4, 2012, at 02:28 , Dustin Sallings <dustin@spy.net> wrote:
>> 
>>> 
>>>   I'm bringing this back up as requested.  I'm currently in both the
>>> "not replicating interesting things" and "has duplicate replications" states.
>>> I think the stuff below shows the "not replicating" part.
>>> 
>>>   Active tasks shows the other (these are based on replicator DB documents; example below):
>>> 
>>> [
>>>  {
>>>      "checkpointed_source_seq": 2022317,
>>>      "continuous": true,
>>>      "doc_id": "cbstats-from-dogbowl",
>>>      "doc_write_failures": 0,
>>>      "docs_read": 300,
>>>      "docs_written": 300,
>>>      "missing_revisions_found": 300,
>>>      "pid": "<0.10466.12>",
>>>      "progress": 100,
>>>      "replication_id": "50daecd0a29f4b7e5d102990831f3d64+continuous",
>>>      "revisions_checked": 304,
>>>      "source": "http://dustin:*****@single.couchbase.net/cbstats/",
>>>      "source_seq": 2022317,
>>>      "started_on": 1349309457,
>>>      "target": "cbstats",
>>>      "type": "replication",
>>>      "updated_on": 1349310442
>>>  },
>>>  {
>>>      "checkpointed_source_seq": 2022317,
>>>      "continuous": true,
>>>      "doc_id": "cbstats-from-dogbowl",
>>>      "doc_write_failures": 0,
>>>      "docs_read": 62,
>>>      "docs_written": 62,
>>>      "missing_revisions_found": 62,
>>>      "pid": "<0.11019.12>",
>>>      "progress": 100,
>>>      "replication_id": "411e341d5aa9a3fe636cf4ea8ba71720+continuous",
>>>      "revisions_checked": 304,
>>>      "source": "http://dustin:*****@single.couchbase.net/cbstats/",
>>>      "source_seq": 2022317,
>>>      "started_on": 1349309471,
>>>      "target": "cbstats",
>>>      "type": "replication",
>>>      "updated_on": 1349310443
>>>  },
>>>  {
>>>      "checkpointed_source_seq": 107068,
>>>      "continuous": true,
>>>      "doc_id": "gerrit-from-prod",
>>>      "doc_write_failures": 0,
>>>      "docs_read": 22,
>>>      "docs_written": 22,
>>>      "missing_revisions_found": 22,
>>>      "pid": "<0.11086.12>",
>>>      "progress": 100,
>>>      "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous",
>>>      "revisions_checked": 26,
>>>      "source": "http://dustinphoto.iriscouch.com/gerrit/",
>>>      "source_seq": 107068,
>>>      "started_on": 1349309487,
>>>      "target": "gerrit",
>>>      "type": "replication",
>>>      "updated_on": 1349310445
>>>  },
>>>  {
>>>      "checkpointed_source_seq": 107068,
>>>      "continuous": true,
>>>      "doc_id": "gerrit-from-prod",
>>>      "doc_write_failures": 0,
>>>      "docs_read": 17,
>>>      "docs_written": 17,
>>>      "missing_revisions_found": 17,
>>>      "pid": "<0.11107.12>",
>>>      "progress": 100,
>>>      "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous",
>>>      "revisions_checked": 26,
>>>      "source": "http://dustinphoto.iriscouch.com/gerrit/",
>>>      "source_seq": 107068,
>>>      "started_on": 1349309488,
>>>      "target": "gerrit",
>>>      "type": "replication",
>>>      "updated_on": 1349310445
>>>  }
>>> ]
>>> 
>>> 
>>>   The replicator document for the latter, for example, is this:
>>> 
>>> {
>>> "_id": "gerrit-from-prod",
>>> "_rev": "2235-36de10fb757581a1782dacbb26ee4809",
>>> "source": "http://dustinphoto.iriscouch.com/gerrit",
>>> "target": "gerrit",
>>> "continuous": true,
>>> "user_ctx": {
>>>     "roles": [
>>>         "_admin"
>>>     ]
>>> },
>>> "_replication_state_time": "2012-10-03T17:11:27-07:00",
>>> "_replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9",
>>> "_replication_state": "triggered"
>>> }
>>> 
>>> 
>>> Begin forwarded message:
>>> 
>>>> From: Dustin Sallings <dustin@spy.net>
>>>> Subject: Re: replication problems
>>>> Date: June 15, 2012 0:10:04 PDT
>>>> To: dev@couchdb.apache.org
>>>> Reply-To: dev@couchdb.apache.org
>>>> 
>>>> 
>>>> On Jun 14, 2012, at 11:28 PM, Benoit Chesneau wrote:
>>>> 
>>>>> Are you using _replicate or _replicator? Anything interesting in the logs?
>>>> 
>>>> 
>>>>   I'm using _replicator (wonderful feature, I just kill the DB and everything
>>>> goes back the way I want it).
>>>> 
>>>>   Hmm...  I do think I found some stuff digging through the logs.  This is
>>>> the local DB I noticed not doing its thing, although there were tons of errors all around
>>>> this.  Looks like the server got into some kind of bad state and sort of half-crashed.
>>>> 
>>>> 
>>>> [Thu, 14 Jun 2012 23:20:12 GMT] [error] [<0.133.0>] Replication `ae601df0373da82d1b4a9ff741c8ba18+continuous` (`rpics` -> `rpics-processed`) failed: {{timeout,{gen_server,call,[<0.213.0>,{open_ref_count,<0.442.0>}]}},
>>>> {gen_server,call,
>>>>          [couch_server,
>>>>           {open,<<"rpics">>,
>>>>                 [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}]},
>>>>           infinity]}}
>>>> [Thu, 14 Jun 2012 23:20:25 GMT] [error] [<0.383.0>] ** Generic server <0.383.0> terminating
>>>> ** Last message in was {'EXIT',<0.384.0>,
>>>>                     {{timeout,
>>>>                       {gen_server,call,
>>>>                        [<0.213.0>,{open_ref_count,<0.442.0>}]}},
>>>>                      {gen_server,call,
>>>>                       [couch_server,
>>>>                        {open,<<"cbstats">>,
>>>>                         [{user_ctx,
>>>>                           {user_ctx,null,[<<"_admin">>],undefined}},
>>>>                          {user_ctx,
>>>>                           {user_ctx,null,[<<"_admin">>],undefined}}]},
>>>>                        infinity]}}}
>>>> 
>>>> ** When Server state == {state,<0.272.0>,<0.384.0>,20,
>>>>                      {httpdb,
>>>>                       "http://dustin:LOGGED_PASSWORD@single.couchbase.net/cbstats/",
>>>>                       nil,
>>>>                       [{"Accept","application/json"},
>>>>                        {"User-Agent","CouchDB/1.2.0"}],
>>>>                       30000,
>>>>                       [{socket_options,
>>>>                         [{keepalive,true},{nodelay,false}]}],
>>>>                       10,250,<0.273.0>,20},
>>>>                      {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,
>>>>                       <0.290.0>,<0.286.0>,<0.367.0>,
>>>>                       {db_header,6,984356,0,
>>>>                        {860345646,{737369,975,640891414},59433736},
>>>>                        {860348005,738344,42056446},
>>>>                        {860352635,[],5737},
>>>>                        0,nil,nil,1000},
>>>>                       984356,
>>>>                       {btree,<0.286.0>,
>>>>                        {860345646,{737369,975,640891414},59433736},
>>>>                        #Fun<couch_db_updater.10.57960608>,
>>>>                        #Fun<couch_db_updater.11.57960608>,
>>>>                        #Fun<couch_btree.5.133731799>,
>>>>                        #Fun<couch_db_updater.12.57960608>,snappy},
>>>>                       {btree,<0.286.0>,
>>>>                        {860348005,738344,42056446},
>>>>                        #Fun<couch_db_updater.13.57960608>,
>>>>                        #Fun<couch_db_updater.14.57960608>,
>>>>                        #Fun<couch_btree.5.133731799>,
>>>>                        #Fun<couch_db_updater.15.57960608>,snappy},
>>>>                       {btree,<0.286.0>,
>>>>                        {860352635,[],5737},
>>>>                        #Fun<couch_btree.3.133731799>,
>>>>                        #Fun<couch_btree.4.133731799>,
>>>>                        #Fun<couch_btree.5.133731799>,nil,snappy},
>>>>                       984356,<<"cbstats">>,
>>>>                       "/Volumes/terror/db/couchdb/cbstats.couch",[],[],
>>>>                       nil,
>>>>                       {user_ctx,null,[<<"_admin">>],undefined},
>>>>                       nil,1000,
>>>>                       [before_header,after_header,on_file_open],
>>>>                       [{user_ctx,
>>>>                         {user_ctx,null,[<<"_admin">>],undefined}}],
>>>>                       snappy,nil,nil},
>>>>                      [],nil,nil,nil,
>>>>                      {rep_stats,0,0,0,0,0},
>>>>                      nil,<0.385.0>,
>>>>                      {batch,[],0}}
>>>> ** Reason for termination ==
>>>> ** {noproc,{gen_server,call,[<0.367.0>,{drop,<0.383.0>},infinity]}}
>>>> 
>>>> 
>>>> 
>>>> 
>>>>   Scrolling to the beginning of the errors, I find this:
>>>> 
>>>> 
>>>> [Thu, 14 Jun 2012 23:15:54 GMT] [error] [<0.164.0>] Replication `543f76281e8d52d6ce5b51fddf0588e7+continuous` (`photo` -> `http://dustin:*****@dustinphoto.couchone.com/photo/`) failed: source_db_down
>>>> [Thu, 14 Jun 2012 23:18:57 GMT] [info] [<0.358.0>] 127.0.0.1 - - GET /_all_dbs 200
>>>> [Thu, 14 Jun 2012 23:19:52 GMT] [error] [<0.289.0>] ** Generic server <0.289.0> terminating
>>>> ** Last message in was {update_docs,<0.272.0>,[],
>>>>                        [{{doc,
>>>>                              <<"_local/c4cc070f896d7267e52ba012856fed4b">>,
>>>>                              {0,[<<"346185">>]},
>>>>                              {[{<<"session_id">>,
>>>>                                 <<"9fb3475683d44bb1e151031dd42cc59f">>},
>>>>                                {<<"source_last_seq">>,1419004},
>>>>                                {<<"replication_id_version">>,2},
>>>>                                {<<"history">>,
>>>>                                 [{[{<<"session_id">>,
>>>>                                     <<"9fb3475683d44bb1e151031dd42cc59f">>},
>>>>                                    {<<"start_time">>,
>>>>                                     <<"Thu, 14 Jun 2012 01:35:02 GMT">>},
>>>>                                    {<<"end_time">>,
>>>>                                     <<"Thu, 14 Jun 2012 23:15:29 GMT">>},
>>>>                                    {<<"start_last_seq">>,1410146},
>>>>                                    {<<"end_last_seq">>,1419004},
>>>>                                    {<<"recorded_seq">>,1419004},
>>>>                                    {<<"missing_checked">>,8100},
>>>>                                    {<<"missing_found">>,8100},
>>>>                                    {<<"docs_read">>,8100},
>>>>                                    {<<"docs_written">>,8100},
>>>>                                    {<<"doc_write_failures">>,0}]},
>>>>                                  {[{<<"session_id">>,
>>>>                                     <<"3edd7c50327eab7ec0768451e34efa8b">>},
>>>>                                    {<<"start_time">>,
>>>>                                     <<"Tue, 12 Jun 2012 05:51:17 GMT">>},
>>>>                                    {<<"end_time">>,
>>>>                                     <<"Tue, 12 Jun 2012 13:02:37 GMT">>},
>>>>                                    {<<"start_last_seq">>,1407186},
>>>>                                    {<<"end_last_seq">>,1410146},
>>>>                                    {<<"recorded_seq">>,1410146},
>>>>                                    {<<"missing_checked">>,2583},
>>>>                                    {<<"missing_found">>,2577},
>>>>                                    {<<"docs_read">>,2577},
>>>>                                    {<<"docs_written">>,2577},
>>>>                                    {<<"doc_write_failures">>,0}]},
>>>>                                  {[{<<"session_id">>,
>>>>                                     <<"172de62044281a01b1584a9d099f42af">>},
>>>>                                    {<<"start_time">>,
>>>>                                     <<"Mon, 11 Jun 2012 03:40:11 GMT">>},
>>>>                                    {<<"end_time">>,
>>>>                                     <<"Mon, 11 Jun 2012 15:16:24 GMT">>},
>>>>                                    {<<"start_last_seq">>,1405428},
>>>>                                    {<<"end_last_seq">>,1407186},
>>>>                                    {<<"recorded_seq">>,1407186},
>>>>                                    {<<"missing_checked">>,1721},
>>>>                                    {<<"missing_found">>,1721},
>>>>                                    {<<"docs_read">>,1721},
>>>>                                    {<<"docs_written">>,1721},
>>>>                                    {<<"doc_write_failures">>,0}]},
>>>>                                  {[{<<"session_id">>,
>>>>                                     <<"e60a126a2036c5fab00a1249101820c8">>},
>>>>                                    {<<"start_time">>,
>>>>                                     <<"Sat, 09 Jun 2012 07:47:22 GMT">>},
>>>>                                    {<<"end_time">>,
>>>>                                     <<"Sun, 10 Jun 2012 21:16:20 GMT">>},
>>>>                                    {<<"start_last_seq">>,1386289},
>>>>                                    {<<"end_last_seq">>,1405428},
>>>>                                    {<<"recorded_seq">>,1405428},
>>>>                                    {<<"missing_checked">>,16977},
>>>>                                    {<<"missing_found">>,16977},
>>>>                                    {<<"docs_read">>,16977},
>>>>                                    {<<"docs_written">>,16977},
>>>>                                    {<<"doc_write_failures">>,0}]},
>>>>                                  {[{<<"session_id">>,
>>>>                                     <<"ef3e4333d340dcf73ddfa3fe8c720042">>},
>>>>                                    {<<"start_time">>,
>>>>                                     <<"Mon, 04 Jun 2012 02:39:44 GMT">>},
>>>>                                    {<<"end_time">>,
>>>>                                     <<"Mon, 04 Jun 2012 12:35:50 GMT">>},
>>>>                                    {<<"start_last_seq">>,1384738},
>>>>                                    {<<"end_last_seq">>,1386289},
>>>>                                    {<<"recorded_seq">>,1386289},
>>>>                                    {<<"missing_checked">>,1551},
>>>>                                    {<<"missing_found">>,1550},
>>>>                                    {<<"docs_read">>,1550},
>>>>                                    {<<"docs_written">>,1550},
>>>>                                    {<<"doc_write_failures">>,0}]},
>>>>                                  {[{<<"session_id">>,
>>>>                                     <<"d5123a3caf462794aaf5a47be1bb3b6e">>},
>>>>                                    {<<"start_time">>,
>>>>                                     <<"Wed, 30 May 2012 20:41:43 GMT">>},
>>>>                                    {<<"end_time">>,
>>>>                                     <<"Mon, 04 Jun 2012 02:37:33 GMT">>},
>>>>                                    {<<"start_last_seq">>,1372404},
>>>>                                    {<<"end_last_seq">>,1384738},
>>>>                                    {<<"recorded_seq">>,1384738},
>>>>                                    {<<"missing_checked">>,12334},
>>>>                                    {<<"missing_found">>,12333},
>>>>                                    {<<"docs_read">>,12333},
>>>>                                    {<<"docs_written">>,12333},
>>>>                                    {<<"doc_write_failures">>,0}]},
>>>>                                  {[{<<"session_id">>,
>>>>                                     <<"52a16e8832f70dc094f6fff5e9b7d75b">>},
>>>>                                    {<<"start_time">>,
>>>>                                     <<"Sun, 27 May 2012 23:36:41 GMT">>},
>>>>                                    {<<"end_time">>,
>>>>                                     <<"Wed, 30 May 2012 20:40:14 GMT">>},
>>>>                                    {<<"start_last_seq">>,1361049},
>>>>                                    {<<"end_last_seq">>,1372404},
>>>>                                    {<<"recorded_seq">>,1372404},
>>>>                                    {<<"missing_checked">>,11355},
>>>>                                    {<<"missing_found">>,11355},
>>>>                                    {<<"docs_read">>,11355},
>>>>                                    {<<"docs_written">>,11355},
>>>>                                    {<<"doc_write_failures">>,0}]},
>>>> [...lots of these...]
>>>> 
>>>>                              [],false,[]},
>>>>                          #Ref<0.0.15.159973>}],
>>>>                        false,false}
>>>> ** When Server state == {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,
>>>>                         <0.290.0>,<0.286.0>,<0.367.0>,
>>>>                         {db_header,6,992456,0,
>>>>                             {943280145,{744250,975,647546641},60017672},
>>>>                             {943282327,745225,42485979},
>>>>                             {943267963,[],5753},
>>>>                             0,nil,nil,1000},
>>>>                         992456,
>>>>                         {btree,<0.286.0>,
>>>>                             {943280145,{744250,975,647546641},60017672},
>>>>                             #Fun<couch_db_updater.10.57960608>,
>>>>                             #Fun<couch_db_updater.11.57960608>,
>>>>                             #Fun<couch_btree.5.133731799>,
>>>>                             #Fun<couch_db_updater.12.57960608>,snappy},
>>>>                         {btree,<0.286.0>,
>>>>                             {943282327,745225,42485979},
>>>>                             #Fun<couch_db_updater.13.57960608>,
>>>>                             #Fun<couch_db_updater.14.57960608>,
>>>>                             #Fun<couch_btree.5.133731799>,
>>>>                             #Fun<couch_db_updater.15.57960608>,snappy},
>>>>                         {btree,<0.286.0>,
>>>>                             {943267963,[],5753},
>>>>                             #Fun<couch_btree.3.133731799>,
>>>>                             #Fun<couch_btree.4.133731799>,
>>>>                             #Fun<couch_btree.5.133731799>,nil,snappy},
>>>>                         992456,<<"cbstats">>,
>>>>                         "/Volumes/terror/db/couchdb/cbstats.couch",[],[],
>>>>                         nil,
>>>>                         {user_ctx,null,[],undefined},
>>>>                         nil,1000,
>>>>                         [before_header,after_header,on_file_open],
>>>>                         [{user_ctx,
>>>>                              {user_ctx,null,[<<"_admin">>],undefined}}],
>>>>                         snappy,nil,nil}
>>>> ** Reason for termination ==
>>>> ** {timeout,
>>>>    {gen_server,call,
>>>>        [<0.288.0>,
>>>>         {db_updated,
>>>>             {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,<0.290.0>,
>>>>                 <0.286.0>,<0.367.0>,
>>>>                 {db_header,6,992456,0,
>>>>                     {943280145,{744250,975,647546641},60017672},
>>>>                     {943282327,745225,42485979},
>>>>                     {943267963,[],5753},
>>>>                     0,nil,nil,1000},
>>>>                 992456,
>>>>                 {btree,<0.286.0>,
>>>>                     {943280145,{744250,975,647546641},60017672},
>>>>                     #Fun<couch_db_updater.10.57960608>,
>>>>                     #Fun<couch_db_updater.11.57960608>,
>>>>                     #Fun<couch_btree.5.133731799>,
>>>>                     #Fun<couch_db_updater.12.57960608>,snappy},
>>>>                 {btree,<0.286.0>,
>>>>                     {943282327,745225,42485979},
>>>>                     #Fun<couch_db_updater.13.57960608>,
>>>>                     #Fun<couch_db_updater.14.57960608>,
>>>>                     #Fun<couch_btree.5.133731799>,
>>>>                     #Fun<couch_db_updater.15.57960608>,snappy},
>>>>                 {btree,<0.286.0>,
>>>>                     {943284347,[],5756},
>>>>                     #Fun<couch_btree.3.133731799>,
>>>>                     #Fun<couch_btree.4.133731799>,
>>>>                     #Fun<couch_btree.5.133731799>,nil,snappy},
>>>>                 992456,<<"cbstats">>,
>>>>                 "/Volumes/terror/db/couchdb/cbstats.couch",[],[],nil,
>>>>                 {user_ctx,null,[],undefined},
>>>>                 #Ref<0.0.15.160107>,1000,
>>>>                 [before_header,after_header,on_file_open],
>>>>                 [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}],
>>>>                 snappy,nil,nil}}]}}
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> dustin sallings
>>> 
>>> --
>>> dustin sallings
>> 
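
The duplicate-replication symptom in the active tasks listing quoted above can be checked mechanically. Below is a minimal sketch, not anything from CouchDB itself: it groups `_active_tasks` entries by `doc_id` and, for one replicator document, lists the tasks whose `replication_id` does not match the document's `_replication_id`. The function names are hypothetical, the sample data is trimmed from the listing in this thread, and treating a non-matching task as the stale one is an assumption the thread does not confirm.

```python
from collections import defaultdict

# Trimmed copies of the _active_tasks entries quoted in this thread
# (only the fields this sketch needs).
active_tasks = [
    {"type": "replication", "doc_id": "cbstats-from-dogbowl",
     "pid": "<0.10466.12>",
     "replication_id": "50daecd0a29f4b7e5d102990831f3d64+continuous"},
    {"type": "replication", "doc_id": "cbstats-from-dogbowl",
     "pid": "<0.11019.12>",
     "replication_id": "411e341d5aa9a3fe636cf4ea8ba71720+continuous"},
    {"type": "replication", "doc_id": "gerrit-from-prod",
     "pid": "<0.11086.12>",
     "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous"},
    {"type": "replication", "doc_id": "gerrit-from-prod",
     "pid": "<0.11107.12>",
     "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous"},
]

# Trimmed copy of the replicator document quoted in this thread.
replicator_doc = {
    "_id": "gerrit-from-prod",
    "_replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9",
}


def find_duplicate_replications(tasks):
    """Return {doc_id: [task, ...]} for doc_ids with more than one replication task."""
    by_doc = defaultdict(list)
    for task in tasks:
        if task.get("type") == "replication" and task.get("doc_id"):
            by_doc[task["doc_id"]].append(task)
    return {doc_id: group for doc_id, group in by_doc.items() if len(group) > 1}


def base_replication_id(task):
    """Drop the option suffix (e.g. '+continuous') from a task's replication_id."""
    return task["replication_id"].split("+", 1)[0]


def non_matching_tasks(tasks, doc):
    """Tasks whose replication id differs from the document's _replication_id.

    Assumption: such a task is a leftover from an earlier trigger of the
    same document; the thread does not confirm this interpretation.
    """
    current = doc.get("_replication_id")
    return [t for t in tasks if base_replication_id(t) != current]


if __name__ == "__main__":
    duplicates = find_duplicate_replications(active_tasks)
    for doc_id, group in sorted(duplicates.items()):
        print(doc_id, [t["pid"] for t in group])
    suspects = non_matching_tasks(duplicates[replicator_doc["_id"]], replicator_doc)
    print("non-matching:", [t["pid"] for t in suspects])
```

On a live server the task list would come from a GET on `/_active_tasks` and the document from the `_replicator` database; the sketch keeps both inline so it runs without a server.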

