couchdb-dev mailing list archives

From Robert Newson <robert.new...@gmail.com>
Subject Re: replication problems
Date Wed, 10 Oct 2012 20:34:58 GMT
Jan,

Flag that as fix-for 1.3? I don't have my creds on my phone to do it.

I like the ini uuid idea best, modelled after the cookie with secret.
If we have the uuid, we'd omit host name as well as port, right?
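
To make the idea concrete, here is a rough sketch, not CouchDB's actual code. It assumes the replication id is a hash over some server identity plus source and target. The function name `replication_id`, the field layout, and the choice of MD5 (picked only because the ids in this thread are 32 hex characters) are all assumptions for illustration.

```python
import hashlib


def replication_id(server_uuid: str, source: str, target: str,
                   continuous: bool = False) -> str:
    """Hypothetical: derive a replication id from a per-server UUID
    (as it might be stored in the ini file) instead of host name and port."""
    # Join with NUL so the field boundaries cannot be confused.
    material = "\0".join([server_uuid, source, target]).encode("utf-8")
    base = hashlib.md5(material).hexdigest()
    return base + ("+continuous" if continuous else "")


if __name__ == "__main__":
    # The UUID below is a made-up placeholder.
    uuid_from_ini = "0f3c2a9e5b7d4c1e8a6b2d4f6e8c0a1b"
    print(replication_id(uuid_from_ini,
                         "http://dustinphoto.iriscouch.com/gerrit",
                         "gerrit", continuous=True))
```

Under these assumptions the id depends only on the UUID, the source and the target, so renaming the host or moving to another port would not change it.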

Sent from the ocean floor

On 10 Oct 2012, at 21:12, Jan Lehnardt <jan@apache.org> wrote:

> Filipe tells me this is https://issues.apache.org/jira/browse/COUCHDB-1259
>
> Cheers
> Jan
> --
>
> On Oct 4, 2012, at 02:28 , Dustin Sallings <dustin@spy.net> wrote:
>
>>
>>    I'm bringing this back up as requested.  I'm currently in both the "not
>> replicating interesting things" and the "has duplicate replications" states at
>> once.  I think the logs below show the "not replicating" part.
>>
>>    Active tasks shows the other (these are based on replicator DB documents;
>> example below):
>>
>> [
>>   {
>>       "checkpointed_source_seq": 2022317,
>>       "continuous": true,
>>       "doc_id": "cbstats-from-dogbowl",
>>       "doc_write_failures": 0,
>>       "docs_read": 300,
>>       "docs_written": 300,
>>       "missing_revisions_found": 300,
>>       "pid": "<0.10466.12>",
>>       "progress": 100,
>>       "replication_id": "50daecd0a29f4b7e5d102990831f3d64+continuous",
>>       "revisions_checked": 304,
>>       "source": "http://dustin:*****@single.couchbase.net/cbstats/",
>>       "source_seq": 2022317,
>>       "started_on": 1349309457,
>>       "target": "cbstats",
>>       "type": "replication",
>>       "updated_on": 1349310442
>>   },
>>   {
>>       "checkpointed_source_seq": 2022317,
>>       "continuous": true,
>>       "doc_id": "cbstats-from-dogbowl",
>>       "doc_write_failures": 0,
>>       "docs_read": 62,
>>       "docs_written": 62,
>>       "missing_revisions_found": 62,
>>       "pid": "<0.11019.12>",
>>       "progress": 100,
>>       "replication_id": "411e341d5aa9a3fe636cf4ea8ba71720+continuous",
>>       "revisions_checked": 304,
>>       "source": "http://dustin:*****@single.couchbase.net/cbstats/",
>>       "source_seq": 2022317,
>>       "started_on": 1349309471,
>>       "target": "cbstats",
>>       "type": "replication",
>>       "updated_on": 1349310443
>>   },
>>   {
>>       "checkpointed_source_seq": 107068,
>>       "continuous": true,
>>       "doc_id": "gerrit-from-prod",
>>       "doc_write_failures": 0,
>>       "docs_read": 22,
>>       "docs_written": 22,
>>       "missing_revisions_found": 22,
>>       "pid": "<0.11086.12>",
>>       "progress": 100,
>>       "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous",
>>       "revisions_checked": 26,
>>       "source": "http://dustinphoto.iriscouch.com/gerrit/",
>>       "source_seq": 107068,
>>       "started_on": 1349309487,
>>       "target": "gerrit",
>>       "type": "replication",
>>       "updated_on": 1349310445
>>   },
>>   {
>>       "checkpointed_source_seq": 107068,
>>       "continuous": true,
>>       "doc_id": "gerrit-from-prod",
>>       "doc_write_failures": 0,
>>       "docs_read": 17,
>>       "docs_written": 17,
>>       "missing_revisions_found": 17,
>>       "pid": "<0.11107.12>",
>>       "progress": 100,
>>       "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous",
>>       "revisions_checked": 26,
>>       "source": "http://dustinphoto.iriscouch.com/gerrit/",
>>       "source_seq": 107068,
>>       "started_on": 1349309488,
>>       "target": "gerrit",
>>       "type": "replication",
>>       "updated_on": 1349310445
>>   }
>> ]
>>
>>
>>    The replicator document for the latter, for example, is this:
>>
>> {
>>  "_id": "gerrit-from-prod",
>>  "_rev": "2235-36de10fb757581a1782dacbb26ee4809",
>>  "source": "http://dustinphoto.iriscouch.com/gerrit",
>>  "target": "gerrit",
>>  "continuous": true,
>>  "user_ctx": {
>>      "roles": [
>>          "_admin"
>>      ]
>>  },
>>  "_replication_state_time": "2012-10-03T17:11:27-07:00",
>>  "_replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9",
>>  "_replication_state": "triggered"
>> }
>>
>>
>> Begin forwarded message:
>>
>>> From: Dustin Sallings <dustin@spy.net>
>>> Subject: Re: replication problems
>>> Date: June 15, 2012 0:10:04 PDT
>>> To: dev@couchdb.apache.org
>>> Reply-To: dev@couchdb.apache.org
>>>
>>>
>>> On Jun 14, 2012, at 11:28 PM, Benoit Chesneau wrote:
>>>
>>>> Are you using _replicate or _replicator? Anything interesting in the logs?
>>>
>>>
>>>    I'm using _replicator (wonderful feature, I just kill the DB and everything
>>> goes back the way I want it).
>>>
>>>    Hmm...  I do think I found some stuff digging through the logs.  This is the
>>> local DB I noticed not doing its thing, although there were tons of errors all around this.
>>> Looks like the server got into some kind of bad state and sort of half-crashed.
>>>
>>>
>>> [Thu, 14 Jun 2012 23:20:12 GMT] [error] [<0.133.0>] Replication `ae601df0373da82d1b4a9ff741c8ba18+continuous` (`rpics` -> `rpics-processed`) failed: {{timeout,{gen_server,call,[<0.213.0>,{open_ref_count,<0.442.0>}]}},
>>> {gen_server,call,
>>>           [couch_server,
>>>            {open,<<"rpics">>,
>>>                  [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}]},
>>>            infinity]}}
>>> [Thu, 14 Jun 2012 23:20:25 GMT] [error] [<0.383.0>] ** Generic server <0.383.0> terminating
>>> ** Last message in was {'EXIT',<0.384.0>,
>>>                      {{timeout,
>>>                        {gen_server,call,
>>>                         [<0.213.0>,{open_ref_count,<0.442.0>}]}},
>>>                       {gen_server,call,
>>>                        [couch_server,
>>>                         {open,<<"cbstats">>,
>>>                          [{user_ctx,
>>>                            {user_ctx,null,[<<"_admin">>],undefined}},
>>>                           {user_ctx,
>>>                            {user_ctx,null,[<<"_admin">>],undefined}}]},
>>>                         infinity]}}}
>>>
>>> ** When Server state == {state,<0.272.0>,<0.384.0>,20,
>>>                       {httpdb,
>>>                        "http://dustin:LOGGED_PASSWORD@single.couchbase.net/cbstats/",
>>>                        nil,
>>>                        [{"Accept","application/json"},
>>>                         {"User-Agent","CouchDB/1.2.0"}],
>>>                        30000,
>>>                        [{socket_options,
>>>                          [{keepalive,true},{nodelay,false}]}],
>>>                        10,250,<0.273.0>,20},
>>>                       {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,
>>>                        <0.290.0>,<0.286.0>,<0.367.0>,
>>>                        {db_header,6,984356,0,
>>>                         {860345646,{737369,975,640891414},59433736},
>>>                         {860348005,738344,42056446},
>>>                         {860352635,[],5737},
>>>                         0,nil,nil,1000},
>>>                        984356,
>>>                        {btree,<0.286.0>,
>>>                         {860345646,{737369,975,640891414},59433736},
>>>                         #Fun<couch_db_updater.10.57960608>,
>>>                         #Fun<couch_db_updater.11.57960608>,
>>>                         #Fun<couch_btree.5.133731799>,
>>>                         #Fun<couch_db_updater.12.57960608>,snappy},
>>>                        {btree,<0.286.0>,
>>>                         {860348005,738344,42056446},
>>>                         #Fun<couch_db_updater.13.57960608>,
>>>                         #Fun<couch_db_updater.14.57960608>,
>>>                         #Fun<couch_btree.5.133731799>,
>>>                         #Fun<couch_db_updater.15.57960608>,snappy},
>>>                        {btree,<0.286.0>,
>>>                         {860352635,[],5737},
>>>                         #Fun<couch_btree.3.133731799>,
>>>                         #Fun<couch_btree.4.133731799>,
>>>                         #Fun<couch_btree.5.133731799>,nil,snappy},
>>>                        984356,<<"cbstats">>,
>>>                        "/Volumes/terror/db/couchdb/cbstats.couch",[],[],
>>>                        nil,
>>>                        {user_ctx,null,[<<"_admin">>],undefined},
>>>                        nil,1000,
>>>                        [before_header,after_header,on_file_open],
>>>                        [{user_ctx,
>>>                          {user_ctx,null,[<<"_admin">>],undefined}}],
>>>                        snappy,nil,nil},
>>>                       [],nil,nil,nil,
>>>                       {rep_stats,0,0,0,0,0},
>>>                       nil,<0.385.0>,
>>>                       {batch,[],0}}
>>> ** Reason for termination ==
>>> ** {noproc,{gen_server,call,[<0.367.0>,{drop,<0.383.0>},infinity]}}
>>>
>>>
>>>
>>>
>>>    Scrolling to the beginning of the errors, I find this:
>>>
>>>
>>> [Thu, 14 Jun 2012 23:15:54 GMT] [error] [<0.164.0>] Replication `543f76281e8d52d6ce5b51fddf0588e7+continuous` (`photo` -> `http://dustin:*****@dustinphoto.couchone.com/photo/`) failed: source_db_down
>>> [Thu, 14 Jun 2012 23:18:57 GMT] [info] [<0.358.0>] 127.0.0.1 - - GET /_all_dbs 200
>>> [Thu, 14 Jun 2012 23:19:52 GMT] [error] [<0.289.0>] ** Generic server <0.289.0> terminating
>>> ** Last message in was {update_docs,<0.272.0>,[],
>>>                         [{{doc,
>>>                               <<"_local/c4cc070f896d7267e52ba012856fed4b">>,
>>>                               {0,[<<"346185">>]},
>>>                               {[{<<"session_id">>,
>>>                                  <<"9fb3475683d44bb1e151031dd42cc59f">>},
>>>                                 {<<"source_last_seq">>,1419004},
>>>                                 {<<"replication_id_version">>,2},
>>>                                 {<<"history">>,
>>>                                  [{[{<<"session_id">>,
>>>                                      <<"9fb3475683d44bb1e151031dd42cc59f">>},
>>>                                     {<<"start_time">>,
>>>                                      <<"Thu, 14 Jun 2012 01:35:02 GMT">>},
>>>                                     {<<"end_time">>,
>>>                                      <<"Thu, 14 Jun 2012 23:15:29 GMT">>},
>>>                                     {<<"start_last_seq">>,1410146},
>>>                                     {<<"end_last_seq">>,1419004},
>>>                                     {<<"recorded_seq">>,1419004},
>>>                                     {<<"missing_checked">>,8100},
>>>                                     {<<"missing_found">>,8100},
>>>                                     {<<"docs_read">>,8100},
>>>                                     {<<"docs_written">>,8100},
>>>                                     {<<"doc_write_failures">>,0}]},
>>>                                   {[{<<"session_id">>,
>>>                                      <<"3edd7c50327eab7ec0768451e34efa8b">>},
>>>                                     {<<"start_time">>,
>>>                                      <<"Tue, 12 Jun 2012 05:51:17 GMT">>},
>>>                                     {<<"end_time">>,
>>>                                      <<"Tue, 12 Jun 2012 13:02:37 GMT">>},
>>>                                     {<<"start_last_seq">>,1407186},
>>>                                     {<<"end_last_seq">>,1410146},
>>>                                     {<<"recorded_seq">>,1410146},
>>>                                     {<<"missing_checked">>,2583},
>>>                                     {<<"missing_found">>,2577},
>>>                                     {<<"docs_read">>,2577},
>>>                                     {<<"docs_written">>,2577},
>>>                                     {<<"doc_write_failures">>,0}]},
>>>                                   {[{<<"session_id">>,
>>>                                      <<"172de62044281a01b1584a9d099f42af">>},
>>>                                     {<<"start_time">>,
>>>                                      <<"Mon, 11 Jun 2012 03:40:11 GMT">>},
>>>                                     {<<"end_time">>,
>>>                                      <<"Mon, 11 Jun 2012 15:16:24 GMT">>},
>>>                                     {<<"start_last_seq">>,1405428},
>>>                                     {<<"end_last_seq">>,1407186},
>>>                                     {<<"recorded_seq">>,1407186},
>>>                                     {<<"missing_checked">>,1721},
>>>                                     {<<"missing_found">>,1721},
>>>                                     {<<"docs_read">>,1721},
>>>                                     {<<"docs_written">>,1721},
>>>                                     {<<"doc_write_failures">>,0}]},
>>>                                   {[{<<"session_id">>,
>>>                                      <<"e60a126a2036c5fab00a1249101820c8">>},
>>>                                     {<<"start_time">>,
>>>                                      <<"Sat, 09 Jun 2012 07:47:22 GMT">>},
>>>                                     {<<"end_time">>,
>>>                                      <<"Sun, 10 Jun 2012 21:16:20 GMT">>},
>>>                                     {<<"start_last_seq">>,1386289},
>>>                                     {<<"end_last_seq">>,1405428},
>>>                                     {<<"recorded_seq">>,1405428},
>>>                                     {<<"missing_checked">>,16977},
>>>                                     {<<"missing_found">>,16977},
>>>                                     {<<"docs_read">>,16977},
>>>                                     {<<"docs_written">>,16977},
>>>                                     {<<"doc_write_failures">>,0}]},
>>>                                   {[{<<"session_id">>,
>>>                                      <<"ef3e4333d340dcf73ddfa3fe8c720042">>},
>>>                                     {<<"start_time">>,
>>>                                      <<"Mon, 04 Jun 2012 02:39:44 GMT">>},
>>>                                     {<<"end_time">>,
>>>                                      <<"Mon, 04 Jun 2012 12:35:50 GMT">>},
>>>                                     {<<"start_last_seq">>,1384738},
>>>                                     {<<"end_last_seq">>,1386289},
>>>                                     {<<"recorded_seq">>,1386289},
>>>                                     {<<"missing_checked">>,1551},
>>>                                     {<<"missing_found">>,1550},
>>>                                     {<<"docs_read">>,1550},
>>>                                     {<<"docs_written">>,1550},
>>>                                     {<<"doc_write_failures">>,0}]},
>>>                                   {[{<<"session_id">>,
>>>                                      <<"d5123a3caf462794aaf5a47be1bb3b6e">>},
>>>                                     {<<"start_time">>,
>>>                                      <<"Wed, 30 May 2012 20:41:43 GMT">>},
>>>                                     {<<"end_time">>,
>>>                                      <<"Mon, 04 Jun 2012 02:37:33 GMT">>},
>>>                                     {<<"start_last_seq">>,1372404},
>>>                                     {<<"end_last_seq">>,1384738},
>>>                                     {<<"recorded_seq">>,1384738},
>>>                                     {<<"missing_checked">>,12334},
>>>                                     {<<"missing_found">>,12333},
>>>                                     {<<"docs_read">>,12333},
>>>                                     {<<"docs_written">>,12333},
>>>                                     {<<"doc_write_failures">>,0}]},
>>>                                   {[{<<"session_id">>,
>>>                                      <<"52a16e8832f70dc094f6fff5e9b7d75b">>},
>>>                                     {<<"start_time">>,
>>>                                      <<"Sun, 27 May 2012 23:36:41 GMT">>},
>>>                                     {<<"end_time">>,
>>>                                      <<"Wed, 30 May 2012 20:40:14 GMT">>},
>>>                                     {<<"start_last_seq">>,1361049},
>>>                                     {<<"end_last_seq">>,1372404},
>>>                                     {<<"recorded_seq">>,1372404},
>>>                                     {<<"missing_checked">>,11355},
>>>                                     {<<"missing_found">>,11355},
>>>                                     {<<"docs_read">>,11355},
>>>                                     {<<"docs_written">>,11355},
>>>                                     {<<"doc_write_failures">>,0}]},
>>> [...lots of these...]
>>>
>>>                               [],false,[]},
>>>                           #Ref<0.0.15.159973>}],
>>>                         false,false}
>>> ** When Server state == {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,
>>>                          <0.290.0>,<0.286.0>,<0.367.0>,
>>>                          {db_header,6,992456,0,
>>>                              {943280145,{744250,975,647546641},60017672},
>>>                              {943282327,745225,42485979},
>>>                              {943267963,[],5753},
>>>                              0,nil,nil,1000},
>>>                          992456,
>>>                          {btree,<0.286.0>,
>>>                              {943280145,{744250,975,647546641},60017672},
>>>                              #Fun<couch_db_updater.10.57960608>,
>>>                              #Fun<couch_db_updater.11.57960608>,
>>>                              #Fun<couch_btree.5.133731799>,
>>>                              #Fun<couch_db_updater.12.57960608>,snappy},
>>>                          {btree,<0.286.0>,
>>>                              {943282327,745225,42485979},
>>>                              #Fun<couch_db_updater.13.57960608>,
>>>                              #Fun<couch_db_updater.14.57960608>,
>>>                              #Fun<couch_btree.5.133731799>,
>>>                              #Fun<couch_db_updater.15.57960608>,snappy},
>>>                          {btree,<0.286.0>,
>>>                              {943267963,[],5753},
>>>                              #Fun<couch_btree.3.133731799>,
>>>                              #Fun<couch_btree.4.133731799>,
>>>                              #Fun<couch_btree.5.133731799>,nil,snappy},
>>>                          992456,<<"cbstats">>,
>>>                          "/Volumes/terror/db/couchdb/cbstats.couch",[],[],
>>>                          nil,
>>>                          {user_ctx,null,[],undefined},
>>>                          nil,1000,
>>>                          [before_header,after_header,on_file_open],
>>>                          [{user_ctx,
>>>                               {user_ctx,null,[<<"_admin">>],undefined}}],
>>>                          snappy,nil,nil}
>>> ** Reason for termination ==
>>> ** {timeout,
>>>     {gen_server,call,
>>>         [<0.288.0>,
>>>          {db_updated,
>>>              {db,<0.288.0>,<0.289.0>,nil,<<"1339637701848579">>,<0.290.0>,
>>>                  <0.286.0>,<0.367.0>,
>>>                  {db_header,6,992456,0,
>>>                      {943280145,{744250,975,647546641},60017672},
>>>                      {943282327,745225,42485979},
>>>                      {943267963,[],5753},
>>>                      0,nil,nil,1000},
>>>                  992456,
>>>                  {btree,<0.286.0>,
>>>                      {943280145,{744250,975,647546641},60017672},
>>>                      #Fun<couch_db_updater.10.57960608>,
>>>                      #Fun<couch_db_updater.11.57960608>,
>>>                      #Fun<couch_btree.5.133731799>,
>>>                      #Fun<couch_db_updater.12.57960608>,snappy},
>>>                  {btree,<0.286.0>,
>>>                      {943282327,745225,42485979},
>>>                      #Fun<couch_db_updater.13.57960608>,
>>>                      #Fun<couch_db_updater.14.57960608>,
>>>                      #Fun<couch_btree.5.133731799>,
>>>                      #Fun<couch_db_updater.15.57960608>,snappy},
>>>                  {btree,<0.286.0>,
>>>                      {943284347,[],5756},
>>>                      #Fun<couch_btree.3.133731799>,
>>>                      #Fun<couch_btree.4.133731799>,
>>>                      #Fun<couch_btree.5.133731799>,nil,snappy},
>>>                  992456,<<"cbstats">>,
>>>                  "/Volumes/terror/db/couchdb/cbstats.couch",[],[],nil,
>>>                  {user_ctx,null,[],undefined},
>>>                  #Ref<0.0.15.160107>,1000,
>>>                  [before_header,after_header,on_file_open],
>>>                  [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}],
>>>                  snappy,nil,nil}}]}}
>>>
>>>
>>>
>>>
>>> --
>>> dustin sallings
>>
>> --
>> dustin sallings
>
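
The "duplicate replications" symptom in the active tasks output quoted above can be spotted mechanically. Below is a minimal sketch, assuming the task list has already been fetched as JSON. The helper name `duplicate_replications` is hypothetical, and the sample data is a copy of the quoted output trimmed to the three fields the check needs.

```python
import json
from collections import defaultdict


def duplicate_replications(active_tasks):
    """Return the doc_ids that have more than one distinct replication_id
    among the running replication tasks."""
    ids_by_doc = defaultdict(set)
    for task in active_tasks:
        if task.get("type") != "replication":
            continue  # skip compaction, indexer and other task types
        doc_id = task.get("doc_id")
        if doc_id is None:
            continue  # task has no replicator document attached
        ids_by_doc[doc_id].add(task["replication_id"])
    return {doc: sorted(ids) for doc, ids in ids_by_doc.items() if len(ids) > 1}


# Trimmed copy of the active tasks quoted in this thread.
SAMPLE = json.loads("""
[
  {"type": "replication", "doc_id": "cbstats-from-dogbowl",
   "replication_id": "50daecd0a29f4b7e5d102990831f3d64+continuous"},
  {"type": "replication", "doc_id": "cbstats-from-dogbowl",
   "replication_id": "411e341d5aa9a3fe636cf4ea8ba71720+continuous"},
  {"type": "replication", "doc_id": "gerrit-from-prod",
   "replication_id": "4a21031dac0d81637a23c32bad620be9+continuous"},
  {"type": "replication", "doc_id": "gerrit-from-prod",
   "replication_id": "b4ad5d3f2e5b78670e4c8364b18000e9+continuous"}
]
""")

if __name__ == "__main__":
    for doc, ids in duplicate_replications(SAMPLE).items():
        print(doc, ids)
```

On this sample both replicator documents are reported, each with two ids. Note that the quoted `gerrit-from-prod` document records only `b4ad5d3f2e5b78670e4c8364b18000e9` as its `_replication_id`, so the other running task has no matching id in the document.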
