incubator-couchdb-user mailing list archives

From Chris Stockton <chrisstockto...@gmail.com>
Subject Re: CouchDB Replication lacking resilience for many database
Date Tue, 11 Oct 2011 04:03:00 GMT
Hello,

On Mon, Oct 10, 2011 at 5:18 PM, Adam Kocoloski <kocolosk@apache.org> wrote:
> On Oct 10, 2011, at 8:02 PM, Chris Stockton wrote:
>
>> Hello,
>>
>> On Mon, Oct 10, 2011 at 4:19 PM, Filipe David Manana
>> <fdmanana@apache.org> wrote:
>>> On Tue, Oct 11, 2011 at 12:03 AM, Chris Stockton
>>> <chrisstocktonaz@gmail.com> wrote:
>>> Chris,
>>>
>>> That said work is in the '1.2.x' branch (and master).
>>> CouchDB recently migrated from SVN to Git, see:
>>> http://couchdb.apache.org/community/code.html
>>>
>>
>> Thank you very much for the response Filipe, do you possibly have any
>> documentation or more detailed summary on what these changes include
>> and possible benefits of them? I would love to hear about any tweaking
>> or replication tips you may have for our growth issues, perhaps you
>> could answer a basic question if nothing else: Do the changes in this
>> branch minimize the performance impact of continuous replication on
>> many databases?
>>
>> Regardless I plan on getting a build of that branch and doing some
>> testing of my own very soon.
>>
>> Thank you!
>>
>> -Chris
>
> I'm pretty sure that even in 1.2.x and master each replication with a
> remote source still requires one dedicated TCP connection to consume the
> _changes feed.  Replications with a local source have always been able to
> use a connection pool per host:port combination.  That's not to downplay
> the significance of the rewrite of the replicator in 1.2.x; Filipe put
> quite a lot of time into it.
>
> The link to "those darn errors" just pointed to the mbox browser for
> September 2011.  Do you have a more specific link?  Regards,
>
> Adam

Well, I will remain optimistic that the rewrite has solved several of
my issues regardless. I guess the idle TCP connections by themselves
are not too bad; I think the issue comes when they all start to work
simultaneously =)

Sorry Adam, here is a better link:
http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3CCALKFbxuugLJJY-NH46U0u584L+XDqM3NGSpeNxsJyrxosPEuCg@mail.gmail.com%3E
The actual text was:

---------------

It seems that I am randomly getting crash errors while our replicator
runs. All this replicator does is make sure that all databases on the
master server replicate to our failover by checking status.

Details:
  - I notice the below error in the logs, anywhere from 0 to 30 at a time.
  - It seems that a database might start replicating okay and then stop.
  - These errors [1] are on the failover pulling from master.
  - No errors are displayed on the master server.
  - The databases in the URL in the db_not_found portion of the error
are always reachable with curl from the failover machine, which makes
the error strange: somehow the replicator thinks it can't find the
database.
  - Master seems healthy at all times: all databases are available and
there are no errors in its log.

[1] --
  [Mon, 12 Sep 2011 18:34:14 GMT] [error] [<0.22466.5305>]
{error_report,<0.30.0>,
                          {<0.22466.5305>,crash_report,
                           [[{initial_call,{couch_rep,init,['Argument__1']}},
                             {pid,<0.22466.5305>},
                             {registered_name,[]},
                             {error_info,
                              {exit,
                               {db_not_found,
                                <<"http://user:pass@server:5984/db_10944/">>},
                               [{gen_server,init_it,6},
                                {proc_lib,init_p_do_apply,3}]}},
                             {ancestors,
                              [couch_rep_sup,couch_primary_services,
                               couch_server_sup,<0.31.0>]},
                             {messages,[]},
                             {links,[<0.81.0>]},
                             {dictionary,[]},
                             {trap_exit,true},
                             {status,running},
                             {heap_size,2584},
                             {stack_size,24},
                             {reductions,794}],
                            []]}}
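
For readers following along, the kind of "make sure every master database
replicates to the failover" check described above might be sketched roughly
as follows. This is only a minimal illustration, not the script used in this
thread: the helper names (replication_body, missing_replications) are
hypothetical, and skipping databases whose names start with "_" is an
assumption about how system databases would be handled. The request body
uses the source/target/continuous fields accepted by CouchDB's POST
/_replicate endpoint.

```python
from urllib.parse import quote


def replication_body(master_url, db_name, continuous=True):
    """Build a JSON-serializable body for POST /_replicate on the failover.

    The failover pulls from the master, so the source is a remote URL and
    the target is the local database name.
    """
    # Database names may contain "/", so percent-encode the whole name.
    source = master_url.rstrip("/") + "/" + quote(db_name, safe="")
    return {"source": source, "target": db_name, "continuous": continuous}


def missing_replications(master_dbs, replicating_dbs):
    """Return master databases that have no running replication, sorted.

    Assumption: names starting with "_" are system databases and are skipped.
    """
    already_running = set(replicating_dbs)
    return sorted(
        db for db in master_dbs
        if db not in already_running and not db.startswith("_")
    )


def plan(master_url, master_dbs, replicating_dbs):
    """List the request bodies needed to restart every missing replication."""
    return [
        replication_body(master_url, db)
        for db in missing_replications(master_dbs, replicating_dbs)
    ]
```

In such a setup, master_dbs would typically come from GET /_all_dbs on the
master and each returned body would be POSTed to /_replicate on the
failover. How the set of currently replicating databases is derived from
/_active_tasks is deliberately left out, since that output differs between
CouchDB versions.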
