couchdb-dev mailing list archives

From "Randall Leeds (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-823) _active_tasks dies often and doesn't accurately report replication status.
Date Sun, 11 Jul 2010 23:14:49 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887249#action_12887249 ]

Randall Leeds commented on COUCHDB-823:
---------------------------------------

A continuous replication task shows "starting..." until it transfers its first document.
If your databases are already in sync, it stays there until the first new write. Since
your replications die often and restart automatically, it could be that they are usually up
to date when the task is started.

Are your replications behind? Are there documents on the source that haven't made it to the
target even while it still says "starting..."? If so, there are two bugs: replications dying
and replications hanging. It would be useful to know which of these is happening.
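
One rough way to tell the two cases apart is sketched below in Python. It is only an
illustration, not something CouchDB ships. It assumes you have already fetched and parsed the
JSON from GET /_active_tasks and from GET /<db> on both source and target, and that the
entries carry "type", "task" and "status" fields and the database info carries "doc_count".
The function names and sample values are made up. Comparing doc_count is a crude check: it
will not notice updates to existing documents, and update_seq values are not comparable
across databases.

```python
def starting_replications(active_tasks):
    """Return replication tasks whose status still reads "starting...".

    `active_tasks` is the parsed JSON list from GET /_active_tasks
    (field names are assumed, see the note above).
    """
    return [
        task for task in active_tasks
        if task.get("type") == "Replication"
        and str(task.get("status", "")).startswith("starting")
    ]


def diagnose(task, source_info, target_info):
    """Guess which case applies to one replication task.

    `source_info` and `target_info` are the parsed JSON from GET /<db>.
    A task that says "starting..." while the source holds more documents
    than the target may be hung; otherwise it is probably just idle.
    """
    if not str(task.get("status", "")).startswith("starting"):
        return "running"
    behind = source_info["doc_count"] > target_info["doc_count"]
    return "possibly hung" if behind else "idle, in sync"


if __name__ == "__main__":
    # Hypothetical sample data, not real server output.
    sample_tasks = [
        {"type": "Replication", "task": "db1", "status": "starting..."},
        {"type": "Replication", "task": "db2", "status": "W ..."},
    ]
    for task in starting_replications(sample_tasks):
        print(task["task"], diagnose(task, {"doc_count": 120}, {"doc_count": 100}))
```

If a task reports "possibly hung" repeatedly while the source keeps receiving writes, that
points at the hanging case rather than at replications that are merely idle.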

Also, any log output from the time a replication died would be helpful in diagnosing the
issue. Thanks!

> _active_tasks dies often and doesn't accurately report replication status.
> --------------------------------------------------------------------------
>
>                 Key: COUCHDB-823
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-823
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.11
>         Environment: 1. pull replication
> 2. password protected nginx SSL reverse-proxied source database
> 3. A replication restart watchdog has been put in place in case a replication dies
> 4. we use centos-5.4 and couchdb-0.11-dist built from source
> 5. the inter-datacenter connection is slow, with high latency (80ms~150ms) during rush hour.
> PS: the nginx configuration:
> server {
>     listen 5984;
>     server_name xxx.xxx.xxx.xxx;
>     client_max_body_size 10M;
>     ssl on;
>     ssl_certificate /usr/local/nginx/conf/cert/cert.pem;
>     ssl_certificate_key /usr/local/nginx/conf/cert/cert.key;
>     ssl_protocols SSLv3;
>     ssl_session_cache shared:SSL:1m;
>     location / {
>         auth_basic "Restricted";
>         auth_basic_user_file /usr/local/nginx/conf/htpasswd;
>         proxy_pass http://couchdb;
>         proxy_redirect off;
>         proxy_set_header Host $host;
>         proxy_set_header X-Real-IP $remote_addr;
>         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
>         proxy_set_header Authorization "";
>         proxy_set_header X-Forwarded-Ssl on;
>     }
>     location ~ ^/(.*)/_changes {
>         auth_basic "Restricted";
>         auth_basic_user_file /usr/local/nginx/conf/htpasswd;
>         proxy_pass http://couchdb;
>         proxy_redirect off;
>         proxy_buffering off;
>         proxy_set_header Host $host;
>         proxy_set_header X-Real-IP $remote_addr;
>         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
>         proxy_set_header Authorization "";
>         proxy_set_header X-Forwarded-Ssl on;
>     }
> }
>            Reporter: can xiang
>
> 1. continuous replication dies often, as _active_tasks shows
> 2. _active_tasks often shows "starting..." instead of "W ..." or "MR...". I have 8 databases
> to replicate and I often see only two replications marked as "W ..." or "MR..."; the others
> are marked as "starting...".
> While a task was marked as "starting...", I observed actual replication happening on the
> relevant database.
> I don't know whether this is a bug or a problem with my configuration. I tried to send
> an email to the user list several times, but every attempt was rejected as spam (I tried
> with another email address too).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

