couchdb-dev mailing list archives

From "Dave Cottlehuber (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1856) hang and restart when replicate remotely of database that has doc > 10M, with 600kb/s network speed
Date Wed, 24 Jul 2013 07:27:49 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718072#comment-13718072
] 

Dave Cottlehuber commented on COUCHDB-1856:
-------------------------------------------

Hi [~zrao], you're running very close to the max resident memory limit of 32-bit processes
on Windows -- CouchDB on Windows is only a 32-bit release for the moment.

I don't see anything in the log file related to the actual hang / crash, and I'm not sure
this issue is really a replication one for the moment.

If your docs are large, and your throughput is < 0.6 MB/sec, then it's likely you are timing
out in per-doc replication on a slow/unreliable TCP link.
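
As a rough back-of-the-envelope check on that, here is a minimal sketch of how long one
large doc, and one batch of them, would occupy the link. The 10 MB doc size is my reading of
"doc > 10M" in the report, the link is assumed steady, and protocol overhead is ignored, so
treat the numbers as an illustration only.

```python
def transfer_seconds(doc_bytes: int, bytes_per_second: int) -> float:
    """Seconds needed to move one document body at a steady throughput."""
    if bytes_per_second <= 0:
        raise ValueError("throughput must be positive")
    return doc_bytes / bytes_per_second


DOC_BYTES = 10 * 1000 * 1000   # assumption: "doc > 10M" means roughly 10 MB per doc
LINK_BYTES_PER_S = 600 * 1000  # 0.6 MB/sec, as discussed above
BATCH_SIZE = 50                # assumption: docs per batch, pick your own value

per_doc = transfer_seconds(DOC_BYTES, LINK_BYTES_PER_S)
per_batch = per_doc * BATCH_SIZE

print(f"{per_doc:.1f} s per doc")
print(f"{per_batch / 60:.1f} min per batch of {BATCH_SIZE}")
```

Anything that expects a whole doc or batch to arrive within a few seconds is going to
struggle on a link like this.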

Can you provide more information here?

- have you altered the replication configuration at all?
- are you running any filtered replication processes at either end?
- if your replication works locally then I suspect you have an issue either with the internet,
or the remote endpoint
- more logs please, in debug mode from both ends
- any general info on your docs: JSON body size, number and size of attachments
- specific versions of Erlang + CouchDB used at both ends

For the moment, I'd suggest -

1. dropping your replication concurrency down in local.ini:

[replicator]
; one worker only
worker_processes = 1
; very small batch size to decrease replicator mem usage
worker_batch_size = 50
; use a 5 minute timeout for HTTP connection 
connection_timeout = 300000
; don't retry, fail immediately
retries_per_request = 1

You can change these through the Futon UI, which won't require a restart. Note that these
settings impact *all* replications, so take that into consideration.
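
If you'd rather script it than click through Futon, the same values can be set over the HTTP
config API (PUT /_config/<section>/<key> with a JSON string body in CouchDB 1.x). A minimal
sketch, assuming a local node on the default port with no admin credentials required;
BASE_URL and the helper names are made up for illustration:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5984"  # assumption: local CouchDB on the default port

# The [replicator] values suggested above; the config API takes them as strings.
REPLICATOR_SETTINGS = {
    "worker_processes": "1",
    "worker_batch_size": "50",
    "connection_timeout": "300000",
    "retries_per_request": "1",
}


def build_config_requests(base_url, section, settings):
    """Build one PUT request per key without sending anything."""
    requests = []
    for key, value in settings.items():
        url = f"{base_url}/_config/{section}/{key}"
        body = json.dumps(value).encode("utf-8")  # body is a JSON string, e.g. "1"
        req = urllib.request.Request(url, data=body, method="PUT")
        req.add_header("Content-Type", "application/json")
        requests.append(req)
    return requests


def apply_config(requests):
    """Send the requests; CouchDB replies with the previous value of each key."""
    for req in requests:
        with urllib.request.urlopen(req) as resp:
            previous = json.loads(resp.read().decode("utf-8"))
            print(req.full_url, "previous value:", previous)
```

To apply, call apply_config(build_config_requests(BASE_URL, "replicator", REPLICATOR_SETTINGS)).
Keep a note of the previous values it prints so you can put them back later. If the server has
admins configured you will need to add credentials to the requests.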

2. more logs

- enable debug mode on both ends, assuming that's feasible
- reset the replication changes above to previous settings
- retry the replication
- disable debug mode
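
The enable/retry/disable sequence above could be wrapped up roughly like this. It is a sketch
only: put_config is a hypothetical callable you supply that writes one config value (for
example via PUT /_config/log/level) and returns the previous one, and you would need to do
the same on both ends.

```python
from contextlib import contextmanager


@contextmanager
def debug_logging(put_config):
    """Switch [log] level to debug, then restore the old level afterwards.

    put_config(section, key, value) is a hypothetical helper that sets one
    config value and returns the value it replaced.
    """
    previous = put_config("log", "level", "debug")
    try:
        yield previous
    finally:
        # Always restore, even if the replication attempt raises.
        put_config("log", "level", previous)


# Stand-in for a real server so the flow can be tried without one.
class FakeConfig:
    def __init__(self):
        self.values = {("log", "level"): "info"}
        self.calls = []

    def put(self, section, key, value):
        self.calls.append((section, key, value))
        old = self.values[(section, key)]
        self.values[(section, key)] = value
        return old


fake = FakeConfig()
with debug_logging(fake.put):
    level_during = fake.values[("log", "level")]  # retry the replication here
level_after = fake.values[("log", "level")]
```

Debug logs grow quickly, hence restoring the level in the finally branch.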

I'll not have time to look further until late next week likely BTW.
                
> hang and restart when replicate remotely of database that has doc > 10M, with 600kb/s
network speed
> ---------------------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-1856
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1856
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 1.3, 1.3.1
>            Reporter: Zhiqing Rao
>         Attachments: couch_replicate_hang_erl_process.jpg, couch_replicate_hang.log,
couch_replicate_hang_networking.jpg
>
>
> When I remotely replicate a database that has doc > 10M, with 600kb/s network speed,
> on a Win7 64-bit platform with CouchDB 1.3.1, the CouchDB server launches the replication,
> but erl.exe soon reaches > 2GB of committed memory, and then
> the CouchDB server hangs until a restart.
> Two things that might be helpful:
> 1) It's fine for me to replicate the database within the same CouchDB server (on the same
> machine);
> 2) My web browsers (IE9/10, Chrome, Firefox) also sometimes hang or stop responding
> when I open the documents in the database by URL.

