couchdb-dev mailing list archives

From: Adam Kocoloski <adam.kocolo...@gmail.com>
Subject: Re: [jira] Updated: (COUCHDB-160) replication performance improvements
Date: Wed, 12 Nov 2008 02:54:58 GMT
Hi Damien, the write queue will never be larger than 100 documents in  
this code.  I think the primary constraint isn't the number of  
documents in the database but the size of the average document.  I'm  
buffering 100 at a time without considering the size of each record,  
so a DB with lots of large attachments could run into memory problems  
quickly.
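
(A side thought: if we wanted to bound memory as well as count, the buffer
could flush whenever either the number of queued docs or their approximate
encoded size crosses a limit.  A rough sketch of the idea follows; it is a
standalone illustration, not code from the attached diff, and the record,
the term_to_binary size estimate, and the limits are all invented.)

-module(batch_buffer).
-export([new/2, add/2]).

%% Flush when either the number of queued docs or their approximate size
%% in bytes crosses a limit, instead of counting docs alone.  The caller
%% hands each flushed batch to _bulk_docs.
-record(buf, {docs = [], count = 0, bytes = 0, max_count, max_bytes}).

new(MaxCount, MaxBytes) ->
    #buf{max_count = MaxCount, max_bytes = MaxBytes}.

%% Returns {flush, Batch, EmptyBuf} when a limit is hit, {ok, Buf} otherwise.
add(Doc, #buf{docs = Docs, count = C, bytes = B,
              max_count = MaxC, max_bytes = MaxB} = Buf) ->
    Size = byte_size(term_to_binary(Doc)),  %% rough per-doc size estimate
    Buf2 = Buf#buf{docs = [Doc | Docs], count = C + 1, bytes = B + Size},
    case Buf2#buf.count >= MaxC orelse Buf2#buf.bytes >= MaxB of
        true  -> {flush, lists:reverse(Buf2#buf.docs), new(MaxC, MaxB)};
        false -> {ok, Buf2}
    end.

With something like batch_buffer:new(100, 10*1024*1024) the behavior for
small docs stays the same as the current 100-doc batches, while a handful
of multi-megabyte attachments would trigger an early flush.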

I tried to do some testing with large (>1MB) binary attachments this  
afternoon and ran into a lot of stability issues at high concurrency.   
Quite a few of the replicator's HTTP processes (both GET and POST)  
died gruesome deaths with session_remotly_closed errors.  I found this  
thread on trapexit which looked to be related:

http://www.trapexit.org/forum/viewtopic.php?p=44020

but nothing definitive.  I'll keep digging.  Best, Adam


On Nov 11, 2008, at 3:56 PM, Damien Katz wrote:

> Wow. Very cool!
>
> One thing that was a problem in the past when attempting to make
> things more parallel was process queues getting backed up. On large
> replications, if the write target was slow, the readers would be much
> faster than the writers, the write queue would get huge, and Erlang
> VM memory usage could skyrocket, slowing everything else down and
> sometimes crashing. As a temporary fix, the current replicator only
> keeps a single doc queued up at a time.
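
(Interjecting for the archives: the general shape of the guard Damien
describes here, i.e. bound the write queue and make the readers block on
the writer, could look something like the sketch below.  This is not code
from the attached diff; the module name, the API, and the timeout are all
invented.)

-module(bounded_write_queue).
-export([start_link/2, enqueue/2]).

%% Writer process that buffers at most Max docs.  enqueue/2 is synchronous,
%% so readers that get ahead block on the ack instead of flooding the
%% writer's mailbox and driving VM memory usage up.
start_link(FlushFun, Max) ->
    spawn_link(fun() -> loop(FlushFun, Max, []) end).

enqueue(Writer, Doc) ->
    Ref = erlang:monitor(process, Writer),
    Writer ! {doc, self(), Ref, Doc},
    receive
        {ack, Ref} ->
            erlang:demonitor(Ref, [flush]),
            ok;
        {'DOWN', Ref, process, Writer, Reason} ->
            exit(Reason)
    end.

loop(FlushFun, Max, Buffer) when length(Buffer) >= Max ->
    FlushFun(lists:reverse(Buffer)),     %% e.g. one _bulk_docs POST
    loop(FlushFun, Max, []);
loop(FlushFun, Max, Buffer) ->
    receive
        {doc, From, Ref, Doc} ->
            From ! {ack, Ref},
            loop(FlushFun, Max, [Doc | Buffer])
    after 1000 ->
        %% flush a partial batch if no new docs arrive for a second
        case Buffer of
            []    -> loop(FlushFun, Max, []);
            _Docs -> FlushFun(lists:reverse(Buffer)),
                     loop(FlushFun, Max, [])
        end
    end.

With N reader processes the worst case held in memory is roughly Max
buffered docs plus one in-flight doc per reader, rather than an unbounded
mailbox.
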
>
> I've not looked closely at this yet. Is there anything in this
> implementation that would exhibit similar behaviors if something
> gets behind, or the number of documents is huge?
>
> -Damien
>
>
> On Nov 11, 2008, at 1:55 PM, Adam Kocoloski (JIRA) wrote:
>
>>
>>    [ https://issues.apache.org/jira/browse/COUCHDB-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Adam Kocoloski updated COUCHDB-160:
>> -----------------------------------
>>
>>   Attachment: couch_rep.erl.diff
>>
>> Should have mentioned before -- the times quoted are in seconds.
>>
>>> replication performance improvements
>>> ------------------------------------
>>>
>>>               Key: COUCHDB-160
>>>               URL: https://issues.apache.org/jira/browse/COUCHDB-160
>>>           Project: CouchDB
>>>        Issue Type: Improvement
>>>        Components: Database Core
>>>  Affects Versions: 0.9
>>>          Reporter: Adam Kocoloski
>>>          Priority: Minor
>>>       Attachments: couch_rep.erl.diff
>>>
>>>
>>> I wrote some code to speed up CouchDB's replication process by  
>>> parallelizing document requests and using _bulk_docs to write  
>>> changes to the target.  I tested the speedup as follows:
>>> * 1000 document DB, 1022 update_seq, ~450 KB after compaction
>>> * local and remote machines have ~45 ms latency
>>> * timed requests using timer:tc(couch_rep, replicate,
>>> [<<"source">>, <<"target">>])
>>> * all replications are "from scratch"
>>> trunk:
>>> local-local     115
>>> local-remote    145
>>> remote-remote   173
>>> remote-local    146
>>> db size after replication: 1.8 MB
>>> patch:
>>> local-local     1.83
>>> local-remote    38
>>> remote-remote   64
>>> remote-local    35
>>> db size after replication: 453 KB
>>> I'll attach the patch as an update to this issue.  It might be  
>>> worth exposing the "batch size" (currently 100 docs) as a  
>>> configurable parameter.  Comments welcome.  Best,
>>> Adam
>>
>> -- 
>> This message is automatically generated by JIRA.
>> -
>> You can reply to this email to add a comment to the issue online.
>>
>
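
One more note for anyone browsing the archive who hasn't used the endpoint:
_bulk_docs takes a JSON body of the form {"docs": [...]} and writes the
whole batch in a single request, rather than one request per document.
A purely illustrative snippet of such a request from Erlang follows; ibrowse
and mochijson2 are just convenient choices here, not necessarily what the
patch uses, and the URL and the two sample docs are made up.

-module(bulk_docs_example).
-export([run/0]).

%% Posts two made-up docs to a made-up target DB in one request.
%% Assumes the ibrowse application is already started.
run() ->
    Docs = [{struct, [{<<"_id">>, <<"doc1">>}, {<<"value">>, 1}]},
            {struct, [{<<"_id">>, <<"doc2">>}, {<<"value">>, 2}]}],
    Body = iolist_to_binary(mochijson2:encode({struct, [{<<"docs">>, Docs}]})),
    {ok, Status, _Headers, RespBody} =
        ibrowse:send_req("http://localhost:5984/target/_bulk_docs",
                         [{"Content-Type", "application/json"}],
                         post, Body),
    %% CouchDB replies with one {id, rev} (or error) object per doc
    {Status, mochijson2:decode(RespBody)}.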

