incubator-couchdb-user mailing list archives

From Filipe David Manana <fdman...@apache.org>
Subject Re: Why is replication so slow?
Date Sat, 17 Jul 2010 10:18:18 GMT
On Sat, Jul 17, 2010 at 9:29 AM, Attila Nagy <bra@fsn.hu> wrote:
>  Hello,
>
> I've measured similar replication times with small documents (no
> attachments). But it's great to hear there will be progress in this area.
> Do you have any timeframe regarding this? (I'm following svn trunk for
> tests)

There's no particular time frame for it. It will replace the current
replicator as soon as it's complete and well tested.

If you want to test the current progress, look at the branch
"new_replicator" - it is based on a trunk snapshot that is about a month
old. To trigger replications with the new replicator, use the URI
/_new_replicate/ instead of /_replicate/. Note that it still lacks
features such as continuous replication, but I'm working on that.
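
For example, assuming the new endpoint accepts the same JSON body as
/_replicate (the remote host name below is made up), a push replication
would be triggered with something like:

# push the local testdb to a remote node via the new replicator endpoint
curl -X POST http://localhost:5984/_new_replicate/ \
     -H "Content-Type: application/json" \
     -d '{"source": "testdb", "target": "http://otherhost:5984/testdb"}'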

Feel free to continue your tests and report your performance measurements.

cheers

>
> Thanks,
>
> On 07/16/2010 11:56 PM, Filipe David Manana wrote:
>>
>> Attila,
>>
>> That slowness is usually more visible when there are attachments of
>> moderate size (1 MB or more is enough).
>>
>> The developers are aware of it, and there's new replicator code
>> currently in development which will boost performance.
>>
>> cheers
>>
>> 2010/7/16 Attila Nagy <bra@fsn.hu>:
>>>
>>> Hi,
>>>
>>> I have three identical machines with a Pentium(R) D CPU 3.20GHz, 2 GiB RAM,
>>> FreeBSD 8, Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:2:2] [rq:2]
>>> [async-threads:0] [hipe] [kernel-poll:false], and CouchDB 1.0.0.
>>>
>>> I would like to replicate documents between the three (and even more
>>> machines later) in a fully meshed replication arrangement (every node
>>> replicates from/to every other node to ensure that there is no SPoF and
>>> every document gets to the others ASAP). The nodes would store small but
>>> quickly changing documents (application no. 1) and larger (from several kBs
>>> to several GBs) binary attachments (application no. 2). The applications are
>>> not mixed on the same CouchDB instance (or even the same machines).
>>>
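
(For reference, a full mesh is just one replication per direction. A minimal
sketch for the A<->B pair, with nodea and nodeb as placeholder hostnames and
testdb as the database name; the other pairs are set up the same way, and
"continuous" can be dropped for one-shot replications:)

# A -> B
curl -X POST http://nodea:5984/_replicate \
     -H "Content-Type: application/json" \
     -d '{"source": "testdb", "target": "http://nodeb:5984/testdb", "continuous": true}'
# B -> A
curl -X POST http://nodeb:5984/_replicate \
     -H "Content-Type: application/json" \
     -d '{"source": "testdb", "target": "http://nodea:5984/testdb", "continuous": true}'
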
>>> I've experimented with the first application and noticed that no matter how
>>> fast I insert documents (BTW, I could achieve about 230 inserts per second,
>>> with parallel connections and no bulk inserts), the traffic between the
>>> machines doesn't go beyond about 500 kBps and the replicas lag behind the
>>> written node (a lot!).
>>>
>>> Based on this, I've started another test, now with smaller binary
>>> attachments. The first run did this:
>>> for i in `jot 128`
>>> do
>>>   curl -X PUT http://localhost:5984/testdb/$i/file \
>>>     -H "Content-Type: application/octet-stream" --data-binary @bin1
>>> done
>>>
>>> That is, it uploads 128 MB of data (bin1 is 1 MB in size).
>>>
>>> Without replication, it runs in 8.64 seconds (14.81 MBps, not that fast
>>> either, but hey, it's Erlang :). If I run it with background curl processes
>>> (a maximum of 128 parallel uploads), the script runs in 6.74s (18.99 MBps).
>>>
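
(The parallel run is presumably the same loop with each curl backgrounded,
something along these lines:)

# same upload loop, but with every curl put in the background (up to 128 in flight)
for i in `jot 128`
do
  curl -s -X PUT http://localhost:5984/testdb/$i/file \
    -H "Content-Type: application/octet-stream" --data-binary @bin1 &
done
wait
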
>>> Now if I make a one-way replication to another node (connected over gigabit
>>> ethernet), the run time slightly increases to 7.04s on the master node, but
>>> it takes 42 seconds (3.04 MBps) for all 128 documents to reach the slave
>>> node.
>>> Things get worse when I make a two-way replication between the two nodes:
>>> this time the upload on the "master" node takes 7.4 seconds, but 75 seconds
>>> are needed for the two nodes to become consistent. The Erlang processes on
>>> both sides eat more resources, so the slowdown is clearly a resource issue
>>> and not network bound (of course).
>>>
>>> If I make two one-way replications (A->B, A->C), the times look like this:
>>> time needed to upload on the master (A) node: 6.52s; time needed for the
>>> slave (B, C) nodes to become consistent with A: 44s (A->B), 39s (A->C).
>>> BTW, I calculate these times from the start of the script (I'm not writing
>>> the data on A first and then setting up replication).
>>>
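
(The consistency lag can be measured by polling the target database until all
128 documents have arrived, using doc_count as a rough proxy; nodeb below is a
placeholder hostname:)

# wait until all 128 documents have shown up on the target node
while ! curl -s http://nodeb:5984/testdb | grep -q '"doc_count":128'
do
  sleep 1
done
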
>>> With the following replications defined: A<->B, A<->C, I get these:
>>> uploading to A: 7.34s, A->B consistency: 72s, A->C consistency: 72s.
>>> During the process I saw this on node A:
>>>   PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>>> 15427 couchdb     11 109    0   217M   149M CPU0    0  14:44 135.94% beam.smp
>>> and this was after the upload had finished, so this is what CouchDB eats
>>> when doing bilateral replication towards two nodes.
>>>
>>> And now the full mesh (A<->B, A<->C, B<->C):
>>> CouchDB resource usage tops out at:
>>> 15427 couchdb     11 110    0   270M   202M CPU1    1  18:44 140.14% beam.smp
>>> and the consistency times grow as well: A->B: 125s, A->C: 107s.
>>> BTW, the upload lasted for 7.59s.
>>>
>>> Summary: it seems unilateral replication is consistent in its resource
>>> usage, and it's pretty slow (7s for the localhost write vs. 42s for the
>>> replication to reach the remote node). Defining a bilateral replication
>>> slows it down further, roughly halving the speed, and every additional
>>> bilateral agreement introduces the same slowdown: one unilateral: 42s,
>>> one bilateral: 72s, two bilaterals: 125s.
>>>
>>> I'm sure it's not about waiting for the network or disk; it seems to be a
>>> pure resource usage problem. Is this known? Will it be fixed?
>>>
>>> Thanks,
>>>
>>
>>
>
>



-- 
Filipe David Manana,
fdmanana@apache.org

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
