incubator-cassandra-user mailing list archives

From pob <peterob...@gmail.com>
Subject Re: writes performance
Date Sun, 20 Mar 2011 14:23:43 GMT
Hi,

I searched the mailing list for similar topics, and I think there is still
some misunderstanding about how to measure a cluster. It would be nice if
someone could write down the right definitions.

What are we measuring? Ops/sec, throughput in Mbit/s, or the number of
clients/threads writing/reading data?
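
To make the units concrete, here is a minimal sketch of how ops/sec and
Mbit/s relate for a single run. The function names and the bytes_per_op
parameter are assumptions made for illustration; they are not something
stress.py reports.

```python
def ops_per_sec(total_ops, elapsed_sec):
    """Average operations per second over a whole run."""
    return total_ops / elapsed_sec


def mbit_per_sec(total_ops, elapsed_sec, bytes_per_op):
    """Average payload bandwidth in Mbit/s (1 Mbit = 1,000,000 bits).

    bytes_per_op is an assumed payload size per operation. It ignores
    protocol overhead and replication traffic between nodes.
    """
    return ops_per_sec(total_ops, elapsed_sec) * bytes_per_op * 8 / 1_000_000


if __name__ == "__main__":
    # 300000 inserts in 784 sec, as in case 3) of the quoted message.
    print(round(ops_per_sec(300000, 784), 2))
```

The two figures only differ by the payload size per operation, so a run has
to state that size before ops/sec and Mbit/s can be compared.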

I read that Jonathan said it doesn't matter whether you use CL.ONE or
CL.QUORUM. But, for example, writing with CL.ONE into one node of a 3-node
cluster with RF = 3 works fine, whereas writing with CL.ONE into all 3 nodes
in parallel at random (stress.py -d node1,node2,node3) on the same 3-node
cluster with RF = 3 ends with nodes crashing because Java runs out of memory.

Another thing: it was said that if you use RF = N, the throughput of the
whole cluster is one node's throughput / 3. What is throughput in that case?
Bandwidth? Ops/sec? And what is one node's throughput? One node with RF=1?
I'm getting completely lost while trying to estimate how big a stream I can
write into the cluster, what happens if I double the number of nodes in the
cluster, and so on.
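
One way to pin these terms down is a back-of-envelope model. It is an
assumption, not a measured result: every client write is applied on RF
replicas, each node can apply a fixed number of writes per second, and load
is spread evenly. The function name is hypothetical, and the model ignores
coordinator overhead, compaction, network limits and the time spent waiting
for the consistency level.

```python
def estimated_cluster_writes_per_sec(nodes, per_node_writes_per_sec,
                                     replication_factor):
    """Idealized client-visible write rate for an evenly loaded cluster.

    Each client write costs replication_factor node-level writes, so the
    cluster's total node-level capacity is divided by that factor.
    """
    if not 1 <= replication_factor <= nodes:
        raise ValueError("replication_factor must be between 1 and nodes")
    return nodes * per_node_writes_per_sec / replication_factor


if __name__ == "__main__":
    # The per-node rate of 1000 writes/sec is a made-up placeholder.
    print(estimated_cluster_writes_per_sec(3, 1000, 3))
    print(estimated_cluster_writes_per_sec(6, 1000, 3))
```

Under these assumptions, doubling the number of nodes at a fixed RF doubles
the estimate. Real clusters should be measured rather than trusted to follow
this model.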


Thanks for any explanation or hints.


Best,
Peter

2011/3/20 pob <peterob333@gmail.com>

> Hello,
>
> I set up a cluster with 3 nodes (4 GB RAM, 4 cores, RAID 0). I ran an
> experiment with stress.py to see how fast my inserts are. The results are
> confusing.
>
> In each case stress.py was inserting 170KB of data:
> 1)
> stress.py was inserting directly to one node -dNode1, RF=3, CL.ONE
>
> 300000 inserts in 1296 sec (300000,246,246,0.01123401983,1296)
>
> 2)
> stress.py was inserting directly to one node -dNode1, RF=3, CL.QUORUM
>
> 300000 inserts in 987 sec   (300000,128,128,0.00894131883979,978)
>
> 3)
> stress.py was inserting randomly into all 3 nodes -dNode1,Node2,Node3, RF=3,
> CL.QUORUM
>
> 300000 inserts in 784 sec (300000,157,157,0.00900169542641,784)
>
> 4)
> stress.py was inserting directly to one node -dNode1, RF=3, CL.ALL
>
> similar to case 1)
> -------
>
> I'm not surprised by cases 2) and 3), but the biggest surprise for me is why
> CL.ONE is slower than CL.QUORUM. CL.ONE needs fewer "acks", a shorter
> wait... and so on.
>
> I looked at some blogs about the "write" architecture, but the reason is
> still not clear to me.
>
> http://www.mikeperham.com/2010/03/13/cassandra-internals-writing/
> http://prettyprint.me/2010/05/02/understanding-cassandra-code-base/
>
>
> Thanks for advice.
>
>
> Best,
> Peter
>
>
