incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Effect of number of keyspaces on write-throughput....
Date Mon, 19 May 2014 07:42:10 GMT
> Each client is writing to a separate keyspace simultaneously. Hence, is there a lot of
switching of keyspaces?
> 
> 
I would think not. If the client app is using one keyspace per connection there should be
no reason for the driver to change keyspaces. 

 
>  But, I observed that when using a single keyspace, the write throughput reduced slightly
to 1800pkts/sec while I actually expected it to increase since there is no switching of contexts
now. Why is this so? 
> 
> 

That’s roughly a 5% change, which is small enough to ignore.
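
As a rough check on that figure, here is the arithmetic using the two rates reported in the quoted message (about 1900 pkts/sec with multiple keyspaces, 1800 pkts/sec with one). The helper name is just for illustration.

```python
# Aggregate write rates reported in the thread (packets per second).
multi_keyspace_rate = 1900   # four clients, one keyspace each
single_keyspace_rate = 1800  # four clients, one shared keyspace


def relative_change(before, after):
    """Fractional change going from `before` to `after`."""
    return (after - before) / before


change = relative_change(multi_keyspace_rate, single_keyspace_rate)
print(f"{change:.1%}")  # prints -5.3%
```

A difference of this size between two test runs is easily within normal run-to-run variation.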

I would guess that the clients are not doing anything that requires the driver to change the
keyspace for the connection. 
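
To illustrate why one keyspace per connection avoids the cost, here is a toy model in Python. It is not the API of libQtCassandra, pycassa, or hector: `ToyConnection` and `count_switches` are hypothetical names, and the only assumption is the one stated earlier in the thread, that the keyspace is connection state and changing it costs one round trip.

```python
class ToyConnection:
    """Toy model: the current keyspace is part of the connection state."""

    def __init__(self):
        self.keyspace = None
        self.switch_round_trips = 0

    def write(self, keyspace, row):
        # Changing keyspace costs one extra round trip; staying put costs none.
        if keyspace != self.keyspace:
            self.keyspace = keyspace
            self.switch_round_trips += 1
        # The write itself would be sent here; it is omitted in this model.


def count_switches(connections, writes):
    """Route each (client, keyspace) write and total the keyspace switches."""
    for client, keyspace in writes:
        connections[client].write(keyspace, row=None)
    # Use a set so a connection shared by several clients is counted once.
    return sum(c.switch_round_trips for c in set(connections.values()))


# Four clients, each writing 1000 rows to its own keyspace, interleaved.
writes = [(c, f"ks{c}") for _ in range(1000) for c in range(4)]

# One connection per client: each connection sets its keyspace once.
per_client = count_switches({c: ToyConnection() for c in range(4)}, writes)

# One connection shared by all clients: every write changes the keyspace.
shared_conn = ToyConnection()
shared = count_switches({c: shared_conn for c in range(4)}, writes)

print(per_client, shared)  # prints 4 4000
```

In this model the switching cost only appears when writes for different keyspaces are interleaved on the same connection, which is not what four independent clients do.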

>              Can you also kindly explain how factors like using a single v/s multiple
keyspaces, distributing write requests to a single cassandra node v/s multiple cassandra nodes,
etc. affect the write throughput? 
> 
> 
Normally you have one keyspace per application. The best data models are ones where
throughput improves as the number of nodes increases. This happens when there are no “hot
spots”, where most or all web requests need to read or write a particular row.

In general you can improve throughput by having more client threads hitting more machines.
You can expect 3,000 to 4,000 non-counter writes per core per node.
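
As a back-of-the-envelope sketch, the rule of thumb can be compared with the load described in the quoted message (four clients at about 470 pkts/sec each). The core count below is an assumption made for illustration, since the thread does not state the hardware, and scaling linearly with cores and nodes is a simplification that ignores replication and only holds when there are no hot spots.

```python
# Rule of thumb from this thread: non-counter writes per core per node.
WRITES_PER_CORE_LOW = 3_000
WRITES_PER_CORE_HIGH = 4_000


def expected_write_range(cores_per_node, nodes):
    """Rough (low, high) writes/sec, assuming linear scaling across cores."""
    cores = cores_per_node * nodes
    return cores * WRITES_PER_CORE_LOW, cores * WRITES_PER_CORE_HIGH


# Load reported in the thread: four clients, about 470 pkts/sec each.
offered_load = 4 * 470

# ASSUMPTION: 4 cores per node; the thread does not give the hardware.
low, high = expected_write_range(cores_per_node=4, nodes=1)

# True when the clients, not the cluster, are the limiting factor.
client_bound = offered_load < low
print(offered_load, low, high, client_bound)  # prints 1880 12000 16000 True
```

With the offered load this far below the estimate for even one node, adding nodes would not be expected to raise the measured throughput until the clients can send more.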

Hope that helps. 
Aaron

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 13/05/2014, at 1:02 am, Krishna Chaitanya <bnsk1990rulz@gmail.com> wrote:

> Hello,
> Thanks for the reply. Currently, each client is writing about 470 packets per second
where each packet is 1500 bytes. I have four clients writing simultaneously to the cluster.
Each client is writing to a separate keyspace simultaneously. Hence, is there a lot of switching
of keyspaces?
> 
>         The total throughput is coming to around 1900 packets per second when using multiple
keyspaces. This is because there are 4 clients and each one is writing around 470 pkts/sec.
But, I observed that when using a single keyspace, the write throughput reduced slightly to
1800pkts/sec while I actually expected it to increase since there is no switching of contexts
now. Why is this so?  470 packets is the maximum I can write from each client currently, since
it is the limitation of my client program.
>                 I should also mention that these tests are being run on single- and
double-node clusters with all the write requests going only to a single cassandra server.
> 
>              Can you also kindly explain how factors like using a single v/s multiple
keyspaces, distributing write requests to a single cassandra node v/s multiple cassandra nodes,
etc. affect the write throughput?  Are there any other factors that affect write throughput
other than these?  Because, a single cassandra node seems to be able to handle all these write
requests as I am not able to see any significant improvement by distributing write requests
among multiple nodes.
> 
> Thanking you.
>                     
> 
> On May 12, 2014 2:39 PM, "Aaron Morton" <aaron@thelastpickle.com> wrote:
>> On the homepage of libQtCassandra, it's mentioned that switching between keyspaces
is costly when storing into Cassandra thereby affecting the write throughput. Is this necessarily
true for other libraries like pycassa and hector as well?
>> 
>> 
> When using the thrift connection the keyspace is a part of the connection state, so changing
keyspaces requires a round trip to the server. Not hugely expensive, but it adds up if you
do it a lot. 
> 
>>                 Can I increase the write throughput by configuring all the clients
to store in a single keyspace instead of multiple keyspaces?
>> 
>> 
> You should expect to get 3,000 to 4,000 writes per core per node. 
> 
> What are you getting now?
> 
> Cheers
> A
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
> 
> On 11/05/2014, at 4:06 pm, Krishna Chaitanya <bnsk1990rulz@gmail.com> wrote:
> 
>> Hello,
>> I have an application that writes network packets to a Cassandra cluster from a number
of client nodes. It uses the libQtCassandra library to access Cassandra. On the homepage of
libQtCassandra, it's mentioned that switching between keyspaces is costly when storing into
Cassandra thereby affecting the write throughput. Is this necessarily true for other libraries
like pycassa and hector as well?
>>                 Can I increase the write throughput by configuring all the clients
to store in a single keyspace instead of multiple keyspaces?
>> 
>> Thank you.
>> 
> 

