spark-user mailing list archives

From Noorul Islam Kamal Malmiyoda <noo...@noorul.com>
Subject Cassandra read throughput using DataStax connector in Spark
Date Sat, 26 Dec 2015 15:37:57 GMT
Hello all,

I am using the DataStax connector to read data from one Cassandra
cluster and write it to another. The infrastructure is on Amazon.
Both clusters have three nodes and a replication factor of 3.

But the throughput seems very low: it takes 7 minutes to transfer
around 2.5 GB/node. I think the bottleneck is on the read side, since
the Spark node (which is independent of the two clusters) is lightly
loaded in terms of both memory and CPU.
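
To put a number on that, here is a rough back-of-the-envelope sketch.
It assumes 1 GB = 1024 MB and that the 7 minutes covers the whole
transfer; the function name is only illustrative.

```python
# Figures from this message: about 2.5 GB per node, moved in 7 minutes.
GB_PER_NODE = 2.5
MINUTES = 7

def throughput_mb_per_s(gigabytes, minutes):
    """Average throughput in MB/s, assuming 1 GB = 1024 MB."""
    return gigabytes * 1024 / (minutes * 60)

per_node = throughput_mb_per_s(GB_PER_NODE, MINUTES)
print(f"about {per_node:.1f} MB/s")  # prints "about 6.1 MB/s"
```

With a replication factor of 3 on three nodes, each node holds a full
replica, so if 2.5 GB/node is the on-disk size, the job is effectively
moving about 2.5 GB of unique data at roughly 6 MB/s.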

I tried tweaking some of the parameters listed at
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/reference.md#cassandra-connection-parameters
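
As an illustration of the kind of read-side settings involved (not
necessarily the ones tried here), a minimal sketch follows. The two
property names are assumed from the connector reference of that era and
may differ between connector versions; the values and the helper
to_submit_flags are hypothetical.

```python
# Assumed connector read-tuning properties; check the names against the
# reference page for the connector version in use. Values are illustrative.
read_tuning = {
    "spark.cassandra.input.split.size_in_mb": 64,
    "spark.cassandra.input.fetch.size_in_rows": 1000,
}

def to_submit_flags(conf):
    """Render a dict of Spark properties as spark-submit --conf arguments."""
    flags = []
    for key, value in sorted(conf.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags

print(" ".join(to_submit_flags(read_tuning)))
```

Passing the properties as --conf arguments to spark-submit keeps the
tuning outside the job code, so values can be changed between runs
without rebuilding.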

Do you have any idea whether there is any parameter that I can tweak
to get better throughput?

Regards,
Noorul
