cassandra-user mailing list archives

From Jens Rantil <jens.ran...@tink.se>
Subject Re: Streaming Process: How can we speed it up?
Date Fri, 16 Sep 2016 07:47:38 GMT
Hi Vasilis,

Have you considered setting up a new DC[1], migrating over your clients and
decommissioning the old cluster instead? Some advantages:

   - It involves less hackery and fewer workarounds, which makes mistakes
   less likely.
   - You can stream data to all nodes in the new DC concurrently, instead
   of rebuilding a single node at a time as you are doing now.
   - You have more of a point-in-time migration from old DC to new. You can
   easily migrate back to the old DC in case something goes wrong.

AFAIK, the only reasons you couldn't do the above are not having enough
hardware or not having enough IP addresses. Otherwise, I'd say the above
process is somewhat of a best practise.

[1]
https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
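
For what it's worth, here is a rough sketch of the two commands at the heart
of the procedure linked in [1]: changing keyspace replication to include the
new DC, then running a rebuild on each new node with the existing DC as the
source. The keyspace name, DC names and replication factors are placeholders
I made up, not values from your cluster, and the script only prints the
commands rather than running them.

```python
def alter_keyspace_cql(keyspace, replicas_per_dc):
    """Build an ALTER KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replication factor.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += ["'%s': %d" % (dc, rf) for dc, rf in replicas_per_dc.items()]
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (keyspace, ", ".join(parts))


def rebuild_command(source_dc):
    """Command to run on each node of the new DC; it streams from source_dc."""
    return ["nodetool", "rebuild", "--", source_dc]


if __name__ == "__main__":
    # Hypothetical names: "my_keyspace", "DC_old" and "DC_new" are assumptions.
    print(alter_keyspace_cql("my_keyspace", {"DC_old": 3, "DC_new": 3}))
    print(" ".join(rebuild_command("DC_old")))
```

Since every new node runs its own rebuild, all of them can pull data at the
same time.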

Cheers,
Jens

On Mon, Sep 12, 2016 at 4:39 PM Vasileios Vlachos <
vasileiosvlachos@gmail.com> wrote:

> Hello,
>
> We use cassandra 2.0.17 at the moment and we are rebuilding our nodes;
> this involves taking one node down at a time and bringing the new node up
> with JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=address_of_dead_node"
> in cassandra-env.sh. In order to speed up streaming we doubled
> stream_throughput_outbound_megabits_per_sec from 200 to 400 on all nodes in
> the cluster.
>
> The problem is that streaming takes a long time to complete. On Friday I
> asked the IRC channel and jeffj provided some feedback, but I saw his
> responses hours later. I have included some graphs at the bottom of this
> email which show CPU performance and network utilisation on the cluster
> during the streaming process. Basically, jeffj's suspicion was that we are
> CPU-bound on the receiving node. The graphs show that CPU utilisation is
> not high enough for us to conclude that CPU is our bottleneck; unless
> during streaming, Cassandra uses one core per connection/node. Does anyone
> know if that's the case?
>
> INFO [main] 2016-09-12 12:34:19,800 StreamResultFuture.java (line 87)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Executing streaming plan for
> Bootstrap
> INFO [main] 2016-09-12 12:34:19,800 StreamResultFuture.java (line 91)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
> with /10.3.5.2
> INFO [main] 2016-09-12 12:34:19,801 StreamResultFuture.java (line 91)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
> with /10.1.5.1
> INFO [StreamConnectionEstablisher:1] 2016-09-12 12:34:19,801
> StreamSession.java (line 214) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Starting streaming to /10.3.5.2
> INFO [main] 2016-09-12 12:34:19,801 StreamResultFuture.java (line 91)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
> with /10.3.5.3
> INFO [main] 2016-09-12 12:34:19,806 StreamResultFuture.java (line 91)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
> with /10.1.5.2
> INFO [StreamConnectionEstablisher:3] 2016-09-12 12:34:19,806
> StreamSession.java (line 214) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Starting streaming to /10.3.5.3
> INFO [StreamConnectionEstablisher:2] 2016-09-12 12:34:19,802
> StreamSession.java (line 214) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Starting streaming to /10.1.5.1
> INFO [main] 2016-09-12 12:34:19,809 StreamResultFuture.java (line 91)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
> with /10.1.5.3
> INFO [StreamConnectionEstablisher:4] 2016-09-12 12:34:19,809
> StreamSession.java (line 214) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Starting streaming to /10.1.5.2
> INFO [main] 2016-09-12 12:34:19,811 StreamResultFuture.java (line 91)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
> with /10.1.5.4
> INFO [StreamConnectionEstablisher:5] 2016-09-12 12:34:19,811
> StreamSession.java (line 214) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Starting streaming to /10.1.5.3
> INFO [main] 2016-09-12 12:34:19,815 StreamResultFuture.java (line 91)
> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
> with /10.3.5.4
> INFO [StreamConnectionEstablisher:6] 2016-09-12 12:34:19,818
> StreamSession.java (line 214) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Starting streaming to /10.1.5.4
> INFO [StreamConnectionEstablisher:3] 2016-09-12 12:34:19,824
> StreamSession.java (line 214) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Starting streaming to /10.3.5.4
> INFO [STREAM-IN-/10.3.5.4] 2016-09-12 12:34:19,846
> StreamResultFuture.java (line 186) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Session with /10.3.5.4 is complete
>
> INFO [STREAM-IN-/10.1.5.1] 2016-09-12 12:34:19,875
> StreamResultFuture.java (line 186) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Session with /10.1.5.1 is complete
>
> INFO [STREAM-IN-/10.1.5.2] 2016-09-12 12:34:19,897
> StreamResultFuture.java (line 186) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Session with /10.1.5.2 is complete
>
> INFO [STREAM-IN-/10.1.5.3] 2016-09-12 12:34:19,898
> StreamResultFuture.java (line 186) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Session with /10.1.5.3 is complete
>
> INFO [STREAM-IN-/10.1.5.4] 2016-09-12 12:34:19,901
> StreamResultFuture.java (line 186) [Stream
> #d5708c40-78dc-11e6-b7ea-857314f4c01e] Session with /10.1.5.4 is complete
>
> The above output is from system.log during initiation of the streaming
> process on one of the new nodes. The 10.1.X.X nodes are located in a
> different DC. I understand why these nodes are not used for streaming;
> however, I do not understand why 10.3.5.4 is not streaming data to
> 10.3.5.1. Any ideas why this would happen?
>
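
One way to see which peers are presumably still sending data is to compare
the "Beginning stream session" lines with the "is complete" lines in
system.log. The sketch below does that. The sample text is abridged from the
log above (timestamps and stream IDs dropped), and the helper name is my own.
My reading that a session completing within milliseconds had nothing to send
is an assumption, not something the log states.

```python
import re

BEGIN = re.compile(r"Beginning stream session with /(\S+)")
DONE = re.compile(r"Session with /(\S+) is complete")


def pending_peers(log_text):
    """Return peers whose stream session began but has not been logged as complete."""
    # Drop any "> " quote markers and join wrapped lines so each message is contiguous.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    begun = set(BEGIN.findall(flat))
    done = set(DONE.findall(flat))
    return sorted(begun - done)


if __name__ == "__main__":
    sample = """\
> Beginning stream session
> with /10.3.5.2
> Beginning stream session
> with /10.3.5.4
> Session with /10.3.5.4 is complete
"""
    print(pending_peers(sample))  # ['10.3.5.2']
```

Run against the full excerpt, the sessions begun but not yet complete would be
the ones with 10.3.5.2 and 10.3.5.3.
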
> Looking at cassandra004's network utilisation graph, we can see that the
> node was streaming at 20MBps initially, then at 10MBps when only one node
> was sending data to it. We seem to only be able to receive data at
> 10MBps/Tx node. Could we do something in order to be able to stream from
> more nodes and/or increase the streaming speed?
>
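
A quick unit check may help here, because the throttle is configured in
megabits per second while the graphs are read in MBps. The arithmetic below
assumes "MBps" means megabytes per second and that the throttle applies to
each sending node's outbound traffic. Both are my assumptions.

```python
def megabits_to_megabytes(megabits_per_sec):
    """Convert a rate in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_sec / 8


# stream_throughput_outbound_megabits_per_sec after doubling, from the message.
throttle_mbit = 400
# Observed rate per sending node, read off the network graph (assumed megabytes/s).
observed_mbyte = 10

cap_mbyte = megabits_to_megabytes(throttle_mbit)
fraction_of_cap = observed_mbyte / cap_mbyte

print(cap_mbyte)        # 50.0
print(fraction_of_cap)  # 0.2
```

Under those assumptions each sender runs at about a fifth of the configured
cap, which would suggest the throttle is not the limiting factor.
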
> The graphs:
>
> [image: cassandra001_CPU.png][image: cassandra001_network.png][image:
> cassandra002_CPU.png][image: cassandra002_network.png][image:
> cassandra003_CPU.png][image: cassandra003_network.png][image:
> cassandra004_CPU.png][image: cassandra004_network.png]
>
> Many Thanks,
> Vasilis
>
> P.S.
>
> Thanks to jeffj for his help on IRC!
>
-- 

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.
