cassandra-user mailing list archives

From daemeon reiydelle <daeme...@gmail.com>
Subject Re: Streaming Process: How can we speed it up?
Date Mon, 12 Sep 2016 16:38:30 GMT
Re: throughput. That looks slow for jumbo frames on 10 GbE. Check your network.
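
For reference, a quick unit check on the figures quoted below. This is only a sketch: it assumes "MBps" in the original post means megabytes per second, that the 400 megabit setting acts as the outbound cap on each sending node, and that the link is 10 Gbit/s (implied by "10g", not stated by the poster).

```python
# Unit check for the throughput figures quoted in this thread.
# Assumptions (not confirmed in the thread): "MBps" = megabytes/s,
# and the NIC is 10 Gbit/s.

def megabits_to_megabytes(mbit_per_s: float) -> float:
    """Convert megabits/s to megabytes/s (8 bits per byte)."""
    return mbit_per_s / 8.0

def megabytes_to_megabits(mbyte_per_s: float) -> float:
    """Convert megabytes/s to megabits/s."""
    return mbyte_per_s * 8.0

# stream_throughput_outbound_megabits_per_sec after it was doubled
configured_cap_mbit = 400
# roughly 10 MBps per sending node, as read off the poster's graph
observed_per_sender_mbyte = 10
# assumed link speed in megabits/s
link_mbit = 10_000

cap_mbyte = megabits_to_megabytes(configured_cap_mbit)
observed_mbit = megabytes_to_megabits(observed_per_sender_mbyte)

print(f"configured cap: {cap_mbyte:.0f} MB/s")
print(f"observed per sender: {observed_mbit:.0f} Mbit/s "
      f"({observed_mbit / configured_cap_mbit:.0%} of the cap)")
print(f"share of assumed link: {observed_mbit / link_mbit:.1%}")
```

Under those assumptions each sender is running well below the configured throttle, so the throttle itself is probably not what limits the stream.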


*.......*



Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Mon, Sep 12, 2016 at 8:57 AM, Vasileios Vlachos <
vasileiosvlachos@gmail.com> wrote:

> Hello,
>
> We use Nagios + NRPE, PNP4Nagios and a few templates in order to plot
> correlating counters on the same graph when needed. For the majority of our
> Cassandra-specific checks, we use the JMX console on each node.
>
> On Mon, Sep 12, 2016 at 3:59 PM, Nagh <naghrajl@gmail.com> wrote:
>
>> Hi Vasilis,
>>
>> My name is Nagaraj. I'm building a new Cassandra cluster in our
>> organization, and we are going to use Apache Cassandra 3.0.8. I have seen
>> your attachments on monitoring Cassandra. Which monitoring tool are you
>> using for Cassandra metrics and alerts? Is there anything you would
>> suggest? I appreciate your help on this.
>>
>> On Mon, Sep 12, 2016 at 10:38 AM, Vasileios Vlachos <
>> vasileiosvlachos@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> We use Cassandra 2.0.17 at the moment and we are rebuilding our nodes;
>>> this involves taking one node down at a time and bringing the new node up
>>> with JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=address_of_dead_node"
>>> in cassandra-env.sh. In order to reduce streaming times, we doubled
>>> stream_throughput_outbound_megabits_per_sec from 200 to 400 on all
>>> nodes in the cluster.
>>>
>>> The problem is that streaming takes a long time to complete. On Friday I
>>> asked the IRC channel and jeffj provided some feedback, but I saw his
>>> responses hours later. I have included some graphs at the bottom of this
>>> email which show CPU performance and network utilisation on the cluster
>>> during the streaming process. Basically, jeffj's suspicion was that we are
>>> CPU-bound on the receiving node. The graphs show that CPU utilisation is
>>> not high enough for us to conclude that CPU is our bottleneck; unless
>>> during streaming, Cassandra uses one core per connection/node. Does anyone
>>> know if that's the case?
>>>
>>> INFO [main] 2016-09-12 12:34:19,800 StreamResultFuture.java (line 87)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Executing streaming plan
>>> for Bootstrap
>>> INFO [main] 2016-09-12 12:34:19,800 StreamResultFuture.java (line 91)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
>>> with /10.3.5.2
>>> INFO [main] 2016-09-12 12:34:19,801 StreamResultFuture.java (line 91)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
>>> with /10.1.5.1
>>> INFO [StreamConnectionEstablisher:1] 2016-09-12 12:34:19,801
>>> StreamSession.java (line 214) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Starting streaming to /10.3.5.2
>>> INFO [main] 2016-09-12 12:34:19,801 StreamResultFuture.java (line 91)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
>>> with /10.3.5.3
>>> INFO [main] 2016-09-12 12:34:19,806 StreamResultFuture.java (line 91)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
>>> with /10.1.5.2
>>> INFO [StreamConnectionEstablisher:3] 2016-09-12 12:34:19,806
>>> StreamSession.java (line 214) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Starting streaming to /10.3.5.3
>>> INFO [StreamConnectionEstablisher:2] 2016-09-12 12:34:19,802
>>> StreamSession.java (line 214) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Starting streaming to /10.1.5.1
>>> INFO [main] 2016-09-12 12:34:19,809 StreamResultFuture.java (line 91)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
>>> with /10.1.5.3
>>> INFO [StreamConnectionEstablisher:4] 2016-09-12 12:34:19,809
>>> StreamSession.java (line 214) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Starting streaming to /10.1.5.2
>>> INFO [main] 2016-09-12 12:34:19,811 StreamResultFuture.java (line 91)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
>>> with /10.1.5.4
>>> INFO [StreamConnectionEstablisher:5] 2016-09-12 12:34:19,811
>>> StreamSession.java (line 214) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Starting streaming to /10.1.5.3
>>> INFO [main] 2016-09-12 12:34:19,815 StreamResultFuture.java (line 91)
>>> [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e] Beginning stream session
>>> with /10.3.5.4
>>> INFO [StreamConnectionEstablisher:6] 2016-09-12 12:34:19,818
>>> StreamSession.java (line 214) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Starting streaming to /10.1.5.4
>>> INFO [StreamConnectionEstablisher:3] 2016-09-12 12:34:19,824
>>> StreamSession.java (line 214) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Starting streaming to /10.3.5.4
>>> INFO [STREAM-IN-/10.3.5.4] 2016-09-12 12:34:19,846
>>> StreamResultFuture.java (line 186) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Session with /10.3.5.4 is complete
>>> INFO [STREAM-IN-/10.1.5.1] 2016-09-12 12:34:19,875
>>> StreamResultFuture.java (line 186) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Session with /10.1.5.1 is complete
>>> INFO [STREAM-IN-/10.1.5.2] 2016-09-12 12:34:19,897
>>> StreamResultFuture.java (line 186) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Session with /10.1.5.2 is complete
>>> INFO [STREAM-IN-/10.1.5.3] 2016-09-12 12:34:19,898
>>> StreamResultFuture.java (line 186) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Session with /10.1.5.3 is complete
>>> INFO [STREAM-IN-/10.1.5.4] 2016-09-12 12:34:19,901
>>> StreamResultFuture.java (line 186) [Stream #d5708c40-78dc-11e6-b7ea-857314f4c01e]
>>> Session with /10.1.5.4 is complete
>>>
>>> The above output is from system.log during initiation of the streaming
>>> process on one of the new nodes. The 10.1.X.X nodes are located in a
>>> different DC. I understand why these nodes are not used for streaming;
>>> however, I do not understand why 10.3.5.4 is not streaming data to
>>> 10.3.5.1. Any ideas why this would happen?
>>>
>>> Looking at cassandra004's network utilisation graph, we can see that the
>>> node was streaming at 20 MBps initially, then at 10 MBps when only one node
>>> was sending data to it. We seem to only be able to receive data at
>>> 10 MBps per sending (Tx) node. Could we do something in order to be able to
>>> stream from more nodes and/or increase the streaming speed?
>>>
>>> The graphs:
>>>
>>> [images: eight inline graphs of CPU and network utilisation]
>>>
>>> Many Thanks,
>>> Vasilis
>>>
>>> P.S.
>>>
>>> Thanks to jeffj for his help on IRC!
>>>
>>
>>
>
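
To see which peers are still sending, the "Beginning stream session" and "Session with ... is complete" messages in the quoted system.log excerpt can be paired up. The sketch below does that; the function name is hypothetical, and the sample input is an abridged copy of the quoted messages (timestamps, thread names and stream ID dropped for brevity). It also strips `>` quote prefixes so that wrapped, quoted mail text can be pasted in directly.

```python
import re

# Abridged messages from the quoted system.log excerpt; only the message
# text is kept (timestamps, thread names and stream ID omitted).
SAMPLE_LOG = """\
Beginning stream session with /10.3.5.2
Beginning stream session with /10.1.5.1
Beginning stream session with /10.3.5.3
Beginning stream session with /10.1.5.2
Beginning stream session with /10.1.5.3
Beginning stream session with /10.1.5.4
Beginning stream session with /10.3.5.4
Session with /10.3.5.4 is complete
Session with /10.1.5.1 is complete
Session with /10.1.5.2 is complete
Session with /10.1.5.3 is complete
Session with /10.1.5.4 is complete
"""

BEGIN = re.compile(r"Beginning stream session with /(\S+)")
DONE = re.compile(r"Session with /(\S+) is complete")

def summarize_sessions(log_text):
    """Return (begun, completed, still_open) peer address lists.

    Hypothetical helper: drops mail quote prefixes, rejoins wrapped
    lines, then matches the two message patterns above.
    """
    lines = [line.lstrip("> ") for line in log_text.splitlines()]
    flat = " ".join(" ".join(lines).split())
    begun = BEGIN.findall(flat)
    completed = set(DONE.findall(flat))
    still_open = [peer for peer in begun if peer not in completed]
    return begun, sorted(completed), still_open

begun, completed, still_open = summarize_sessions(SAMPLE_LOG)
print("sessions begun:    ", begun)
print("already complete:  ", completed)
print("still open:        ", still_open)
```

On the abridged sample this reports 10.3.5.2 and 10.3.5.3 as the only sessions still open. The session with 10.3.5.4 completes within a fraction of a second, just like the remote-DC ones; one possible reading is that this node had nothing to send for the ranges being requested, though the log alone does not confirm that.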
