cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergio Bilello <lapostadiser...@gmail.com>
Subject Cassandra 3.11.4 Node the load starts to increase after few minutes to 40 on 4 CPU machine
Date Thu, 17 Oct 2019 02:46:01 GMT
Hello guys!

I performed a thread dump https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMTkvMTAvMTcvLS1kdW1wLnR4dC0tMC0zMC00MA==&
while try to join the node with

-Dcassandra.join_ring=false

OR
-Dcassandra.join.ring=false

OR

-Djoin.ring=false

because the node spiked in load and latency was affecting the clients.

With or without that flag the node is high in latency and I see the load sky rocketing when
the number of TCP established connections increases

Analyzing the /var/log/messages I am able to read

Oct 17 00:23:39 prod-personalization-live-data-cassandra-08 cassandra: INFO [Service Thread]
2019-10-17 00:23:39,030 GCInspector.java:284 - G1 Young Generation GC in 255ms. G1 Eden Space:
361758720 -> 0; G1 Old Gen: 1855455944 -> 1781007048; G1 Survivor Space: 39845888 ->
32505856;

Oct 17 00:23:40 prod-personalization-live-data-cassandra-08 cassandra: INFO [ScheduledTasks:1]
2019-10-17 00:23:40,352 NoSpamLogger.java:91 - Some operations were slow, details available
at debug level (debug.log)


Oct 17 00:23:03 prod-personalization-live-data-cassandra-08 kernel: TCP: request_sock_TCP:
Possible SYN flooding on port 9042. Sending cookies. Check SNMP counters.

I don't see anything on debug.log that looks to be relevant

The machine is on aws with 4 cpu with 32GB Ram and 1 TB SSD i3.xlarge





[sergio.bilello@prod-personalization-live-data-cassandra-08 ~]$ nodetool tpstats

Pool Name Active Pending Completed Blocked All time blocked

ReadStage 32 53 559304 0 0

MiscStage 0 0 0 0 0

CompactionExecutor 1 107 118 0 0

MutationStage 0 0 2695 0 0

MemtableReclaimMemory 0 0 11 0 0

PendingRangeCalculator 0 0 33 0 0

GossipStage 0 0 4314 0 0

SecondaryIndexManagement 0 0 0 0 0

HintsDispatcher 0 0 0 0 0

RequestResponseStage 0 0 421865 0 0

Native-Transport-Requests 22 0 1903400 0 0

ReadRepairStage 0 0 59078 0 0

CounterMutationStage 0 0 0 0 0

MigrationStage 0 0 0 0 0

MemtablePostFlush 0 0 32 0 0

PerDiskMemtableFlushWriter_0 0 0 11 0 0

ValidationExecutor 0 0 0 0 0

Sampler 0 0 0 0 0

MemtableFlushWriter 0 0 11 0 0

InternalResponseStage 0 0 0 0 0

ViewMutationStage 0 0 0 0 0

AntiEntropyStage 0 0 0 0 0

CacheCleanupExecutor 0 0 0 0 0



Message type Dropped

READ 0

RANGE_SLICE 0

_TRACE 0

HINT 0

MUTATION 0

COUNTER_MUTATION 0

BATCH_STORE 0

BATCH_REMOVE 0

REQUEST_RESPONSE 0

PAGED_RANGE 0

READ_REPAIR 0

[sergio.bilello@prod-personalization-live-data-cassandra-08 ~]$





top - 01:44:15 up 2 days, 1:45, 4 users, load average: 34.45, 27.71, 15.37

Tasks: 140 total, 1 running, 74 sleeping, 0 stopped, 0 zombie

%Cpu(s): 90.0 us, 4.5 sy, 3.0 ni, 1.1 id, 0.0 wa, 0.0 hi, 1.4 si, 0.0 st

KiB Mem : 31391772 total, 250504 free, 10880364 used, 20260904 buff/cache

KiB Swap: 0 total, 0 free, 0 used. 19341960 avail Mem



PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

20712 cassand+ 20 0 194.1g 14.4g 4.6g S 392.0 48.2 74:50.48 java

20823 sergio.+ 20 0 124856 6304 3136 S 1.7 0.0 0:13.51 htop

7865 root 20 0 1062684 39880 11428 S 0.7 0.1 4:06.02 ir_agent

3557 consul 20 0 41568 30192 18832 S 0.3 0.1 13:16.37 consul

7600 root 20 0 2082700 46624 11880 S 0.3 0.1 4:14.60 ir_agent

1 root 20 0 193660 7740 5220 S 0.0 0.0 0:56.36 systemd

2 root 20 0 0 0 0 S 0.0 0.0 0:00.08 kthreadd

4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H

6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq

7 root 20 0 0 0 0 S 0.0 0.0 0:06.04 ksoftirqd/0





[sergio.bilello@prod-personalization-live-data-cassandra-08 ~]$ free

total used free shared buff/cache available

Mem: 31391772 10880916 256732 426552 20254124 19341768

Swap: 0 0 0

[sergio.bilello@prod-personalization-live-data-cassandra-08 ~]$







bash-4.2$ java -jar sjk.jar ttop -p 20712

Monitoring threads ...



2019-10-17T01:45:33.352+0000 Process summary

process cpu=363.58%

application cpu=261.91% (user=248.65% sys=13.26%)

other: cpu=101.67%

thread count: 474

heap allocation rate 583mb/s

[000039] user=13.56% sys=-0.59% alloc= 11mb/s - OptionalTasks:1

[000379] user= 8.57% sys=-0.27% alloc= 18mb/s - ReadStage-19

[000380] user= 7.85% sys= 0.22% alloc= 19mb/s - Native-Transport-Requests-21

[000295] user= 7.14% sys= 0.23% alloc= 14mb/s - Native-Transport-Requests-5

[000378] user= 7.14% sys=-0.03% alloc= 22mb/s - Native-Transport-Requests-17

[000514] user= 6.42% sys= 0.12% alloc= 20mb/s - Native-Transport-Requests-85

[000293] user= 6.66% sys=-0.32% alloc= 12mb/s - Native-Transport-Requests-2

[000392] user= 6.19% sys= 0.14% alloc= 9545kb/s - Native-Transport-Requests-12

[000492] user= 5.71% sys=-0.24% alloc= 15mb/s - Native-Transport-Requests-24

[000294] user= 5.23% sys=-0.25% alloc= 14mb/s - Native-Transport-Requests-3

[000381] user= 5.47% sys=-0.52% alloc= 7430kb/s - Native-Transport-Requests-23

[000672] user= 4.52% sys= 0.25% alloc= 14mb/s - Native-Transport-Requests-270

[000296] user= 5.23% sys=-0.47% alloc= 13mb/s - ReadStage-7

[000673] user= 4.52% sys= 0.05% alloc= 13mb/s - Native-Transport-Requests-269

[000118] user= 4.28% sys= 0.28% alloc= 13mb/s - Native-Transport-Requests-1

[000570] user= 4.28% sys= 0.27% alloc= 13mb/s - ReadStage-31

[000671] user= 4.28% sys= 0.25% alloc= 13mb/s - ReadStage-271

[000376] user= 4.52% sys=-0.03% alloc= 13mb/s - Native-Transport-Requests-13

[000302] user= 4.76% sys=-0.30% alloc= 13mb/s - ReadStage-8

[000377] user= 4.28% sys= 0.17% alloc= 13mb/s - ReadStage-15





^Cbash-4.2$ dstat

You did not select any stats, using -cdngy by default.

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--

usr sys idl wai hiq siq| read writ| recv send| in out | int csw

33 2 65 0 0 0| 443k 7988k| 0 0 | 0 0 |3255 2324

92 6 1 0 0 2|4732k 0 |1437k 1039k| 0 0 | 20k 24k

94 4 1 0 0 1|2588k 104k|1371k 545k| 0 0 | 13k 14k

91 6 1 0 0 2|1548k 0 |2045k 949k| 0 0 | 17k 20k

94 4 1 0 0 2|1120k 0 |1364k 372k| 0 0 | 12k 11k

95 4 0 0 0 1|3616k 0 |2137k 333k| 0 0 | 12k 7875

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Mime
View raw message