cassandra-dev mailing list archives

From Michael Kjellman <mkjell...@internalcircle.com>
Subject Re: Dropped Mutation and Read messages.
Date Thu, 11 May 2017 17:56:05 GMT
This discussion should be on the C* user mailing list. Thanks!

best,
kjellman

> On May 11, 2017, at 10:53 AM, Oskar Kjellin <oskar.kjellin@gmail.com> wrote:
> 
> That seems way too low. Depending on what type of disk you have, it should be closer to 100-200 MB/s.
> That's probably what's causing your problems. It would still take a while for you to compact all your data, though.
> 
> Sent from my iPhone
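As a concrete illustration of the advice above, compaction throttling can be inspected and raised at runtime with nodetool (the 128 MB/s value here is only an illustrative assumption, not a figure from this thread):

```shell
# Check the current compaction throttle (the thread shows 16 MB/s below)
nodetool getcompactionthroughput

# Raise it at runtime, e.g. to 128 MB/s; 0 disables throttling entirely.
# This takes effect immediately but reverts to cassandra.yaml's
# compaction_throughput_mb_per_sec setting on restart.
nodetool setcompactionthroughput 128
```

Note that removing the throttle lets compaction compete harder with live reads and writes for disk I/O, so it is usually raised incrementally while watching latency.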
> 
>> On 11 May 2017, at 19:50, varun saluja <saluja50@gmail.com> wrote:
>> 
>> nodetool getcompactionthroughput
>> 
>> ./nodetool getcompactionthroughput
>> Current compaction throughput: 16 MB/s
>> 
>> Regards,
>> Varun Saluja
>> 
>>> On 11 May 2017 at 23:18, varun saluja <saluja50@gmail.com> wrote:
>>> Hi,
>>> 
>>> PFB results for same. Numbers are scary here.
>>> 
>>> [root@WA-CASSDB2 bin]# ./nodetool compactionstats
>>> pending tasks: 137
>>>   compaction type         keyspace                 table    completed          total    unit   progress
>>>        Compaction           system                 hints   5762711108   837522028005   bytes      0.69%
>>>        Compaction   walletkeyspace   user_txn_history_v2    101477894     4722068388   bytes      2.15%
>>>        Compaction   walletkeyspace   user_txn_history_v2   1511866634   753221762663   bytes      0.20%
>>>        Compaction   walletkeyspace   user_txn_history_v2   3664734135    18605501268   bytes     19.70%
>>> Active compaction remaining time :  26h32m28s
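The 26h32m estimate is consistent with the figures above. A back-of-the-envelope check (assuming the throttle is the bottleneck and "MB" means MiB): sum the remaining bytes, `total - completed`, across the four running compactions and divide by the 16 MB/s throughput shown earlier in the thread:

```shell
# Bytes remaining per compaction, from the compactionstats output above
hints=$(( 837522028005 - 5762711108 ))
t1=$(( 4722068388 - 101477894 ))
t2=$(( 753221762663 - 1511866634 ))
t3=$(( 18605501268 - 3664734135 ))
remaining=$(( hints + t1 + t2 + t3 ))    # ~1.6 TB still to compact

throughput=$(( 16 * 1024 * 1024 ))       # 16 MB/s throttle, in bytes/sec

echo "$(( remaining / throughput / 3600 )) hours"   # prints "26 hours"
```

This also shows why the pending-task count barely moves: at this throttle, the hints table alone (~832 GB remaining) accounts for most of a day of compaction.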
>>> 
>>> 
>>> 
>>>> On 11 May 2017 at 23:15, Oskar Kjellin <oskar.kjellin@gmail.com> wrote:
>>>> What does nodetool compactionstats show?
>>>> 
>>>> I meant compaction throttling. nodetool getcompactionthroughput
>>>> 
>>>> 
>>>>> On 11 May 2017, at 19:41, varun saluja <saluja50@gmail.com> wrote:
>>>>> 
>>>>> Hi Oskar,
>>>>> 
>>>>> Thanks for response.
>>>>> 
>>>>> Yes, I can see a lot of compaction threads. We are loading around 400GB of data per node on a 3-node Cassandra cluster.
>>>>> Throttling was set to write around 7k TPS per node. The job ran fine for 2 days, and then we started getting mutation drops, longer GCs, and very high load on the system.
>>>>> 
>>>>> System log reports:
>>>>> Enqueuing flush of compactions_in_progress: 1156 (0%) on-heap, 1132 (0%) off-heap
>>>>> 
>>>>> The job was stopped 12 hours ago, but these failures can still be seen. Can you please let me know how I should proceed? If possible, please suggest some parameters for high write-intensive jobs.
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Varun Saluja
>>>>> 
>>>>> 
>>>>>> On 11 May 2017 at 23:01, Oskar Kjellin <oskar.kjellin@gmail.com> wrote:
>>>>>> Do you have a lot of compactions going on? It sounds like you might've built up a huge backlog. Is your throttling configured properly?
>>>>>> 
>>>>>>> On 11 May 2017, at 18:50, varun saluja <saluja50@gmail.com>
wrote:
>>>>>>> 
>>>>>>> Hi Experts,
>>>>>>> 
>>>>>>> Seeking your help on a production issue. We were running a high write-intensive job on our 3-node Cassandra cluster, v2.1.7.
>>>>>>> 
>>>>>>> TPS on the nodes was high. The job ran for more than 2 days, and thereafter the load average on one of the nodes increased to a very high number, around 29.
>>>>>>> 
>>>>>>> System log reports:
>>>>>>> 
>>>>>>> INFO  [ScheduledTasks:1] 2017-05-11 22:11:04,466 MessagingService.java:888 - 839 MUTATION messages dropped in last 5000ms
>>>>>>> INFO  [ScheduledTasks:1] 2017-05-11 22:11:04,466 MessagingService.java:888 - 2 READ messages dropped in last 5000ms
>>>>>>> INFO  [ScheduledTasks:1] 2017-05-11 22:11:04,466 MessagingService.java:888 - 1 REQUEST_RESPONSE messages dropped in last 5000ms
>>>>>>> 
>>>>>>> The job was stopped due to the heavy load, but still, after 12 hours, we can see mutation drop messages and a sudden increase in load average.
>>>>>>> 
>>>>>>> Are these hinted handoff mutations? Can we stop them?
>>>>>>> Strangely, this behaviour is seen only on 2 nodes. Node 1 does not show any load or any such activity.
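On the hinted-handoff question: if the extra mutations are hint replay (the 837 GB compaction of system.hints earlier in the thread suggests a large hint backlog), delivery can be paused or the stored hints discarded with nodetool. This is a sketch of the relevant commands, not a recommendation to drop hints, since truncating them silently loses writes that were destined for the other replicas:

```shell
# Pause hinted handoff delivery on this node (reversible via enablehandoff)
nodetool disablehandoff

# Irreversibly discard all stored hints on this node; the writes they
# carried must then be restored onto the other replicas with a repair
nodetool truncatehints
nodetool repair
```

Truncating hints only makes sense together with a follow-up repair, otherwise the replicas that the hints targeted are left permanently inconsistent.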
>>>>>>> 
>>>>>>> Due to heavy load and GC, there are intermittent gossip failures among the nodes. Can someone please help?
>>>>>>> 
>>>>>>> PS: The load job was stopped on the cluster. Everything ran fine for a few hours, and later the issue started again with mutation message drops.
>>>>>>> 
>>>>>>> Thanks and Regards,
>>>>>>> Varun Saluja
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>>>>>>> For additional commands, e-mail: dev-help@cassandra.apache.org
>>>>>>> 
>>>>> 
>>> 
>> 



