cassandra-user mailing list archives

From Mohamed Lrhazi <Mohamed.Lrh...@georgetown.edu>
Subject Re: Cassandra 3.2.1: Memory leak?
Date Tue, 15 Mar 2016 05:08:01 GMT
I am trying to capture this again... but from my first attempt, it does not
look like these numbers vary all that much from when the cluster reboots to
when the nodes start crashing:

[root@avesterra-prod-1 ~]# nodetool -u cassandra -pw '......'  tablestats|
grep "Bloom filter space used:"
                Bloom filter space used: 2041877200
                Bloom filter space used: 0
                Bloom filter space used: 1936840
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 352
                Bloom filter space used: 0
                Bloom filter space used: 48
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 48
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 72
                Bloom filter space used: 720
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 32
                Bloom filter space used: 56
                Bloom filter space used: 0
                Bloom filter space used: 32
                Bloom filter space used: 32
                Bloom filter space used: 56
                Bloom filter space used: 56
                Bloom filter space used: 32
                Bloom filter space used: 32
                Bloom filter space used: 32
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
                Bloom filter space used: 0
[root@avesterra-prod-1 ~]#
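To see whether that first large value (~2 GB on one table) is actually ramping up over time, the per-table lines can be summed on each run instead of eyeballed. A minimal sketch, assuming the same `nodetool tablestats` output format as above; `sum_bloom` is just a helper name for illustration:

```shell
# sum_bloom: sum the "Bloom filter space used" values (bytes) from
# nodetool tablestats output read on stdin. Running this periodically
# shows whether the cluster-wide total grows between restarts.
sum_bloom() {
  awk -F': *' '/Bloom filter space used:/ { total += $2 }
               END { printf "%d\n", total + 0 }'
}

# Usage (credentials elided as above):
#   nodetool -u cassandra -pw '......' tablestats | sum_bloom
```

Logging that one number with a timestamp every few minutes would show whether bloom filter space tracks the memory growth or stays flat.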





On Mon, Mar 14, 2016 at 4:43 PM, Paulo Motta <pauloricardomg@gmail.com>
wrote:

> Sorry, the command is actually nodetool tablestats and you should watch
> the bloom filter size or similar metrics.
>
> 2016-03-14 17:35 GMT-03:00 Mohamed Lrhazi <Mohamed.Lrhazi@georgetown.edu>:
>
>> Hi Paulo,
>>
>> Which metric should I watch for this ?
>>
>> [root@avesterra-prod-1 ~]# rpm -qa| grep datastax
>> datastax-ddc-3.2.1-1.noarch
>> datastax-ddc-tools-3.2.1-1.noarch
>> [root@avesterra-prod-1 ~]# cassandra -v
>> 3.2.1
>> [root@avesterra-prod-1 ~]#
>>
>> [root@avesterra-prod-1 ~]# nodetool -u cassandra -pw '########'  tpstats
>>
>>
>> Pool Name                     Active  Pending  Completed  Blocked  All time blocked
>> MutationStage                      0        0      13609        0                 0
>> ViewMutationStage                  0        0          0        0                 0
>> ReadStage                          0        0          0        0                 0
>> RequestResponseStage               0        0          8        0                 0
>> ReadRepairStage                    0        0          0        0                 0
>> CounterMutationStage               0        0          0        0                 0
>> MiscStage                          0        0          0        0                 0
>> CompactionExecutor                 1        1      17556        0                 0
>> MemtableReclaimMemory              0        0         38        0                 0
>> PendingRangeCalculator             0        0          8        0                 0
>> GossipStage                        0        0     118094        0                 0
>> SecondaryIndexManagement           0        0          0        0                 0
>> HintsDispatcher                    0        0          0        0                 0
>> MigrationStage                     0        0          0        0                 0
>> MemtablePostFlush                  0        0         55        0                 0
>> PerDiskMemtableFlushWriter_0       0        0         38        0                 0
>> ValidationExecutor                 0        0          0        0                 0
>> Sampler                            0        0          0        0                 0
>> MemtableFlushWriter                0        0         38        0                 0
>> InternalResponseStage              0        0          0        0                 0
>> AntiEntropyStage                   0        0          0        0                 0
>> CacheCleanupExecutor               0        0          0        0                 0
>> Native-Transport-Requests          0        0          0        0                 0
>>
>> Message type           Dropped
>> READ                         0
>> RANGE_SLICE                  0
>> _TRACE                       0
>> HINT                         0
>> MUTATION                     0
>> COUNTER_MUTATION             0
>> BATCH_STORE                  0
>> BATCH_REMOVE                 0
>> REQUEST_RESPONSE             0
>> PAGED_RANGE                  0
>> READ_REPAIR                  0
>> [root@avesterra-prod-1 ~]#
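The tpstats output above is mostly idle, but a quick way to watch it over time is to print only the pools that report pending or blocked work. A small sketch, assuming the six-column data-line format shown above (`pending_pools` is an illustrative helper name):

```shell
# pending_pools: from nodetool tpstats output on stdin, print any thread
# pool reporting pending or blocked tasks. Data lines have six fields:
# name, Active, Pending, Completed, Blocked, All time blocked.
pending_pools() {
  awk 'NF == 6 && $2 ~ /^[0-9]+$/ {
         if ($3 > 0 || $5 > 0) print $1, "pending=" $3, "blocked=" $5
       }'
}

# Usage:
#   nodetool -u cassandra -pw '......' tpstats | pending_pools
```

In the output above this would flag only CompactionExecutor, which matches the observation that memory grows with no client traffic.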
>>
>>
>>
>>
>> Thanks a lot,
>> Mohamed.
>>
>>
>>
>> On Mon, Mar 14, 2016 at 8:22 AM, Paulo Motta <pauloricardomg@gmail.com>
>> wrote:
>>
>>> Can you check with nodetool tpstats if bloom filter mem space
>>> utilization is very large/ramping up before the node gets killed? You could
>>> be hitting CASSANDRA-11344.
>>>
>>> 2016-03-12 19:43 GMT-03:00 Mohamed Lrhazi <Mohamed.Lrhazi@georgetown.edu
>>> >:
>>>
>>>> In my case, all nodes seem to be constantly logging messages like these:
>>>>
>>>> DEBUG [GossipStage:1] 2016-03-12 17:41:19,123 FailureDetector.java:456
>>>> - Ignoring interval time of 2000928319 for /10.212.18.170
>>>>
>>>> What does that mean?
>>>>
>>>> Thanks a lot,
>>>> Mohamed.
>>>>
>>>>
>>>> On Sat, Mar 12, 2016 at 5:39 PM, Mohamed Lrhazi <
>>>> Mohamed.Lrhazi@georgetown.edu> wrote:
>>>>
>>>>> Oh wow, similar behavior with a different version altogether!!
>>>>>
>>>>> On Sat, Mar 12, 2016 at 5:28 PM, ssivikt@gmail.com <ssivikt@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi, I'll duplicate my email with the same issue here:
>>>>>>
>>>>>> "I have 7 nodes of C* v2.2.5 running on CentOS 7, using jemalloc for
>>>>>> dynamic storage allocation. We use only one keyspace and one table,
>>>>>> with the Leveled compaction strategy. I loaded ~500 GB of data into
>>>>>> the cluster with a replication factor of 3 and waited for compaction
>>>>>> to finish. But during compaction, each of the C* nodes allocates all
>>>>>> the available memory (~128 GB) and its process just stops. Is this a
>>>>>> known bug?"
>>>>>>
>>>>>>
>>>>>> On 03/13/2016 12:56 AM, Mohamed Lrhazi wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We installed the DataStax community edition on 8 nodes running RHEL 7.
>>>>>> We inserted some 7 billion rows into a pretty simple table. The
>>>>>> inserts seem to have completed without issues, but ever since, we find
>>>>>> that the nodes reliably run out of RAM after a few hours, without any
>>>>>> user activity at all: no reads or writes are sent. What should we look
>>>>>> for to try to identify the root cause?
>>>>>>
>>>>>>
>>>>>> [root@avesterra-prod-1 ~]# cat /etc/redhat-release
>>>>>> Red Hat Enterprise Linux Server release 7.2 (Maipo)
>>>>>> [root@avesterra-prod-1 ~]# rpm -qa| grep datastax
>>>>>> datastax-ddc-3.2.1-1.noarch
>>>>>> datastax-ddc-tools-3.2.1-1.noarch
>>>>>> [root@avesterra-prod-1 ~]#
>>>>>>
>>>>>> The nodes had 8 GB of RAM, which we doubled twice, and we are now
>>>>>> trying 40 GB... they still manage to consume it all and cause
>>>>>> oom_killer to kick in.
>>>>>>
>>>>>> Pretty much all the settings are the default ones the installation
>>>>>> created.
>>>>>>
>>>>>> Thanks,
>>>>>> Mohamed.
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Serj
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
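One check worth doing, given the oom_killer reports in the thread: whether the Cassandra process's resident set grows far beyond its configured Java heap. If it does, the growth is off-heap (bloom filters as in CASSANDRA-11344, or allocator arenas) rather than heap, and raising RAM alone will not help. A rough sketch; `xmx_kb` is a hypothetical helper that converts a `-Xmx` flag to kilobytes so it can be compared with `VmRSS` from `/proc/<pid>/status`:

```shell
# xmx_kb: convert a JVM -Xmx flag (e.g. -Xmx8G, -Xmx4096M) to kilobytes,
# for comparison against the VmRSS line of /proc/<pid>/status.
xmx_kb() {
  printf '%s\n' "$1" | awk '{
    v = $0; sub(/^-Xmx/, "", v)
    unit = substr(v, length(v))          # trailing size letter, if any
    n    = substr(v, 1, length(v) - 1)
    if      (unit == "G" || unit == "g") print n * 1024 * 1024
    else if (unit == "M" || unit == "m") print n * 1024
    else if (unit == "K" || unit == "k") print n
    else                                 print v / 1024   # bare bytes
  }'
}

# Usage: if resident memory is several times the heap, the leak is off-heap.
#   pid=$(pgrep -f CassandraDaemon | head -n 1)
#   awk '/VmRSS/ {print $2 " kB resident"}' "/proc/$pid/status"
```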
