cassandra-user mailing list archives

From Jeff Jirsa <jji...@gmail.com>
Subject Re: Bootstrap OOM issues with Cassandra 3.11.1
Date Tue, 07 Aug 2018 00:57:25 GMT


Upgrading to 3.11.3 may fix it (there were some memory recycling bugs fixed recently), but
analyzing the heap will be the best option.

If you can print out the heap histogram and stack trace, or open a heap dump in YourKit,
VisualVM, or MAT, and show us what’s at the top of the reclaimed objects, we may be able to
figure out what’s going on.
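
Something along these lines should get you the histogram and the dump (the PID and dump
path are placeholders; note the :live variants force a full GC first):

    # class histogram of live objects; the top entries are usually telling
    jmap -histo:live <cassandra-pid> | head -40

    # full heap dump for YourKit / VisualVM / MAT
    jmap -dump:live,format=b,file=/tmp/cassandra-heap.hprof <cassandra-pid>

    # jcmd equivalent of the histogram, if jmap is unavailable
    jcmd <cassandra-pid> GC.class_histogram | head -40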

-- 
Jeff Jirsa


> On Aug 6, 2018, at 5:42 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
> 
> Are you using materialized views or secondary indices? 
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Aug 6, 2018, at 3:49 PM, Laszlo Szabo <laszlo.viktor.szabo@gmail.com> wrote:
>> 
>> Hello All,
>> 
>> I'm having JVM unstable / OOM errors when attempting to auto bootstrap a 9th node to an
>> existing 8 node cluster (256 tokens).  Each machine has 24 cores, 148GB RAM, and 10TB of
>> disk (2TB used).  Under normal operation the 8 nodes have JVM memory configured with
>> Xms35G and Xmx35G, and handle 2-4 billion inserts per day.  There are never updates,
>> deletes, or sparsely populated rows.
>> 
>> For the bootstrap node, I've tried memory values from 35GB to 135GB in 10GB increments.
>> I've tried both memtable_allocation_type settings (heap_buffers and offheap_buffers).
>> I've not tried modifying memtable_cleanup_threshold, but have instead tried
>> memtable_flush_writers from 2 to 8.  I've tried memtable_(off)heap_space_in_mb from
>> 20000 to 60000.  I've tried both CMS and G1 garbage collection with various settings.
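>> 
>> (For reference on the flush writer values: per the cassandra.yaml comments,
>> memtable_cleanup_threshold defaults to 1 / (memtable_flush_writers + 1), so the two
>> extremes I tried work out roughly as follows.)
>> 
>>> # default memtable_cleanup_threshold implied by memtable_flush_writers
>>> # memtable_flush_writers: 2  ->  1 / (2 + 1) ~= 0.33  (flush at ~33% of memtable space)
>>> # memtable_flush_writers: 8  ->  1 / (8 + 1) ~= 0.11  (flush at ~11% of memtable space)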
>> 
>> Typically, after streaming about 2TB of data, CPU load will hit a maximum, and the
>> "nodetool info" heap memory will, over the course of an hour, approach the maximum.  At
>> that point, CPU load will drop to a single thread with minimal activity until the system
>> becomes unstable and the OOM error eventually occurs.
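>> 
>> (For reference, a loop like this shows the heap climb; the 60-second interval is
>> arbitrary.)
>> 
>>> while true; do nodetool info | grep 'Heap Memory'; sleep 60; done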
>> 
>> An excerpt of the system log is below (the five columns are Active, Pending, Completed,
>> Blocked, and All Time Blocked).  What I consistently see is that the MemtableFlushWriter
>> and MemtableReclaimMemory pending queues grow as memory becomes depleted, but the number
>> of completed tasks stops changing a few minutes after the CPU load spikes.
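>> 
>> (The same two pools can also be watched outside the log, e.g.:)
>> 
>>> nodetool tpstats | grep -E 'Pool Name|MemtableFlushWriter|MemtableReclaimMemory'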
>> 
>> One other data point is that there seems to be a huge number of mutations that occur
>> after most of the stream has occurred.  concurrent_writes is set to 256, with the pending
>> queue getting as high as 200K before dropping down.
>> 
>> Any suggestions for yaml changes or jvm changes?  jvm.options is currently the default
>> with the memory set to the max; the current YAML file is below.
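>> 
>> (To be precise, "memory set to the max" means the only non-default lines in jvm.options
>> are the heap settings, along these lines:)
>> 
>>> # jvm.options heap lines (value varied from 35G to 135G as described above)
>>> -Xms35G
>>> -Xmx35G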
>> 
>> Thanks!
>> 
>> 
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,329 StatusLogger.java:51 - MutationStage                     1         2      191498052         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,331 StatusLogger.java:51 - ViewMutationStage                 0         0              0         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,338 StatusLogger.java:51 - PerDiskMemtableFlushWriter_0        0         0           5865         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,343 StatusLogger.java:51 - ReadStage                         0         0              0         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,347 StatusLogger.java:51 - ValidationExecutor                  0         0              0         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,360 StatusLogger.java:51 - RequestResponseStage              0         0              8         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,380 StatusLogger.java:51 - Sampler                             0         0              0         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,382 StatusLogger.java:51 - MemtableFlushWriter                 8     74293           4716         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,388 StatusLogger.java:51 - ReadRepairStage                   0         0              0         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,389 StatusLogger.java:51 - CounterMutationStage              0         0              0         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,404 StatusLogger.java:51 - MiscStage                         0         0              0         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,407 StatusLogger.java:51 - CompactionExecutor                8        13            493         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,410 StatusLogger.java:51 - InternalResponseStage               0         0             16         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,413 StatusLogger.java:51 - MemtableReclaimMemory             1      6066            356         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,421 StatusLogger.java:51 - AntiEntropyStage                    0         0              0         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,430 StatusLogger.java:51 - CacheCleanupExecutor                0         0              0         0                 0
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,431 StatusLogger.java:51 - PendingRangeCalculator            0         0              9         0                 0
>>> INFO  [Service Thread] 2018-08-06 17:49:26,436 StatusLogger.java:61 - CompactionManager                   8        19
>> 
>> 
>> 
>> 
>> Current YAML:
>>> num_tokens: 256
>>> hinted_handoff_enabled: true
>>> hinted_handoff_throttle_in_kb: 10240
>>> max_hints_delivery_threads: 8
>>> hints_flush_period_in_ms: 10000
>>> max_hints_file_size_in_mb: 128
>>> batchlog_replay_throttle_in_kb: 10240
>>> authenticator: AllowAllAuthenticator
>>> authorizer: AllowAllAuthorizer
>>> role_manager: CassandraRoleManager
>>> roles_validity_in_ms: 2000
>>> permissions_validity_in_ms: 2000
>>> credentials_validity_in_ms: 2000
>>> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>> data_file_directories:
>>>     - /data/cassandra/data
>>> commitlog_directory: /data/cassandra/commitlog
>>> cdc_enabled: false
>>> disk_failure_policy: stop
>>> commit_failure_policy: stop
>>> prepared_statements_cache_size_mb:
>>> thrift_prepared_statements_cache_size_mb:
>>> key_cache_size_in_mb:
>>> key_cache_save_period: 14400
>>> row_cache_size_in_mb: 0
>>> row_cache_save_period: 0
>>> counter_cache_size_in_mb:
>>> counter_cache_save_period: 7200
>>> saved_caches_directory: /data/cassandra/saved_caches
>>> commitlog_sync: periodic
>>> commitlog_sync_period_in_ms: 10000
>>> commitlog_segment_size_in_mb: 32
>>> seed_provider:
>>>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>>>       parameters:
>>>           - seeds: "10.1.1.11,10.1.1.12,10.1.1.13"
>>> concurrent_reads: 128
>>> concurrent_writes: 256
>>> concurrent_counter_writes: 96
>>> concurrent_materialized_view_writes: 32
>>> disk_optimization_strategy: spinning
>>> memtable_heap_space_in_mb: 61440
>>> memtable_offheap_space_in_mb: 61440
>>> memtable_allocation_type: heap_buffers
>>> commitlog_total_space_in_mb: 81920
>>> memtable_flush_writers: 8
>> 
