cassandra-user mailing list archives

From Michael Shuler <mich...@pbandjelly.org>
Subject Re: Node always dieing
Date Thu, 06 Apr 2017 14:14:34 GMT
All it takes is one frustrated `sudo cassandra` run. Checking only the
top level directory ownership is insufficient, since root could own
files/dirs created below the top level. Find all files not owned by user
cassandra:  `find /mnt/cassandra/ \! -user cassandra`
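
A minimal sketch of that check as a reusable helper. The function name
is made up for illustration; the directory and user are whatever your
cassandra.yaml and service actually use:

```shell
#!/bin/sh
# List everything under a directory that is NOT owned by the expected user.
# not_owned_by is a hypothetical helper name; it only wraps find(1).
not_owned_by() {
    dir=$1
    owner=$2    # user name or numeric UID
    find "$dir" \! -user "$owner" -print
}

# Example (not run here): not_owned_by /mnt/cassandra cassandra
```

Empty output means every file and directory below the top level belongs
to that user. If anything is listed, something like
`sudo chown -R cassandra:cassandra /mnt/cassandra` with the node stopped
would probably be the fix, assuming cassandra is also the right group.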

Just another thought.

-- 
Michael


On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
> From cassandra.yaml:
> 
> hints_directory: /mnt/cassandra/hints
> data_file_directories:
>     - /mnt/cassandra/data
> commitlog_directory: /mnt/cassandra/commitlog
> saved_caches_directory: /mnt/cassandra/saved_caches
> 
> drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/
> 
> drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
> drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
> drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
> drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
> drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
> drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/
> 
> cassand+  2267     1 99 10:18 ?        00:02:56 java
> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...
> 
> /dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt
> 
> On /etc/security/limits.conf
> 
> *           -       memlock      unlimited
> *           -      nofile      100000
> *           -      nproc          32768
> *           -      as           unlimited
> 
> On /etc/security/limits.d/cassandra.conf
> 
> cassandra  -  memlock  unlimited
> cassandra  -  nofile   100000
> cassandra  -  as       unlimited
> cassandra  -  nproc    32768
> 
> On /etc/sysctl.conf
> 
> vm.max_map_count = 1048575
> 
> On /etc/systcl.d/cassanda.conf
> 
> vm.max_map_count = 1048575
> net.ipv4.tcp_keepalive_time=600
> 
> On /etc/pam.d/su
> ...
> session    required   pam_limits.so
> ...
> 
> Distro is the current Ubuntu LTS.
> Thanks
> 
> 
> On 04/06/2017 10:39 AM, benjamin roth wrote:
>> Cassandra cannot write an SSTable to disk. Are you sure the
>> disk/volume where SSTables reside (normally /var/lib/cassandra/data)
>> is writeable for the CS user and has enough free space?
>> The CDC warning also implies that.
>> The other warnings indicate you are probably not running CS as root
>> and you did not set an appropriate limit for max open files. Running
>> out of open files can also be a reason for the IO error.
>>
>> 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
>> <cogumelosmaravilha@sapo.pt>:
>>
>>     Hi list,
>>
>>     I'm using C* 3.10 in a 6-node cluster with RF=2. All instances are
>>     type i3.xlarge (AWS) with 32GB, 2 cores, and an 885G SSD volume
>>     (LVM, XFS formatted). I have one node that is always dying and I
>>     don't understand why. Can anyone give me some hints, please? All
>>     nodes use the same configuration.
>>
>>     Thanks in advance.
>>
>>     INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
>>     IndexSummaryRedistribution.java:75 - Redistributing index summaries
>>     ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MemtablePostFlush:22,5,main]
>>     org.apache.cassandra.io.FSWriteError: java.io.IOException:
>>     Input/output error
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     ~[na:1.8.0_121]
>>         at
>>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     [na:1.8.0_121]
>>         at
>>     org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>>     [apache-cassandra-3.10.jar:3.10]
>>         at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
>>     Caused by: java.io.IOException: Input/output error
>>         at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>>     ~[na:1.8.0_121]
>>         at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>>     ~[na:1.8.0_121]
>>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
>>     ~[na:1.8.0_121]
>>         at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:169)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         ... 15 common frames omitted
>>     INFO  [IndexSummaryManager:1] 2017-04-06 06:22:18,366
>>     IndexSummaryRedistribution.java:75 - Redistributing index summaries
>>     ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MemtablePostFlush:31,5,main]
>>     org.apache.cassandra.io.FSWriteError: java.io.IOException:
>>     Input/output error
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     ~[na:1.8.0_121]
>>         at
>>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     [na:1.8.0_121]
>>         at
>>     org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>>     [apache-cassandra-3.10.jar:3.10]
>>         at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
>>     Caused by: java.io.IOException: Input/output error
>>         at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
>>     ~[na:1.8.0_121]
>>         at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
>>     ~[na:1.8.0_121]
>>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
>>     ~[na:1.8.0_121]
>>         at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         at
>>     org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:169)
>>     ~[apache-cassandra-3.10.jar:3.10]
>>         ... 15 common frames omitted
>>     INFO  [main] 2017-04-06 07:11:57,289 YamlConfigurationLoader.java:89 -
>>     Configuration location: file:/etc/cassandra/cassandra.yaml
>>
>>
>>     Some ERROR messages:
>>
>>     ERROR [MemtablePostFlush:2] 2017-04-05 23:35:46,339
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MemtablePostFlush:2,5,main]
>>     ERROR [MemtablePostFlush:3] 2017-04-05 23:44:08,471
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MemtablePostFlush:3,5,main]
>>     ERROR [MemtablePostFlush:4] 2017-04-05 23:54:41,224
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MemtablePostFlush:4,5,main]
>>     ERROR [MessagingService-Incoming-/10.0.120.52] 2017-04-06 03:19:13,453
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MessagingService-Incoming-/10.0.120.52,5,main]
>>     ERROR [epollEventLoopGroup-2-6] 2017-04-06 03:24:41,006
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[epollEventLoopGroup-2-6,10,main]
>>     ERROR [Native-Transport-Requests-36] 2017-04-06 03:25:45,915
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-49] 2017-04-06 03:25:45,915
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [IndexSummaryManager:1] 2017-04-06 03:25:45,915
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-69] 2017-04-06 03:25:45,916
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-46] 2017-04-06 03:26:18,465
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [SharedPool-Worker-136] 2017-04-06 03:26:18,465
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-156] 2017-04-06 03:26:18,465
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [SharedPool-Worker-92] 2017-04-06 03:26:24,696
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-48] 2017-04-06 03:26:24,696 ?:? - JVM
>>     state determined to be unstable.  Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-66] 2017-04-06 03:26:55,808
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-77] 2017-04-06 03:26:55,808
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [GossipTasks:1] 2017-04-06 03:26:55,808
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-133] 2017-04-06 03:26:55,808
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-135] 2017-04-06 03:26:55,808
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [ScheduledFastTasks:1] 2017-04-06 03:26:55,808
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-70] 2017-04-06 03:27:11,569
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [IndexSummaryManager:1] 2017-04-06 03:27:17,821
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[IndexSummaryManager:1,1,main]
>>     ERROR [Native-Transport-Requests-103] 2017-04-06 03:27:24,049
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-69] 2017-04-06 03:27:24,049
>>     SEPWorker.java:145 - Failed to execute task, unexpected exception
>>     killed worker: {}
>>     ERROR [SharedPool-Worker-98] 2017-04-06 03:27:24,049
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [MessagingService-Incoming-/10.0.120.52] 2017-04-06 03:27:55,079
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [epollEventLoopGroup-2-5] 2017-04-06 03:27:55,079
>>     JVMStabilityInspector.java:142 - JVM state determined to be unstable.
>>     Exiting forcefully due to:
>>     ERROR [Native-Transport-Requests-64] 2017-04-06 03:28:43,285
>>     SEPWorker.java:145 - Failed to execute task, unexpected exception
>>     killed worker: {}
>>     ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MemtablePostFlush:22,5,main]
>>     ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525
>>     CassandraDaemon.java:229 - Exception in thread
>>     Thread[MemtablePostFlush:31,5,main]
>>
>>     Also some WARNs:
>>
>>     WARN  [main] 2017-04-06 09:26:49,725 CLibrary.java:178 - Unable to
>>     lock JVM memory (ENOMEM). This can result in part of the JVM being
>>     swapped out, especially with mmapped I/O enabled. Increase
>>     RLIMIT_MEMLOCK or run Cassandra as root.
>>
>>     WARN  [main] 2017-04-06 09:25:07,355 StartupChecks.java:157 - JMX is
>>     not enabled to receive remote connections. Please see
>>     cassandra-env.sh for more info.
>>
>>     WARN  [main] 2017-04-06 09:25:07,369 SigarLibrary.java:174 - Cassandra
>>     server running in degraded mode. Is swap disabled? : true,  Address
>>     space adequate? : true,  nofile limit adequate? : false, nproc limit
>>     adequate? : true
>>
>>     WARN  [main] 2017-04-06 09:25:07,091 DatabaseDescriptor.java:493 -
>>     Small cdc volume detected at /var/lib/cassandra/cdc_raw; setting
>>     cdc_total_space_in_mb to 2502.  You can override this in
>>     cassandra.yaml
>>
>>
>>
> 
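
On the limits quoted above: the SigarLibrary warning reports
"nofile limit adequate? : false" even though limits.conf sets 100000.
Those files are applied through pam_limits, so a daemon started by the
init system may not pick them up; that is an assumption about this
setup, not something the thread confirms. A minimal, Linux-only sketch
for checking what the running process actually got (the helper name is
made up):

```shell
#!/bin/sh
# Print the soft "Max open files" limit of a running process (Linux only).
# soft_nofile is a hypothetical helper name, not part of Cassandra.
soft_nofile() {
    pid=$1
    # Line format in /proc/<pid>/limits:
    #   Max open files  <soft>  <hard>  files
    awk '/^Max open files/ { print $4 }' "/proc/$pid/limits"
}

# Example (not run here): soft_nofile 2267   # PID from the ps output above
```

If the value printed for the Cassandra PID is lower than what
limits.conf asks for, the limit has to be raised wherever the service is
actually started from.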

