cassandra-commits mailing list archives

From "Ananth Gundabattula (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5605) Crash caused by insufficient disk space to flush
Date Thu, 11 Jul 2013 03:57:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705442#comment-13705442 ]

Ananth Gundabattula commented on CASSANDRA-5605:
------------------------------------------------

I am not sure whether the following information helps, but we hit this issue in production today as well.
We were running Cassandra 1.2.4 with two patches applied: CASSANDRA-5554 and CASSANDRA-5418.


We were running with RF=3 and LCS. 

We cross-checked via JMX whether directory blacklisting was the cause of this bug, and it is
definitely not the case.

However, we saw a pile-up of pending compactions, roughly 1,800 per node, when the node
crashed. The surprising thing is that the "Insufficient disk space to write xxxx bytes" error
appears well before the node crashes; for us it started appearing approximately 3 hours before
the crash.
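
One way to make both of these checks over JMX is sketched below. The MBean names
(org.apache.cassandra.db:type=BlacklistedDirectories and org.apache.cassandra.db:type=CompactionManager)
and the default JMX port 7199 are assumptions based on the 1.2-era codebase rather than anything taken
from this ticket, so please verify them against your build.
{noformat}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Sketch only: the MBean names and the default JMX port 7199 are assumptions
// based on the Cassandra 1.2 codebase, not taken from this ticket.
public class NodeCompactionCheck
{
    public static void main(String[] args) throws Exception
    {
        String host = args.length > 0 ? args[0] : "localhost"; // hypothetical target node
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();

            // Data directories blacklisted after I/O errors (assumed MBean name).
            Object unwritable = mbs.getAttribute(
                    new ObjectName("org.apache.cassandra.db:type=BlacklistedDirectories"),
                    "UnwritableDirectories");
            System.out.println("Blacklisted (unwritable) directories: " + unwritable);

            // Compaction backlog; this is where the ~1,800 pending tasks show up.
            Object pending = mbs.getAttribute(
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager"),
                    "PendingTasks");
            System.out.println("Pending compactions: " + pending);
        }
        finally
        {
            connector.close();
        }
    }
}
{noformat}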


The cluster that showed this behavior was under heavy write load (we were using multiple
SSTableLoaders to stream data into it). We pushed in almost 15 TB of data (including the RF=3
replication) in a matter of 16 hours. We were not serving any reads from this cluster, as we
were still migrating data to it.

Another interesting observation was that the affected nodes were ring neighbors most of the time.

Again, I am not sure whether the above information helps, but I wanted to add it to the context of the ticket.
 
                
> Crash caused by insufficient disk space to flush
> ------------------------------------------------
>
>                 Key: CASSANDRA-5605
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5605
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.3, 1.2.5
>         Environment: java version "1.7.0_15"
>            Reporter: Dan Hendry
>
> A few times now I have seen our Cassandra nodes crash by running themselves out of memory.
> It starts with the following exception:
> {noformat}
> ERROR [FlushWriter:13000] 2013-05-31 11:32:02,350 CassandraDaemon.java (line 164) Exception in thread Thread[FlushWriter:13000,5,main]
> java.lang.RuntimeException: Insufficient disk space to write 8042730 bytes
>         at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:42)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> {noformat} 
> After which, it seems the MemtablePostFlusher stage gets stuck and no further memtables
> get flushed:
> {noformat} 
> INFO [ScheduledTasks:1] 2013-05-31 11:59:12,467 StatusLogger.java (line 68) MemtablePostFlusher               1        32         0
> INFO [ScheduledTasks:1] 2013-05-31 11:59:12,469 StatusLogger.java (line 73) CompactionManager                 1         2
> {noformat} 
> What makes this ridiculous is that, at the time, the data directory on this node had 981GB
> free disk space (as reported by du). We primarily use STCS and, at the time the aforementioned
> exception occurred, at least one compaction task was executing which could easily have involved
> 981GB (or more) worth of input SSTables. Correct me if I am wrong, but Cassandra counts data
> currently being compacted against available disk space. In our case, this is a significant
> overestimation of the space required by compaction, since a large portion of the data being
> compacted has expired or is an overwrite.
> More to the point though, Cassandra should not crash because it is out of disk space unless it
> is actually out of disk space (i.e., don't consider 'phantom' compaction disk usage when
> flushing). I have seen one of our nodes die in this way before our alerts for disk space even
> went off.
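
The behavior described above can be illustrated with a short sketch. This is not Cassandra's actual
implementation, and the class and field names are hypothetical; it only shows the accounting the ticket
describes, where the estimated output of in-flight compactions is subtracted from the OS-reported free
space before a flush is admitted.
{noformat}
import java.io.File;
import java.util.concurrent.atomic.AtomicLong;

// Simplified illustration of the accounting described in this ticket; it is
// NOT Cassandra's actual code, and the class/field names are hypothetical.
public class DataDirectorySketch
{
    final File location;
    // Bytes that in-flight compactions are expected to write to this directory.
    final AtomicLong reservedByCompactions = new AtomicLong(0);

    DataDirectorySketch(File location)
    {
        this.location = location;
    }

    long estimatedAvailableSpace()
    {
        // OS-reported free space minus the "phantom" compaction reservation.
        return location.getUsableSpace() - reservedByCompactions.get();
    }

    void checkSpaceForFlush(long writeSize)
    {
        if (estimatedAvailableSpace() < writeSize)
            throw new RuntimeException("Insufficient disk space to write " + writeSize + " bytes");
    }

    public static void main(String[] args)
    {
        // Hypothetical data directory path.
        DataDirectorySketch dir = new DataDirectorySketch(new File("/var/lib/cassandra/data"));
        // Pretend a large compaction is in flight: its estimated output is reserved up front,
        // even though much of the input may be expired or overwritten data.
        dir.reservedByCompactions.set(981L * 1024 * 1024 * 1024);
        // A small flush can now be rejected even though df still shows plenty of free space.
        dir.checkSpaceForFlush(8042730L);
    }
}
{noformat}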

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
