cassandra-commits mailing list archives

From "Andrew Jorgensen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-11842) Unbounded commit log file growth
Date Wed, 18 May 2016 21:51:12 GMT
Andrew Jorgensen created CASSANDRA-11842:
--------------------------------------------

             Summary: Unbounded commit log file growth
                 Key: CASSANDRA-11842
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11842
             Project: Cassandra
          Issue Type: Bug
         Environment: Cassandra version 3.0.3 on Ubuntu Trusty
            Reporter: Andrew Jorgensen
         Attachments: disks-space.png

Today I noticed that two nodes in a 54-node cluster have been using up disk space at a constant
rate for the last 3 days or so.

!disks-space.png|thumbnail!

When I looked into it I found that the majority of the disk space was being used up in /mnt/cassandra/commitlog.
It looked like there were files dating back to when the disk usage started to increase on
5/16 and there were a total of ~13K commit log files in this directory.
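For anyone wanting to take the same measurement on their own node, here is a minimal sketch of how the file count, total size, and oldest file timestamp of a directory could be gathered. The function name `summarize_dir` is hypothetical and not part of Cassandra or its tooling; it only inspects regular files directly inside the given directory.

```python
import os


def summarize_dir(path):
    """Return (file_count, total_bytes, oldest_mtime) for regular files in path.

    oldest_mtime is a Unix timestamp, or None if the directory has no files.
    Subdirectories are skipped, not descended into.
    """
    count = 0
    total = 0
    oldest = None
    for entry in os.scandir(path):
        if not entry.is_file():
            continue
        st = entry.stat()
        count += 1
        total += st.st_size
        if oldest is None or st.st_mtime < oldest:
            oldest = st.st_mtime
    return count, total, oldest
```

Calling `summarize_dir("/mnt/cassandra/commitlog")` on an affected node would give the number of commit log segments, the bytes they occupy, and the modification time of the oldest one, which can be compared against the date the disk usage started climbing.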

I was curious if anyone has seen this before. I am not sure what would cause this behavior,
especially on two separate nodes in the cluster at about the same time. I think this points
to something about the data: we have a replication factor of 2, which seems to match up with
the number of nodes that were affected.

The two nodes in question looked down from every other node in the cluster's perspective when
running `nodetool status`, but when running that on the affected nodes the entire cluster looked
like it was up and running.

To remedy the situation I tried running `nodetool drain` on one of the affected nodes, but
it seemed to be hung and I couldn't tell whether it was doing anything or not. I restarted
the cassandra process and could see in the debug log that it was reading in the commit log
files. On the second node I moved the commit log folder to a different location and restarted
the node, which caused it to immediately rejoin the cluster; I can replay the commit
log files that were queued up later to make sure it's in a consistent state. So far it looks
like the commit log on that node is no longer growing unboundedly.

As far as I could tell the data in /mnt/cassandra/data/ for each of the keyspaces and tables
had recent timestamps on the files, which I believe means that flushing was happening and data
was getting written to the SSTables. Also, 350GB of commitlog wouldn't have been able to fit
into memory.

If there is any other information I can provide please let me know. I didn't see much in the
cassandra system.log or debug.log files but would be happy to provide them if it'll help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
