incubator-cassandra-user mailing list archives

From Derek Andree <>
Subject Re: Disk usage for CommitLog
Date Tue, 30 Aug 2011 22:57:09 GMT

> > 86GB in commitlog and 42GB in data
> Whoa, that seems really wrong, particularly given your data spans 13 months. Have you
> changed any of the default cassandra.yaml settings? What is the maximum memtable_flush_after
> across all your CFs? Any warnings/errors in the Cassandra log?

It seems wrong to me too.  It got so bad that /var/lib/cassandra looked like this:

$ du -hs ./*
122G	./commitlog
55G	./data
17M	./saved_caches

I restarted cassandra, and it took a while to chew through all the commitlog files, then disk
utilization was like so:

$ du -hs ./*
1.1M	./commitlog
56G	./data
17M	./saved_caches

This isn't with 13 months of data, only with a couple months of data.

Upon going through the cassandra logs, I saw a ton of "too many open files" warnings:

 WARN [Thread-4] 2011-08-30 12:07:27,601 (line 112) Transport
error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: Too many open files
        at org.apache.thrift.transport.TServerSocket.acceptImpl(
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(
        at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(
        at org.apache.thrift.transport.TServerTransport.accept(
        at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(
        at org.apache.cassandra.thrift.CassandraDaemon$
Caused by: Too many open files
        at Method)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(

I guess I should raise the open-file limit to some big number with ulimit. Does anyone have
a suggestion for how big? I was thinking ulimit -n 10000, but first I'm going to try to reproduce
the "too many open files" condition and then have a look at lsof to see just how many files
are really open.
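
As a rough sketch of that check (not something from the original thread): on Linux, the
number of descriptors a process holds and the limit that applies to it can both be read
from /proc, which avoids guessing at a ulimit value. The function names below are
hypothetical helpers, and the /proc layout is a Linux-only assumption.

```shell
#!/bin/sh
# Hypothetical helpers; assume a Linux /proc filesystem.

# Count the file descriptors a process currently holds.
# $1 = PID. Prints 0 if the fd directory cannot be read.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit in effect for a running process.
# $1 = PID. The soft limit is the fourth field of that line in /proc/PID/limits.
soft_nofile_limit() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}
```

To use it against Cassandra, pass the daemon's PID, for example the first one printed by
pgrep -f CassandraDaemon (an assumption about how the process shows up on your box), and
compare the two numbers while the commitlog directory is growing. Reading another user's
/proc/PID/fd usually requires root or running as that user.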

On a side note, why does cassandra seem to log to /var/log/cassandra.log no matter what's
in  I ended up having to link that to /dev/null to keep from filling up
my root partition with cassandra logs that I already have elsewhere on another filesystem.

