incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Endless minor compactions after heavy inserts
Date Fri, 01 Apr 2011 11:59:39 GMT
If you are doing some sort of bulk load, you can disable minor compactions by setting
min_compaction_threshold and max_compaction_threshold to 0. Once your insert is complete,
run a major compaction via nodetool before turning minor compactions back on.
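
The sequence above can be sketched as a small Python helper that only builds the nodetool
command lines and does not run anything. Treat it as a rough outline under assumptions: the
argument order for setcompactionthreshold and compact is what I believe 0.7 uses, so check
the nodetool help output for your version; Keyspace1/Standard1 and localhost are placeholder
names, and 4/32 are what I believe the default thresholds to be.

```python
def bulk_load_plan(host, keyspace, column_family, min_threshold=4, max_threshold=32):
    """Return the ordered steps for a bulk load as (description, command) pairs.

    Assumption: nodetool accepts
        setcompactionthreshold <keyspace> <cfname> <min> <max>
        compact <keyspace>
    Verify against `nodetool` help for your Cassandra version.
    """
    nodetool = ["nodetool", "-h", host]
    set_threshold = nodetool + ["setcompactionthreshold", keyspace, column_family]
    return [
        # Thresholds of 0 turn minor compactions off for this column family.
        ("disable minor compactions", set_threshold + ["0", "0"]),
        # The load itself is not a nodetool command, so there is nothing to build.
        ("run the bulk insert", None),
        ("major compaction", nodetool + ["compact", keyspace]),
        # Restore the thresholds (4 and 32 are assumed defaults).
        ("re-enable minor compactions",
         set_threshold + [str(min_threshold), str(max_threshold)]),
    ]


if __name__ == "__main__":
    # Placeholder host, keyspace and column family names.
    for description, command in bulk_load_plan("localhost", "Keyspace1", "Standard1"):
        print(description, "->", " ".join(command) if command else "(your own loader)")
```

Building the commands as lists keeps the ordering explicit, and the lists can be passed to
subprocess.run once the arguments have been checked against your install.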

You can also reduce the compaction thread's priority; see compaction_thread_priority in the
yaml file.

The memtable is flushed when either the MB threshold or the ops threshold is reached. If you
are seeing a lot of memtables smaller than the MB threshold, then the ops threshold has
probably been triggered. Look for a log message at INFO level starting with "Enqueuing flush
of Memtable"; it will tell you how many bytes and ops the memtable had when it was flushed.
Try increasing the ops threshold and see what happens.
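
To make the "whichever threshold is hit first" point concrete, here is a simplified model in
Python. It is not Cassandra's actual flush code: it assumes every operation adds the same
number of bytes to the memtable, and the 64 and 300 bytes-per-op figures are made up for
illustration. The thresholds are the ones you read from jconsole (1499 MB, 7.0 million ops).

```python
def first_flush_trigger(bytes_per_op, throughput_mb, ops_millions):
    """Return (trigger, memtable_size_mb) for whichever threshold is reached first.

    Simplified model: every operation is assumed to add `bytes_per_op` bytes.
    """
    mb_limit_bytes = throughput_mb * 1024 * 1024
    ops_limit = ops_millions * 1_000_000

    # Number of operations it would take to fill the MB threshold.
    ops_to_fill_mb = mb_limit_bytes / bytes_per_op

    if ops_to_fill_mb <= ops_limit:
        return "throughput_mb", float(throughput_mb)

    # The ops threshold is hit first, so the memtable is flushed while still small.
    size_mb = ops_limit * bytes_per_op / (1024 * 1024)
    return "operations", size_mb


if __name__ == "__main__":
    # Hypothetical small writes: the ops threshold fires well below 1499 MB.
    print(first_flush_trigger(64, 1499, 7.0))
    # Hypothetical larger writes: the MB threshold fires first.
    print(first_flush_trigger(300, 1499, 7.0))
```

With small writes the ops limit fires long before the MB limit, which is the situation
described above. Note that the size of the sstable on disk will not match the in-memory
figure exactly, so use the numbers in the log message rather than the file sizes.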

Your change to the compaction threshold may not have had an effect because the compaction
process was already running.

AFAIK the best way to get the most out of your 10 disks is to use a dedicated mirror
for the commit log and a stripe set for the data.

Hope that helps. 
Aaron
  
On 1 Apr 2011, at 14:52, Sheng Chen wrote:

> I've got a single node of cassandra 0.7.4, and I used the java stress tool to insert
> about 100 million records.
> The inserts took about 6 hours (45k inserts/sec) but the following minor compactions
> last for 2 days and the pending compaction jobs are still increasing.
> 
> From jconsole I can read the MemtableThroughputInMB=1499, MemtableOperationsInMillions=7.0
> But in my data directory, I got hundreds of 438MB data files, which should be the cause
> of the minor compactions.
> 
> I tried to set compaction threshold by nodetool, but it didn't seem to take effects (no
> change in pending compaction tasks).
> After restarting the node, my setting is lost.
> 
> I want to distribute the read load in my disks (10 disks in xfs, LVM), so I don't want
> to do a major compaction.
> So, what can I do to keep the sstable file in a reasonable size, or to make the minor
> compactions faster?
> 
> Thank you in advance.
> Sheng
> 

