cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hendry (Created) (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-3592) Major Compaction Incredibly Slow
Date Thu, 08 Dec 2011 16:39:40 GMT
Major Compaction Incredibly Slow

                 Key: CASSANDRA-3592
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.0.3
         Environment: RHEL6 - 24 core machines 24 GB mem total, 11 GB java heap
java version "1.6.0_26"
6 node cluster (4@0.8.6, 2@1.0.3)
            Reporter: Dan Hendry

Twice now (on different nodes), I have observed major compaction for certain column families
take *significantly* longer on 1.0.3 in comparison to 0.8.6. For example,

On the 0.8.6 node, the post compaction log message:
{noformat} (line 608) Compacted to XXX. 339,164,959,170 to 158,825,469,883
(~46% of original) bytes for 25,996 keys.  Time: 26,934,317ms.{noformat}

On the 1.0.3 node, the post compaction log message:
{noformat} (line 213) Compacted to [XXX].  222,338,354,529 to 147,751,403,084
(~66% of original) bytes for 26,100 keys at 0.562045MB/s. Time: 250,703,563ms.{noformat}

So... literally an order of magnitude slower on 1.0.3 in comparison to 0.8.6.

Relevant configuration settings:
* compaction_throughput_mb_per_sec: 0 (why? because the compaction throttling logic as currently
implemented is highly unsuitable for wide rows but thats a different issue)
* in_memory_compaction_limit_in_mb: 128

Column family characteristics:
* Many wide rows (~5% of rows greater than > 10MB and hundreds of rows greater than 100
MB, with many small columns).
* Heavy use of expiring columns - each row represents data for a particular hour so typically
all columns in the row will expire together.
* The significant size shrinkage as reported by the log messages is due mainly to expired
data being cleaned up (I typically trigger major compaction when 30-50% of the on disk data
has expired which is about once every 3 weeks per node).
* Perhaps obviously: size tiered compaction and no compression (the schema has not changed
since the partial upgrade to 1.0.x)
* Standard column family

Performance notes during compaction:
* Nice CPU usage and load average is basically the same between 0.8.6 and 1.0.3 - ie, compaction
IS running and is not getting stalled or hung up anywhere. 
* Compaction is IO bound on the 0.8.6 machines - the disks see heavy, constant utilization
when compaction is running.
* Compaction is uses virtually no CPU on the 1.0.3 machines - disk utilization is virtually
no different when compacting vs not compacting (but at the same time, CPU usage and load average
clearly indicate that compaction IS running).

Finally, I have not had time to profile more thoroughly but jconsole always shows the following
stacktrace for the active compaction thread (for the 1.0.3 machine):

Stack trace: 

I have not really had time to debug the issue further but given the stack trace and column
family characteristics, I suspect some problem with the column family column iterator .remove()
method. If the underlying ISortedColumns implementation is a ArrayBackedSortedColumns instance,
then repeated .remove() calls could be incredibly inefficient.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:!default.jspa
For more information on JIRA, see:


View raw message