cassandra-user mailing list archives

From Viktor Jevdokimov
Subject RE: compaction throughput rate not even close to 16MB
Date Thu, 25 Apr 2013 06:04:28 GMT
Our experience with compactions shows that the more columns there are to merge for the same
row, the more CPU it takes.

For example, we tested two data models with supercolumns to choose between them (we still need
supercolumns, since composite columns lack some functionality):
  1. supercolumns with many columns
  2. supercolumns with one column (the columns from model 1 merged into one blob value)
We found that compaction with model 2 runs 4 times faster.

The same holds for regular column families.
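
To illustrate what "merged to one blob value" in model 2 could look like, here is a minimal sketch in Python. The message does not say which serialization was used, so JSON, the column names, and the helper names pack_columns/unpack_columns are all assumptions made for illustration, and no Cassandra client API is involved.

```python
import json

# Hypothetical example columns; the real schema is not given in the thread.
model_1 = {"name": "alice", "city": "Vilnius", "visits": "42"}


def pack_columns(columns: dict) -> bytes:
    """Serialize many column name/value pairs into one blob (JSON is an assumption)."""
    return json.dumps(columns, sort_keys=True).encode("utf-8")


def unpack_columns(blob: bytes) -> dict:
    """Recover the original column name/value pairs from the blob."""
    return json.loads(blob.decode("utf-8"))


# Model 2: a single column whose value holds everything from model 1.
model_2 = {"blob": pack_columns(model_1)}

print(len(model_1), "columns in model 1 ->", len(model_2), "column in model 2")
print(unpack_columns(model_2["blob"]) == model_1)
```

The trade-off in a layout like this is that a single column can no longer be read or updated without reading or rewriting the whole blob.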

Best regards / Pagarbiai

Viktor Jevdokimov
Senior Developer

Phone: +370 5 212 3063
Fax: +370 5 261 0453

J. Jasinskio 16C,
LT-01112 Vilnius,

Disclaimer: The information contained in this message and attachments is intended solely for
the attention and use of the named addressee and may be confidential. If you are not the intended
recipient, you are reminded that the information remains the property of the sender. You must
not use, disclose, distribute, copy, print or rely on this e-mail. If you have received this
message in error, please contact the sender immediately and irrevocably delete this message
and any copies.

> -----Original Message-----
> From: Hiller, Dean
> Sent: Wednesday, April 24, 2013 23:38
> Subject: Re: compaction throughput rate not even close to 16MB
> Thanks much!!!  Better to hear at least one other person sees the same thing
> ;).  Sometimes these posts just go silent.
> Dean
> From: Edward Capriolo
> Date: Wednesday, April 24, 2013 2:33 PM
> Subject: Re: compaction throughput rate not even close to 16MB
> I have noticed the same. I think in the "real" world your compaction
> throughput is limited by other things. If I had to speculate, I would say that
> compaction can remove expired tombstones; however, doing this requires
> bloom filter checks, etc.
> I think that setting is more important with multithreaded compaction and/or
> more compaction slots. In those cases it may actually throttle something.
> On Wed, Apr 24, 2013 at 3:54 PM, Hiller, Dean wrote:
> I was wondering about the compactionthroughput.  I never see ours get
> even close to 16MB/sec, and I thought this is supposed to throttle compaction,
> right?  Ours is constantly less than 3MB/sec from looking at our logs, or do I
> have this totally wrong?  How can I see the real throughput so that I can
> understand how to throttle it when I need to?
> 94,940,780 bytes to 95,346,024 (~100% of original) in 38,438ms =
> 2.365603MB/s.  2,350,114 total rows, 2,350,022 unique.  Row merge counts
> were {1:2349930, 2:92, }
> Thanks,
> Dean
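
As a cross-check of the quoted log line, the short Python sketch below recomputes the throughput and row counts from the logged figures. It assumes the rate is the output size in MiB divided by the elapsed seconds, and that each entry in the merge counts maps "number of sstables a row was merged from" to "number of rows". Both readings are inferred from the arithmetic, not stated in the thread, and the variable and function names are illustrative.

```python
# Figures copied from the quoted compaction log line.
bytes_in = 94_940_780
bytes_out = 95_346_024
elapsed_ms = 38_438
merge_counts = {1: 2_349_930, 2: 92}  # assumed: sstables merged -> rows

MIB = 1024 * 1024


def throughput_mib_per_s(num_bytes: int, millis: int) -> float:
    """Bytes processed per second, expressed in MiB/s."""
    return (num_bytes / MIB) / (millis / 1000.0)


rate = throughput_mib_per_s(bytes_out, elapsed_ms)

# A row merged from k sstables counts k times towards the total, once as unique.
total_rows = sum(k * rows for k, rows in merge_counts.items())
unique_rows = sum(merge_counts.values())

print(f"throughput: {rate:.6f} MiB/s")
print(f"size ratio: {bytes_out / bytes_in:.2%} of original")
print(f"rows: {total_rows} total, {unique_rows} unique")
```

Under these assumptions the results agree with the logged 2.365603MB/s, 2,350,114 total rows and 2,350,022 unique rows, so the logged figure describes the rate this one compaction actually achieved.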
