cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Freeing up disk space on Cassandra 1.1.5 with Size-Tiered compaction.
Date Thu, 22 Nov 2012 20:18:54 GMT
Hi Alexandru,

"We are running a 3 node Cassandra 1.1.5 cluster with a 3TB Raid 0 disk per
node for the data dir and separate disk for the commitlog, 12 cores, 24 GB
RAM"

I think you should size your architecture in a very different way. From
what I know, having too much data on one node is bad: performance will go
down because structures like indexes and bloom filters grow with the amount
of data per node (I may be wrong on the exact reasons, but I'm quite sure
you can't store too much data per node).

Anyway, I think 6 nodes with half of these resources (6 cores / 12 GB) would
be better, if you have the choice.

"(12GB to Cassandra heap)."

The max heap recommended is 8 GB, because above that the GC pauses will
start degrading your performance.
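For example, in conf/cassandra-env.sh (the exact values here are a sketch, tune them for your hardware):

```shell
# cassandra-env.sh -- cap the heap at the recommended 8 GB instead of
# letting it auto-size to half of the 24 GB of RAM.
MAX_HEAP_SIZE="8G"
# Young generation; ~100 MB per core is the usual rule of thumb.
HEAP_NEWSIZE="1200M"
```

The rest of the RAM is not wasted: the OS page cache will use it for your data files.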

"We now have 1.1 TB worth of data per node (RF = 2)."

You should use RF=3, unless either consistency or avoiding a SPOF doesn't
matter to you.

With RF=2 you are obliged to write at CL.ONE to remove the single point of
failure.
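A quick way to see why (just the quorum arithmetic, nothing Cassandra-specific):

```python
# Quorum size for a given replication factor: floor(RF/2) + 1
def quorum(rf):
    return rf // 2 + 1

# With RF=2, QUORUM needs both replicas: one node down blocks quorum
# reads/writes, so you are forced down to CL.ONE.
print(quorum(2))  # -> 2 (no replica may be down)

# With RF=3, QUORUM needs only 2 of 3: one node can fail and you still
# get strong consistency with reads and writes at QUORUM.
print(quorum(3))  # -> 2 (one replica may be down)
```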

"1. Start issuing regular major compactions (nodetool compact).
     - This is not recommended:
            - Stops minor compactions.
            - Major performance hit on node (very bad for us because need
to be taking data all the time)."

Actually, major compaction *does not* stop minor compactions. What happens
is that, due to the size of the sstable that remains after your major
compaction, it will never be compacted with the upcoming new sstables, and
because of that, your read performance will go down until you run another
major compaction.
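A toy sketch of the size-tiered bucketing logic makes this visible (the 0.5/1.5 bounds are the strategy's defaults as I understand them, and the sizes are made up):

```python
# Simplified sketch of size-tiered bucketing: sstables are only
# compacted together when their sizes are "similar" (within
# bucket_low/bucket_high of the bucket's average).
def bucket(sstable_sizes, bucket_low=0.5, bucket_high=1.5):
    buckets = []
    for size in sorted(sstable_sizes):
        for b in buckets:
            avg = sum(b) / len(b)
            if bucket_low * avg <= size <= bucket_high * avg:
                b.append(size)
                break
        else:
            buckets.append([size])
    return buckets

# One 1 TB sstable left by a major compaction, plus fresh ~10 GB
# flushes (sizes in GB): the giant sstable sits alone in its bucket,
# so minor compactions never touch it again.
print(bucket([1_000_000, 10, 12, 11, 9]))  # -> [[9, 10, 11, 12], [1000000]]
```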

"2. Switch to Leveled compaction strategy.
      - It is mentioned to help with deletes and disk space usage. Can
someone confirm?"

 From what I know, Leveled compaction will not free disk space. It will
allow you to use a greater percentage of your total disk space (50% max for
size-tiered compaction vs. about 80% for leveled compaction).
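Rough arithmetic on your 3 TB disks (the 50%/80% figures are rules of thumb, not hard limits):

```python
disk_gb = 3000

# Size-tiered: a (major) compaction may rewrite every sstable at once,
# so in the worst case you need free space equal to the data size.
stcs_usable = disk_gb * 0.50

# Leveled: compactions work on small fixed-size sstables, so the
# temporary overhead stays small and more of the disk is usable.
lcs_usable = disk_gb * 0.80

print(stcs_usable, lcs_usable)  # -> 1500.0 2400.0
```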

"Our usage pattern is write once, read once (export) and delete once! "

In this case, I think that leveled compaction fits your needs.
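Switching is a per-CF change; something like this from cassandra-cli (the CF name and sstable size are placeholders, and you should check the exact syntax against the 1.1 docs):

```
update column family MyCF
  with compaction_strategy = 'LeveledCompactionStrategy'
  and compaction_strategy_options = {sstable_size_in_mb: 10};
```

Be aware that after the switch, existing sstables have to be recompacted into levels, which generates a lot of compaction activity at first.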

"Can anyone suggest which (if any) is better? Are there better solutions?"

Are your sstables compressed? There are 2 types of built-in compression, and
you may use them depending on the model of each of your CFs.

see:
http://www.datastax.com/docs/1.1/operations/tuning#configure-compression
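For example, from cassandra-cli (the CF name is a placeholder; as I understand it, SnappyCompressor is faster while DeflateCompressor compresses better):

```
update column family MyCF
  with compression_options = {sstable_compression: 'SnappyCompressor',
                              chunk_length_kb: 64};
```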

Alain

2012/11/22 Alexandru Sicoe <adsicoe@gmail.com>

> We are running a 3 node Cassandra 1.1.5 cluster with a 3TB Raid 0 disk per
> node for the data dir and separate disk for the commitlog, 12 cores, 24 GB
> RAM (12GB to Cassandra heap).
