cassandra-commits mailing list archives

From Apache Wiki <>
Subject [Cassandra Wiki] Update of "LargeDataSetConsiderations" by PeterSchuller
Date Sat, 18 Dec 2010 16:23:57 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "LargeDataSetConsiderations" page has been changed by PeterSchuller.


  Unless otherwise noted, the points refer to Cassandra 0.7 and above.
+  * Disk space usage in Cassandra can vary fairly suddenly over time. If you have significant
amounts of data such that available disk space is not significantly higher than usage, consider:
+   * Compaction of a column family can temporarily as much as double the disk space used by
+that column family (in the case of a major compaction with no deletions).
+   * Repair operations can increase disk space demands (particularly in 0.6, less so in 0.7;
TODO: provide actual maximum growth and what it depends on).
   * As your data set becomes larger and larger (assuming significantly larger than memory),
you become more and more dependent on caching to elide I/O operations. As you plan and test
your capacity, keep in mind that:
    * The Cassandra row cache lives in the JVM heap and is unaffected by compactions
and repair operations (it remains warm).
    * The key cache is affected by compaction and repair.
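
The compaction point above can be turned into a rough headroom check. The following is a minimal sketch and not part of Cassandra: the function names, the `concurrent` flag, and the sample sizes are hypothetical. It assumes only the worst case stated above, namely that a major compaction with no deletions temporarily doubles the footprint of the column family being compacted. Repair overhead is not modeled, since its maximum growth is not given here.

```python
def peak_usage_during_compaction(cf_sizes, concurrent=False):
    """Estimate worst-case disk usage while major compactions run.

    cf_sizes: dict mapping column family name -> current on-disk size
              (any consistent unit, e.g. GB). Hypothetical input; in
              practice these numbers would come from your own monitoring.
    concurrent: if False, assume one column family is compacted at a
                time, so only the largest one needs a temporary copy;
                if True, assume every column family is compacted at once.
    """
    if not cf_sizes:
        return 0
    total = sum(cf_sizes.values())
    # Worst case: the compacted data is written out in full before the
    # old files are removed, so the extra space equals the input size.
    extra = total if concurrent else max(cf_sizes.values())
    return total + extra


def has_headroom(cf_sizes, disk_capacity, concurrent=False):
    """True if the estimated worst-case peak fits on the disk."""
    return peak_usage_during_compaction(cf_sizes, concurrent) <= disk_capacity


if __name__ == "__main__":
    sizes = {"users": 100, "events": 300}  # made-up sizes in GB
    print(peak_usage_during_compaction(sizes))
    print(has_headroom(sizes, disk_capacity=750))
    print(has_headroom(sizes, disk_capacity=750, concurrent=True))
```

With these made-up sizes, a 750 GB disk passes the check if column families are compacted one at a time but fails it if both could be compacted simultaneously, which is the kind of sudden variation the first bullet warns about.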
