incubator-cassandra-user mailing list archives

From Aaron Morton <>
Subject Re: memtable sstable questions (0.6.4)
Date Wed, 20 Oct 2010 19:53:14 GMT
Take a look at the section on JVM Heap size here

Column families (CFs) have a large overhead; keyspaces have little or none.

In general, write performance will be affected by the memtable thresholds (also covered at
the link above). Read performance will be affected by the size of the Cassandra caches and
the OS file caches. Compaction can slow a node; 0.7 handles this better via the dynamic snitch.

Start with conservative / default values, then crank things up. 

On 21 Oct, 2010, at 08:42 AM, CassUser CassUser <cassuser@gmailcom> wrote:

Thanks for the link.  

#2 was not meant to be a trick question, it just came out like that :).  What I was after is
the overhead associated with a large number of keyspaces and column families (I didn't mean
empty memtables :).  If a few keyspaces each have 20 or so column families with a percentage
of rows cached, does this affect write performance to other keyspaces in the cluster?

On Wed, Oct 20, 2010 at 12:01 PM, Edward Capriolo <> wrote:

On Wed, Oct 20, 2010 at 2:47 PM, CassUser CassUser <> wrote:
> Hey,
> As I understand it writes go directly to the commit log.  Once a threshold
> has been reached the data is shipped to a memtable, and again to an sstable.
> 1. How many memtables are created when a flush happens from a commit log?
> One per CF?
> 2. Is there any space associated with an empty memtable?
> 3. When a flush happens from a memtable to an sstable, does this create a
> single new sstable?
> 4. Should compaction be turned off during a large data load?
> Thanks.

Take a look at:

1 and 3
Memtables flush for three reasons: size, time, and number of
operations. There is one memtable per column family. Each memtable
flushes individually.
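
As a rough illustration of the three flush triggers, here is a toy Python
model of one column family's memtable. The class name, parameter names and
default threshold values are hypothetical and chosen only for illustration;
this is a sketch, not Cassandra's implementation.

```python
import time


class ToyMemtable:
    """Toy model of a single column family's memtable (not Cassandra code)."""

    def __init__(self, max_bytes=64 * 1024 * 1024, max_ops=300_000,
                 max_age_seconds=3600, clock=time.monotonic):
        # All three thresholds are illustrative assumptions.
        self.max_bytes = max_bytes
        self.max_ops = max_ops
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.created_at = clock()
        self.bytes_written = 0
        self.ops = 0
        self.rows = {}

    def write(self, key, column, value):
        # Record the write and update the size and operation counters.
        self.rows.setdefault(key, {})[column] = value
        self.bytes_written += len(key) + len(column) + len(value)
        self.ops += 1

    def flush_reason(self):
        """Return which threshold was crossed, or None if none was."""
        if self.bytes_written >= self.max_bytes:
            return "size"
        if self.ops >= self.max_ops:
            return "operations"
        if self.clock() - self.created_at >= self.max_age_seconds:
            return "time"
        return None
```

Because each column family would own its own instance, one memtable crossing
a threshold says nothing about the others, which matches the "flushes
individually" point above.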

2. Is this a trick question?

4. Should compaction be turned off during a large data load?
You can disable compaction during bulk loads. This can help because
otherwise the same data might be compacted multiple times. However, if
you go too long with compaction turned off you end up with many
sstables. This can leave rows fragmented across them.
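
To make the fragmentation point concrete, here is a toy Python sketch in
which each sstable is just a dict of row key to columns. The function names
and the sample data are made up for illustration, and list order stands in
for write time; the real system resolves columns by timestamp and uses bloom
filters to skip sstables, which this sketch ignores.

```python
def read_row(sstables, key):
    """Merge a row's columns from every sstable that holds part of it.

    Returns (merged_columns, number_of_sstables_touched).
    """
    merged, touched = {}, 0
    for table in sstables:  # oldest first, so newer values overwrite older
        if key in table:
            merged.update(table[key])
            touched += 1
    return merged, touched


def compact(sstables):
    """Merge all sstables into one, with later tables winning per column."""
    merged = {}
    for table in sstables:
        for key, columns in table.items():
            merged.setdefault(key, {}).update(columns)
    return [merged]


# Three flushes, each holding a fragment of the same row.
sstables = [
    {"user1": {"name": "a"}},
    {"user1": {"email": "b"}},
    {"user1": {"name": "c"}},
]
row_before, touched_before = read_row(sstables, "user1")
row_after, touched_after = read_row(compact(sstables), "user1")
```

In this model the read before compaction has to visit every sstable that
holds a piece of the row, while the read after compaction visits one and
returns the same columns.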
