incubator-cassandra-user mailing list archives

From CassUser CassUser <cassu...@gmail.com>
Subject Re: memtable sstable questions (0.6.4)
Date Wed, 20 Oct 2010 20:17:43 GMT
Cool thanks, that helps.

So even if a column family defined in storage-conf.xml is empty, it still
carries some overhead in Cassandra, and the following rule of thumb for heap
size should apply:

memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches.
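
As a rough illustration of that rule, here is a minimal Python sketch. The
function name, the parameter names other than memtable_throughput_in_mb, and
the sample numbers are all made up for the example; "1G" is taken as 1024 MB,
and the internal cache size is something you would have to estimate yourself.

```python
def estimate_heap_mb(memtable_throughput_in_mb, hot_cfs,
                     internal_caches_mb=0, base_overhead_mb=1024):
    """Rule-of-thumb heap estimate in MB.

    memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches
    """
    memtable_part = memtable_throughput_in_mb * 3 * hot_cfs
    return memtable_part + base_overhead_mb + internal_caches_mb


if __name__ == "__main__":
    # Hypothetical node: 64 MB memtable throughput, 20 hot CFs, 512 MB of caches.
    print(estimate_heap_mb(64, 20, internal_caches_mb=512))
```

Only the CFs that are actually taking writes ("hot") count toward the first
term, so it is the number of busy CFs that drives the estimate.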



On Wed, Oct 20, 2010 at 12:53 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

> Take a look at the section on JVM Heap size here
> http://wiki.apache.org/cassandra/MemtableThresholds
>
> CFs have a large overhead; keyspaces have little or none.
>
> In general, write performance will be affected by the memtable thresholds
> (also on the link above). Read performance will be affected by the size of
> the Cassandra caches and OS file caches. Compaction can slow a node; 0.7
> handles this better via the dynamic snitch.
>
> Start with conservative / default values, then crank things up.
>
> Aaron
>
> On 21 Oct, 2010, at 08:42 AM, CassUser CassUser <cassuser@gmail.com> wrote:
>
> Thanks for the link.
>
> #2 was not meant to be a trick question, it just came out like that :). What
> I was after is the overhead associated with a large number of keyspaces and
> column families (I didn't mean empty memtables :). If a few keyspaces have
> 20 or so column families with a percentage of rows cached, does this
> affect write performance to other keyspaces in the cluster?
>
>
>
> On Wed, Oct 20, 2010 at 12:01 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>
>>
>> On Wed, Oct 20, 2010 at 2:47 PM, CassUser CassUser <cassuser@gmail.com>
>> wrote:
>> > Hey,
>> >
>> > As I understand it, writes go directly to the commit log. Once a threshold
>> > has been reached the data is shipped to a memtable, and again to an sstable.
>> >
>> > 1. How many memtables are created when a flush happens from a commit log?
>> > One per CF?
>> >
>> > 2. Is there any space associated with an empty memtable?
>> >
>> > 3. When a flush happens from a memtable to an sstable, does this create a
>> > single new sstable?
>> >
>> > 4. Should compaction be turned off during a large data load?
>> >
>> > Thanks.
>> >
>>
>> Take a look at:
>>
>>
>> http://wiki.apache.org/cassandra/MemtableSSTable
>>
>> 1 and 3.
>> Memtables flush for three reasons: size, time, and number of
>> operations. There is one memtable per column family. Each memtable
>> flushes individually.
>>
>> 2. Is this a trick question?
>>
>> 4. Should compaction be turned off during a large data load?
>> You can disable compaction during bulk loads. This can help because
>> otherwise the same data might be compacted multiple times. However, if
>> you go too long with compaction turned off you end up with many
>> sstables, which can leave rows fragmented across them.
>>
>
>
