cassandra-user mailing list archives

From CassUser CassUser <cassu...@gmail.com>
Subject Re: memtable sstable questions (0.6.4)
Date Wed, 20 Oct 2010 19:42:57 GMT
Thanks for the link.

#2 was not meant to be a trick question; it just came out like that :).  What
I was after is the overhead associated with a large number of keyspaces and
column families (I didn't mean empty memtables :).  If a few keyspaces have
20 or so column families, with a percentage of rows cached, does this
affect write performance to other keyspaces in the cluster?



On Wed, Oct 20, 2010 at 12:01 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> On Wed, Oct 20, 2010 at 2:47 PM, CassUser CassUser <cassuser@gmail.com>
> wrote:
> > Hey,
> >
> > As I understand it writes go directly to the commit log.  Once a threshold
> > has been reached the data is shipped to a memtable, and again to an sstable.
> >
> > 1. How many memtables are created when a flush happens from a commit log?
> > One per CF?
> >
> > 2. Is there any space associated with an empty memtable?
> >
> > 3. When a flush happens from a memtable to an sstable, does this create a
> > single new sstable?
> >
> > 4. Should compaction be turned off during a large data load?
> >
> > Thanks.
> >
>
> Take a look at:
>
>
> http://wiki.apache.org/cassandra/MemtableSSTable
>
> 1 and 3
> Memtables flush for three reasons: size, time, and number of
> operations. There is one memtable per column family. Each memtable
> flushes individually.
>
> 2. Is this a trick question?
>
> 4. Should compaction be turned off during a large data load?
> You can disable compaction during bulk loads. This can help because
> otherwise the same data might be compacted multiple times. However, if
> you go too long with compaction turned off, you end up with many
> sstables. This can result in rows fragmented across several sstables.
>
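As a rough illustration of the points quoted above (one memtable per column
family, a flush triggered by size, operation count, or age, one new sstable
per flush, and rows spread across sstables until compaction runs), here is a
toy Python model. It is only a sketch: the class names, the threshold
parameters (max_bytes, max_ops, max_age_seconds), and the helper methods are
all made up for the example and are not Cassandra's actual code or
configuration settings.

```python
import time


class Memtable:
    """Toy in-memory write buffer for one column family (hypothetical)."""

    def __init__(self, max_bytes, max_ops, max_age_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_ops = max_ops
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self._reset()

    def _reset(self):
        self.rows = {}
        self.bytes_written = 0
        self.ops = 0
        self.created_at = self.clock()

    def write(self, row_key, column, value):
        self.rows.setdefault(row_key, {})[column] = value
        self.bytes_written += len(row_key) + len(column) + len(value)
        self.ops += 1

    def should_flush(self):
        # Assumption of this sketch: an empty memtable is never flushed.
        if self.ops == 0:
            return False
        return (
            self.bytes_written >= self.max_bytes                         # size
            or self.ops >= self.max_ops                                  # operations
            or self.clock() - self.created_at >= self.max_age_seconds   # time
        )

    def flush(self):
        """Return the buffered rows as one key-sorted 'sstable' and start over."""
        sstable = {key: dict(cols) for key, cols in sorted(self.rows.items())}
        self._reset()
        return sstable


class ColumnFamilyStore:
    """One memtable plus the sstables flushed from it (hypothetical)."""

    def __init__(self, name, **limits):
        self.name = name
        self.memtable = Memtable(**limits)
        self.sstables = []

    def write(self, row_key, column, value):
        self.memtable.write(row_key, column, value)
        if self.memtable.should_flush():
            # Each flush produces exactly one new sstable in this model.
            self.sstables.append(self.memtable.flush())

    def sstables_containing(self, row_key):
        return sum(1 for sstable in self.sstables if row_key in sstable)

    def compact(self):
        """Merge all sstables into one; newer column values overwrite older."""
        merged = {}
        for sstable in self.sstables:  # oldest first
            for key, cols in sstable.items():
                merged.setdefault(key, {}).update(cols)
        self.sstables = [dict(sorted(merged.items()))]


# Bulk-load one row with a tiny operation threshold and no compaction.
store = ColumnFamilyStore("users", max_bytes=10**6, max_ops=2, max_age_seconds=3600)
for i in range(6):
    store.write("row1", f"col{i}", "v")
print(len(store.sstables), store.sstables_containing("row1"))  # 3 3
store.compact()
print(len(store.sstables), store.sstables_containing("row1"))  # 1 1
```

In this model the row ends up in three sstables after the load, so a read
would have to consult all three until compact() merges them into one. That is
the fragmentation trade-off described in the answer to #4.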
