From: CassUser CassUser <cassuser@gmail.com>
To: user@cassandra.apache.org
Date: Wed, 20 Oct 2010 12:42:57 -0700
Subject: Re: memtable sstable questions (0.6.4)

Thanks for the link.

#2 was not meant to be a trick question, it just came out like that :). What I was after is the overhead associated with a large number of keyspaces and column families (I didn't mean empty memtables :). If a few keyspaces each have 20 or so column families, with a percentage of rows cached, does this affect write performance to other keyspaces in the cluster?



On Wed, Oct 20, 2010 at 12:01 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
On Wed, Oct 20, 2010 at 2:47 PM, CassUser CassUser <cassuser@gmail.com> wrote:
> Hey,
>
> As I understand it writes go directly to the commit log. Once a threshold
> has been reached the data is shipped to a memtable, and again to an sstable.
>
> 1. How many memtables are created when a flush happens from a commit log?
> One per CF?
>
> 2. Is there any space associated with an empty memtable?
>
> 3. When a flush happens from a memtable to an sstable, does this create a
> single new sstable?
>
> 4. Should compaction be turned off during a large data load?
>
> Thanks.
>

Take a look at:


http://wiki.apache.org/cassandra/MemtableSSTable

1 and 3
Memtables flush for three reasons: size, time, and number of
operations. There is one memtable per column family. Each memtable
flushes individually.
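
As a rough illustration of the points above (three flush triggers, one memtable per column family, each flushing on its own), here is a toy Python model. It is only a sketch, not Cassandra code: the class names, the `max_bytes` / `max_ops` / `max_age_seconds` limits, and the assumption that one flush yields one new sstable are all hypothetical simplifications.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Memtable:
    """Toy in-memory write buffer for one column family (hypothetical)."""
    max_bytes: int
    max_ops: int
    max_age_seconds: float
    created_at: float = field(default_factory=time.monotonic)
    rows: dict = field(default_factory=dict)
    bytes_written: int = 0
    ops: int = 0

    def write(self, key, column, value):
        self.rows.setdefault(key, {})[column] = value
        self.bytes_written += len(key) + len(column) + len(value)
        self.ops += 1

    def flush_reason(self, now=None):
        """Return the threshold that was crossed, or None if none was."""
        now = time.monotonic() if now is None else now
        if self.bytes_written >= self.max_bytes:
            return "size"
        if self.ops >= self.max_ops:
            return "operations"
        if now - self.created_at >= self.max_age_seconds:
            return "time"
        return None


class Store:
    """Holds one memtable per column family; each flushes independently."""

    def __init__(self, column_families, **limits):
        self.limits = limits
        self.memtables = {cf: Memtable(**limits) for cf in column_families}
        self.sstables = {cf: [] for cf in column_families}

    def write(self, cf, key, column, value):
        memtable = self.memtables[cf]
        memtable.write(key, column, value)
        if memtable.flush_reason() is not None:
            self.flush(cf)

    def flush(self, cf):
        # Assumption: one flush writes one new sorted table for this CF only.
        memtable = self.memtables[cf]
        self.sstables[cf].append(sorted(memtable.rows.items()))
        self.memtables[cf] = Memtable(**self.limits)
```

In this model, writing enough to one column family to cross a limit flushes only that column family's memtable; the others keep buffering until they cross a limit of their own.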

2. Is this a trick question?

4. Should compaction be turned off during a large data load?
You can disable compaction during bulk loads. This can help because
otherwise the same data might be compacted multiple times. However, if
you go too long with compaction turned off, you end up with multiple
sstables. This can result in fragmented rows.
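
The trade-off described above can be sketched with a deliberately naive Python model. It assumes a made-up policy that merges every sstable into one whenever `compact_every` tables exist. That is not how Cassandra actually chooses compaction candidates; the function and parameter names are hypothetical and the numbers only illustrate the shape of the trade-off.

```python
def bulk_load(num_flushes, compact_every=None):
    """Simulate a bulk load as a series of memtable flushes.

    Each sstable is modelled as the set of flush ids whose data it holds.
    Returns (sstables, units_rewritten), where units_rewritten counts how
    many flushes' worth of data compaction had to write again.
    """
    sstables = []
    units_rewritten = 0
    for flush_id in range(num_flushes):
        sstables.append({flush_id})
        if compact_every and len(sstables) >= compact_every:
            # Naive policy (an assumption): merge everything into one table.
            merged = set().union(*sstables)
            units_rewritten += len(merged)
            sstables = [merged]
    return sstables, units_rewritten


with_compaction = bulk_load(8, compact_every=4)
without_compaction = bulk_load(8)
```

Under these assumptions, eight flushes with compaction on leave 2 sstables but rewrite 11 flush-units of data, because the earliest flushes are merged twice. With compaction off nothing is rewritten, but 8 sstables remain, so a row written during every flush has fragments in all 8.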
