From: Jonathan Ellis <jbellis@gmail.com>
Date: Tue, 7 Feb 2012 12:40:08 -0600
Subject: Re: Leveled Compaction Strategy; Expected number of files over time?
To: user@cassandra.apache.org

It looks like what you're seeing is that stress far outpaced the
ability of compaction to keep up (which is normal for our default
settings, which prioritize maintaining request throughput over
compaction). As a result, LCS will grab a bunch of L0 sstables and
compact them together with L1, resulting in a spike of L1 sstables,
then compact those upwards into higher levels, gradually lowering the
sstable count.
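
To put rough numbers on where the sstables end up once compaction has
caught up, here is a small sketch. It assumes the usual LCS layout in
which L1 targets about 10 sstables and each higher level targets 10
times the level below it, and that lower levels are full before data
spills into the next one. The function names and the 100 GB figure are
made up for illustration; this is not Cassandra's code, and it ignores
L0 and the several component files each sstable has on disk.

```python
def level_capacities(max_level, fanout=10):
    """Target sstable count per level L1..max_level (assumed fanout of 10)."""
    return {level: fanout ** level for level in range(1, max_level + 1)}


def steady_state_layout(total_mb, sstable_size_mb=10, fanout=10):
    """Fill levels bottom-up (L1 first) until all sstables are placed.

    Returns {level: sstable_count} for a fully compacted column family,
    i.e. with nothing left waiting in L0.
    """
    remaining = -(-total_mb // sstable_size_mb)  # ceiling division
    layout = {}
    level = 1
    while remaining > 0:
        capacity = fanout ** level
        layout[level] = min(capacity, remaining)
        remaining -= layout[level]
        level += 1
    return layout


if __name__ == "__main__":
    # Hypothetical 100 GB of data with sstable_size_in_mb = 10.
    for level, count in steady_state_layout(100 * 1024).items():
        print(f"L{level}: {count} sstables")
```

Under those assumptions almost all of the sstables sit in the highest
level, so the total count settles near (data size / sstable size)
rather than continuing to shrink.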

It's unclear how to improve the "LCS can't keep up" case [1].  But
it's worth noting that a single large stress insert run, consisting as
it does of a large volume of unique rows, is the worst case for LCS.
This is the primary reason LCS is not the default: if you have an
append-mostly write load with few overwrites or deletes, LCS will do a
lot of extra i/o for no real benefit.
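
A toy model may make the "no real benefit" point concrete. The sketch
below treats an sstable as a plain dict of key -> (timestamp, value)
and a compaction as a newest-write-wins merge. Both are simplifications
invented for this example, not Cassandra's on-disk format or its
compaction code.

```python
def compact(sstables):
    """Merge sstables (dicts of key -> (timestamp, value)); newest write wins."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged


def rows_in(sstables):
    """Total rows that a compaction of these sstables has to read."""
    return sum(len(table) for table in sstables)


# Append-mostly load: every sstable holds keys no other sstable has.
unique = [{f"k{i}-{j}": (i, "v") for j in range(1000)} for i in range(5)]

# Overwrite-heavy load: the same 1000 keys are rewritten in every sstable.
overwrites = [{f"k{j}": (i, "v") for j in range(1000)} for i in range(5)]

if __name__ == "__main__":
    for name, tables in (("unique", unique), ("overwrites", overwrites)):
        print(name, "rows read:", rows_in(tables),
              "rows kept:", len(compact(tables)))
```

With unique keys the merge reads 5000 rows and writes 5000 rows back,
so the rewrite reclaims nothing; with overwrites the same amount of
reading collapses to 1000 rows.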

[1] https://issues.apache.org/jira/browse/CASSANDRA-3854

On Sun, Jan 22, 2012 at 10:26 PM, Chris Burroughs
<chris.burroughs@gmail.com> wrote:
> I inserted a large number of keys to a single node using stress.java [1]
> and let things sit for a while (several hours with no more inserts).
> After a bit I decided something might be up and started sampling the
> number of files in the data directory for 250 minutes while I played The
> Legend of Zelda.  At the start there were 78291 files, and the end
> 78599.  All I see in the log is a lot of "Compacting to" and "Compacted"
> messages.  The output of compactionstatus also seemed odd:
>
> $ ./bin/nodetool -h localhost -p 10101 compactionstats
> pending tasks: 3177
>          compaction type        keyspace   column family bytes compacted     bytes total  progress
>               Compaction       Keyspace1       Standard1       250298718               0       n/a
>
>
> Below is a graph showing an oscillation in the number of files.
>
> Is this how leveled compaction strategy is expected to behave?  If so,
> is it ever 'done'?
>
> http://img836.imageshack.us/img836/7294/levelcompactionfiles.png
>
> [1] (ran three times) ./bin/stress -d HOST --random -l 1 -o insert -c 25
> -e ONE --average-size-values -C 100 -t 75 -n 75000000
>
> with this config (duplicate options in original, but I don't think that
> should matter)
>
> update column family Standard1 with rows_cached=1000000 and
> keys_cached=0 and compaction_strategy = 'LeveledCompactionStrategy' and
> compaction_strategy_options = {sstable_size_in_mb:10} and
> compaction_strategy_options = {sstable_size_in_mb:10} and
> compression_options={sstable_compression:SnappyCompressor,
> chunk_length_kb:64} and row_cache_provider =
> 'ConcurrentLinkedHashCacheProvider' and row_cache_keys_to_save = 20000
> and row_cache_save_period = 120;


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com