cassandra-user mailing list archives

From Nikolai Grigoriev <ngrigor...@gmail.com>
Subject Re: Compaction Strategy guidance
Date Mon, 24 Nov 2014 14:18:53 GMT
Andrei,

Oh, Monday mornings...Tb :)
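
(To spell out the arithmetic behind the correction: 5195 sstables at 256Mb
each is roughly 1.3Tb, and the "Space used (live)" figure below is
1266060391852 bytes, i.e. about 1.26Tb - so that line should have read Tb,
not Gb.)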

On Mon, Nov 24, 2014 at 9:12 AM, Andrei Ivanov <aivanov@iponweb.net> wrote:

> Nikolai,
>
> Are you sure about 1.26Gb? It doesn't look right - 5195 sstables with a
> 256Mb sstable size...
>
> Andrei
>
> On Mon, Nov 24, 2014 at 5:09 PM, Nikolai Grigoriev <ngrigoriev@gmail.com>
> wrote:
> > Jean-Armel,
> >
> > I have only two large tables, the rest are super-small. In the test
> > cluster of 15 nodes the largest table has about 110M rows. Its total
> > size is about 1.26Gb per node (total disk space used per node for that
> > CF). It's got about 5K sstables per node - the sstable size is 256Mb.
> > cfstats on a "healthy" node looks like this:
> >
> >     Read Count: 8973748
> >     Read Latency: 16.130059053251774 ms.
> >     Write Count: 32099455
> >     Write Latency: 1.6124713938912671 ms.
> >     Pending Tasks: 0
> >         Table: wm_contacts
> >         SSTable count: 5195
> >         SSTables in each level: [27/4, 11/10, 104/100, 1053/1000, 4000, 0, 0, 0, 0]
> >         Space used (live), bytes: 1266060391852
> >         Space used (total), bytes: 1266144170869
> >         SSTable Compression Ratio: 0.32604853410787327
> >         Number of keys (estimate): 25696000
> >         Memtable cell count: 71402
> >         Memtable data size, bytes: 26938402
> >         Memtable switch count: 9489
> >         Local read count: 8973748
> >         Local read latency: 17.696 ms
> >         Local write count: 32099471
> >         Local write latency: 1.732 ms
> >         Pending tasks: 0
> >         Bloom filter false positives: 32248
> >         Bloom filter false ratio: 0.50685
> >         Bloom filter space used, bytes: 20744432
> >         Compacted partition minimum bytes: 104
> >         Compacted partition maximum bytes: 3379391
> >         Compacted partition mean bytes: 172660
> >         Average live cells per slice (last five minutes): 495.0
> >         Average tombstones per slice (last five minutes): 0.0
> >
> > Another table of similar structure (same number of rows) is about 4x
> > smaller. That table does not suffer from those issues - it compacts
> > well and efficiently.
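> >
> > (For reference, the 256Mb above is just the table's LCS setting. A
> > sketch of how such a table would be configured - assuming CQL3 syntax,
> > using our table name for illustration:
> >
> >     ALTER TABLE wm_contacts
> >       WITH compaction = { 'class': 'LeveledCompactionStrategy',
> >                           'sstable_size_in_mb': 256 };
> >
> > At 256Mb per sstable, ~1.26Tb of data works out to roughly 5K sstables,
> > which is why the cfstats above show L0 plus four populated levels.)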
> >
> > On Mon, Nov 24, 2014 at 2:30 AM, Jean-Armel Luce <jaluce06@gmail.com>
> > wrote:
> >>
> >> Hi Nikolai,
> >>
> >> Could you please clarify a little bit what you call "a large amount of
> >> data"?
> >>
> >> How many tables?
> >> How many rows in your largest table?
> >> How many GB in your largest table?
> >> How many GB per node?
> >>
> >> Thanks.
> >>
> >>
> >>
> >> 2014-11-24 8:27 GMT+01:00 Jean-Armel Luce <jaluce06@gmail.com>:
> >>>
> >>> Hi Nikolai,
> >>>
> >>> Thanks for that information.
> >>>
> >>> Could you please clarify a little bit what you call "a large amount of data"?
> >>>
> >>> 2014-11-24 4:37 GMT+01:00 Nikolai Grigoriev <ngrigoriev@gmail.com>:
> >>>>
> >>>> Just to clarify - when I was talking about the large amount of data I
> >>>> really meant a large amount of data per node in a single CF (table).
> >>>> LCS does not seem to like it when it gets thousands of sstables (which
> >>>> makes for 4-5 levels).
> >>>>
> >>>> When bootstrapping a new node you'd better enable that option from
> >>>> CASSANDRA-6621 (the one that disables STCS in L0). But it will still
> >>>> be a mess - I have a node that I bootstrapped ~2 weeks ago. Initially
> >>>> it had 7.5K pending compactions; now it has almost stabilized at 4.6K
> >>>> and does not go down. The number of sstables in L0 is over 11K and it
> >>>> is slowly, slowly building the upper levels. The total number of
> >>>> sstables is 4x the normal amount. I am not entirely sure this node
> >>>> will ever get back to normal life. And believe me - this is not
> >>>> because of I/O: I have SSDs everywhere and 16 physical cores, and this
> >>>> machine is barely using 1-3 cores most of the time. The problem is
> >>>> that allowing the STCS fallback is not a good option either - it will
> >>>> quickly result in a few 200Gb+ sstables in my configuration, and those
> >>>> sstables will never be compacted. Plus, it will require close to 2x
> >>>> disk space on EVERY disk in my JBOD configuration... this will kill
> >>>> the node sooner or later. This all happens because after bootstrap
> >>>> every sstable ends up in L0, and the process only slowly moves them to
> >>>> the other levels. If you have write traffic to that CF, the number of
> >>>> sstables in L0 will grow quickly - like it is happening in my case now.
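> >>>>
> >>>> (For reference - the option from CASSANDRA-6621 is a JVM system
> >>>> property, not a table setting. A sketch of how I understand it is
> >>>> enabled, e.g. in cassandra-env.sh; double-check the property name
> >>>> against the ticket for your version:
> >>>>
> >>>>     JVM_OPTS="$JVM_OPTS -Dcassandra.disable_stcs_in_l0=true"
> >>>>
> >>>> It only skips the STCS-in-L0 fallback; it does not change the table's
> >>>> compaction strategy itself.)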
> >>>>
> >>>> Once something like https://issues.apache.org/jira/browse/CASSANDRA-8301
> >>>> is implemented it may be better.
> >>>>
> >>>>
> >>>> On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov <aivanov@iponweb.net>
> >>>> wrote:
> >>>>>
> >>>>> Stephane,
> >>>>>
> >>>>> We have a somewhat similar C* load profile, hence some comments in
> >>>>> addition to Nikolai's answer.
> >>>>> 1. Fallback to STCS - you can actually disable it.
> >>>>> 2. Based on our experience, if you have a lot of data per node, LCS
> >>>>> may work just fine. That is, until the moment you decide to join
> >>>>> another node - chances are that the newly added node will not be able
> >>>>> to compact what it gets from the old nodes. In your case, if you
> >>>>> switch strategy the same thing may happen. This is all due to the
> >>>>> limitations mentioned by Nikolai (a quick way to watch for this is
> >>>>> sketched below).
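> >>>>>
> >>>>> (A simple way to see whether a joined node is keeping up - just the
> >>>>> standard tooling, nothing specific to our setup:
> >>>>>
> >>>>>     nodetool compactionstats    # pending tasks + currently running compactions
> >>>>>
> >>>>> If the pending tasks count keeps growing after the join, the node is
> >>>>> not compacting fast enough.)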
> >>>>>
> >>>>> Andrei,
> >>>>>
> >>>>>
> >>>>> On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G. <smgesi@gmail.com>
> >>>>> wrote:
> >>>>> > ABUSE
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > I DO NOT WANT ANY MORE MAILS. I AM FROM MEXICO
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > From: Nikolai Grigoriev [mailto:ngrigoriev@gmail.com]
> >>>>> > Sent: Saturday, November 22, 2014 07:13 PM
> >>>>> > To: user@cassandra.apache.org
> >>>>> > Subject: Re: Compaction Strategy guidance
> >>>>> > Importance: High
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > Stephane,
> >>>>> >
> >>>>> > As with everything good, LCS comes at a certain price.
> >>>>> >
> >>>>> > LCS will put the most load on your I/O system (if you use spindles
> >>>>> > you may need to be careful about that) and on CPU. Also, LCS (by
> >>>>> > default) may fall back to STCS if it is falling behind (which is
> >>>>> > very possible with heavy write activity), and this will result in
> >>>>> > higher disk space usage. LCS also has a certain limitation I have
> >>>>> > discovered lately: sometimes LCS may not be able to use all of your
> >>>>> > node's resources (algorithm limitations), and this reduces the
> >>>>> > overall compaction throughput. This may happen if you have a large
> >>>>> > column family with lots of data per node. STCS won't have this
> >>>>> > limitation.
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > By the way, the primary goal of LCS is to reduce the number of
> >>>>> > sstables C* has to look at to find your data. With LCS functioning
> >>>>> > properly this number will most likely be between 1 and 3 for most
> >>>>> > of the reads. But if you do few reads and are not concerned about
> >>>>> > the latency today, LCS will most likely only save you some disk
> >>>>> > space.
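> >>>>> >
> >>>>> > (If you want to check that number on your own cluster, the standard
> >>>>> > tooling reports the sstables-touched-per-read distribution - a
> >>>>> > sketch, with placeholder keyspace/table names:
> >>>>> >
> >>>>> >     nodetool cfhistograms my_keyspace my_table
> >>>>> >
> >>>>> > and look at the "SSTables" column of the output.)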
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay <slegay@looplogic.com>
> >>>>> > wrote:
> >>>>> >
> >>>>> > Hi there,
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > use case:
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > - Heavy write app, few reads.
> >>>>> >
> >>>>> > - Lots of updates of rows / columns.
> >>>>> >
> >>>>> > - Current performance is fine for both writes and reads.
> >>>>> >
> >>>>> > - Currently using SizeTieredCompactionStrategy
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > We're trying to limit the amount of storage used during compaction.
> >>>>> > Should we switch to LeveledCompactionStrategy?
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > Thanks
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > --
> >>>>> >
> >>>>> > Nikolai Grigoriev
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Nikolai Grigoriev
> >>>>
> >>>
> >>
> >
> >
> >
> > --
> > Nikolai Grigoriev
> >
>



-- 
Nikolai Grigoriev
(514) 772-5178
