cassandra-user mailing list archives

From Jonathan Haddad <...@jonhaddad.com>
Subject Re: Effect of frequent mutations / memtable
Date Fri, 26 May 2017 22:50:46 GMT
If you're doing high volumes of writes that simply overwrite values, you're
going to see memtables flush to disk when the commit log hits its space
limit and you recycle commit log segments.

I agree, it makes sense to not write these values to disk only to compact
them, if this is your pattern.
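
A rough illustration of the commit log point, as a toy Python model rather
than anything resembling Cassandra's real write path. The class, its fields,
and the byte accounting are all hypothetical, and the only flush trigger
modelled is the commit log space limit (real nodes also flush on memtable
size thresholds and other events):

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy model of one node's write path; not Cassandra's implementation."""
    commitlog_limit_bytes: int  # hypothetical stand-in for the commit log space limit
    memtable: dict = field(default_factory=dict)
    commitlog_bytes: int = 0
    sstables: list = field(default_factory=list)

    def write(self, key: str, value: str) -> None:
        # Every mutation is appended to the commit log, even a pure overwrite.
        self.commitlog_bytes += len(key) + len(value)
        # The memtable keeps only the latest value per key.
        self.memtable[key] = value
        # Hitting the commit log limit forces a flush so segments can be recycled.
        if self.commitlog_bytes >= self.commitlog_limit_bytes:
            self.flush()

    def flush(self) -> None:
        # Write the current memtable contents out as one new "SSTable".
        if self.memtable:
            self.sstables.append(dict(self.memtable))
        self.memtable = {}
        self.commitlog_bytes = 0


def simulate(num_writes: int, limit: int) -> ToyNode:
    """Overwrite a single cell num_writes times on a fresh node."""
    node = ToyNode(commitlog_limit_bytes=limit)
    for i in range(num_writes):
        node.write("lock", f"{i:08d}")  # 4 + 8 = 12 bytes logged per mutation
    return node


if __name__ == "__main__":
    node = simulate(num_writes=10_000, limit=30_000)
    print(len(node.sstables), "SSTables written")
    print([len(s) for s in node.sstables], "cells per SSTable")
```

In this model each flushed SSTable holds only the latest value of the cell,
but the number of flushes grows with the bytes appended to the commit log,
not with the size of the memtable.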

On Fri, May 26, 2017 at 2:15 PM Jan Algermissen <algermissen1971@icloud.com>
wrote:

> Jonathan,
>
> On 26 May 2017, at 17:00, Jonathan Haddad wrote:
>
> > If you have a small amount of hot data, enable the row cache. The
> > memtable is not designed to be a cache. You will not see a massive
> > performance impact from writing one to disk. SSTables will be in your
> > page cache, meaning you won't be hitting disk very often.
>
> What I (and AFAIU Max, too) am concerned with is very frequent updates
> on certain cells and their impact on the amount of SSTables created.
>
> Suppose I have a row that sees tens of thousands of mutations during the
> first minutes of its lifetime but isn't changed afterwards. The
> hope/assumption is that tuning C* can help having all those mutations
> take place in the memtable so we end up with only a single SSTable in
> the end (roughly speaking).
>
> Besides such an exceptional case, I'd consider high-frequency mutations
> an anti-pattern due to SSTable bloat.
>
> Makes sense?
>
> Jan
>
>
>
>
> > On Fri, May 26, 2017 at 7:41 AM Max C <mc_cassandra@core43.com> wrote:
> >
> >> In my case, we're using Cassandra to store QA test data, so the pattern
> >> is that we may do a bunch of updates within a few minutes or hours, and
> >> then the data will essentially be read-only for the rest of its lifetime
> >> (years). My question is the same: do we need to worry about the
> >> performance impact of having N mutations written to the SSTable, or will
> >> these mutations generally be constrained to the memtable?
> >>
> >> - Max
> >>
> >>> Hi,
> >>>
> >>> I am using updates to a column with a TTL to represent a lock. The
> >>> owning process keeps updating the lock's TTL as long as it is running.
> >>> If the process crashes, the lock will time out and be deleted. Then
> >>> another process can take over.
> >>>
> >>> I have used this pattern very successfully over years with TTLs in
> >>> the order of tens of seconds.
> >>>
> >>> Now I have a use case in mind that would require much smaller TTLs,
> >>> e.g. one or two seconds, and I am worried about the increased number
> >>> of mutations and their possible effect on SSTables.
> >>>
> >>> However, I'd assume these frequent updates to a cell mostly happen in
> >>> the memtable, so they only occasionally end up in SSTables.
> >>>
> >>> Is that assumption correct, and if so, which config parameters should
> >>> I tweak to keep the memtable from being flushed for longer periods of
> >>> time?
> >>
>
>
>
