cassandra-user mailing list archives

From "Kevin O'Connor" <ke...@reddit.com>
Subject Re: STCS Compaction with wide rows & TTL'd data
Date Fri, 02 Sep 2016 20:56:46 GMT
On Fri, Sep 2, 2016 at 9:33 AM, Mark Rose <markrose@markrose.ca> wrote:

> Hi Kevin,
>
> The tombstones will live in an sstable until it gets compacted. Do you
> have a lot of pending compactions? If so, increasing the number of
> parallel compactors may help.


Nope, we are pretty well managed on compactions. Only ever 1 or 2 running
at a time per node.


> You may also be able to tune the STCS
> parameters. Here's a good explanation of how it works:
> https://shrikantbang.wordpress.com/2014/04/22/size-tiered-compaction-strategy-in-apache-cassandra/


Yeah interesting - I'd like to try that. Is there a way to verify what the
settings are before changing them? DESCRIBE TABLE doesn't seem to show the
compaction subproperties.
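
For context on what those subproperties control, here is a minimal Python sketch of size-tiered bucketing. It is a simplified model, not Cassandra's actual code: the function names are hypothetical, and the defaults (bucket_low 0.5, bucket_high 1.5, min_sstable_size 50 MB, min_threshold 4, max_threshold 32) are the documented STCS defaults as I understand them, so they should be checked against the running version.

```python
def stcs_buckets(sizes, bucket_low=0.5, bucket_high=1.5,
                 min_sstable_size=50 * 1024**2):
    """Group sstable sizes (in bytes) into tiers of similar size."""
    buckets = []  # each entry is [average_size, [member sizes]]
    for size in sorted(sizes):
        for bucket in buckets:
            avg = bucket[0]
            # An sstable joins a bucket if it is close to the bucket average,
            # or if both it and the bucket are below min_sstable_size.
            similar = avg * bucket_low < size < avg * bucket_high
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                bucket[1].append(size)
                bucket[0] = sum(bucket[1]) / len(bucket[1])
                break
        else:
            # No existing bucket fits, so this sstable starts its own tier.
            buckets.append([float(size), [size]])
    return [members for _, members in buckets]


def compaction_candidates(sizes, min_threshold=4, max_threshold=32, **kwargs):
    """Return only the buckets with enough sstables to trigger a compaction."""
    return [b[:max_threshold]
            for b in stcs_buckets(sizes, **kwargs)
            if len(b) >= min_threshold]
```

Under this model, a single 40 GB sstable sitting next to a handful of 1 GB ones never lands in a bucket of four, which would match the guess that the large sstables are not being picked up.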


> Anyway, LCS would probably be a better fit for your use case. LCS
> would help with eliminating tombstones, but it may also result in
> dramatically higher CPU usage for compaction. If LCS compaction can
> keep up, in addition to getting rid of tombstones faster, LCS should
> reduce the number of sstables that must be read to return the row and
> have a positive impact on read latency. STCS is a bad fit for rows
> that are updated frequently (which includes rows with TTL'ed data).
>

Thanks - that may end up being where we go with this.

> Also, you may have an error in your application design. OAuth Access
> Tokens are designed to have a very short lifetime of seconds or
> minutes. On access token expiry, a Refresh Token should be used to get
> a new access token. A long-lived access token is a dangerous thing as
> there is no way to disable it (refresh tokens should be disabled to
> prevent the creation of new access tokens).
>

Yeah, noted. We only allow longer lived access tokens in some very specific
scenarios, so they are much less likely to be in that CF than the standard
3600s ones, but they're there.


>
> -Mark
>
> On Thu, Sep 1, 2016 at 3:53 AM, Kevin O'Connor <kevin@reddit.com> wrote:
> > We're running C* 1.2.11 and have two CFs, one called OAuth2AccessToken and
> > one OAuth2AccessTokensByUser. OAuth2AccessToken has the token as the row
> > key, and the columns are some data about the OAuth token. There's a TTL set
> > on it, usually 3600, but can be higher (up to 1 month).
> > OAuth2AccessTokensByUser has the user as the row key, and then all of the
> > user's token identifiers as column values. Each of the column values has a
> > TTL that is set to the same as the access token it corresponds to.
> >
> > The OAuth2AccessToken CF takes up around ~6 GB on disk, whereas the
> > OAuth2AccessTokensByUser CF takes around ~110 GB. If I use sstablemetadata,
> > I can see the droppable tombstones ratio is around 90% for the larger
> > sstables.
> >
> > My question is - why aren't these tombstones getting compacted away? I'm
> > guessing that it's because we use STCS and the large sstables that have
> > built up over time are never considered for compaction. Would LCS be a
> > better fit for the issue of trying to keep the tombstones in check?
> >
> > I've also tried forceUserDefinedCompaction via JMX on some of the largest
> > sstables and it just creates a new sstable of the exact same size, which was
> > pretty surprising. Why would this explicit request to compact an sstable not
> > remove tombstones?
> >
> > Thanks!
> >
> > Kevin
>
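
One possible reading of the forceUserDefinedCompaction result quoted above, sketched in Python. It assumes that a compaction can only drop tombstones for a row when that row key is absent from every sstable outside the compaction. That is a simplification: the real check uses bloom filters and timestamps and may differ in 1.2.11. The function name and the dict-of-sets model of sstables are hypothetical.

```python
def purgeable_rows(compacting, sstables):
    """Return row keys whose tombstones could be dropped in this simplified model.

    sstables   -- dict mapping sstable name to the set of row keys it holds
    compacting -- set of sstable names taking part in the compaction
    """
    # Keys that also live in sstables outside the compaction.
    outside = set()
    for name, keys in sstables.items():
        if name not in compacting:
            outside |= keys

    # Keys covered by the compaction itself.
    inside = set()
    for name in compacting:
        inside |= sstables[name]

    # Only rows fully contained in the compaction are safe to purge.
    return inside - outside
```

With rows keyed by user and spread across many sstables, compacting one large sstable on its own leaves nothing purgeable under this model, which would be consistent with getting back an sstable of the same size.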
