cassandra-commits mailing list archives

From "Joshua McKenzie (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8833) Stop opening compaction results early
Date Thu, 19 Feb 2015 19:48:14 GMT


Joshua McKenzie commented on CASSANDRA-8833:

bq. Make your mind up
Touché.  :)  I have numbers on another ticket and Branimir's been seeing similar effects
on the read path on Windows, so this is more a symptom of me being lazy about bringing them
into this discussion (combined with a lack of rigor in producing those numbers thus far).

bq. the two aren't mutually exclusive
Absolutely true, but the headache associated with renaming or deleting files on Windows is
compounded by hard-links and memory-mapping. Delaying the mapping until finalization of
the sstable would be a simple solution; however, that adds more complexity on top of an
already complex situation.  Thus far we've avoided platform-specific code-paths as much as
possible, but that seems a simple enough solution that it would be worth looking into.

Regarding the flurry of recent fixes - as you're well aware, little in this code-base is as
simple as it may appear at first glance.  I suspect we're going to have other things we need
to tidy up with these recent commits and there may be unintended side-effects to some of those
changes.  A lot of hand-waving here, certainly, but past experience indicates that changes
that touch that many places in the code-base almost always have some surprises in store for us.

bq. I'm not certain what you're referring to here
Ah - so the 9% w/populate_io_cache_on_flush was the pathological case and the crazy cliff
drop-off was the normal use-case then?  A bit of clarification there goes a long way.

> Stop opening compaction results early
> -------------------------------------
>                 Key: CASSANDRA-8833
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>             Fix For: 2.1.4
> We should simplify the code base by not doing early opening of compaction results. It
> makes it very hard to reason about sstable life cycles since they can be in many different
> states, "opened early", "starts moved", "shadowed", "final", instead of as before, basically
> just one (tmp files are not really 'live' yet so I don't count those). The ref counting of
> shared resources between sstables in these different states is also hard to reason about.
> This has caused quite a few issues since we released 2.1.
> I think it all boils down to a performance vs code complexity issue: is opening compaction
> results early really 'worth it' wrt the performance gain? The results in CASSANDRA-6916 sure
> look like the benefits are big enough, but the difference should not be as big for people
> on SSDs (which most people who care about latencies are).
> WDYT [~benedict] [~jbellis] [~iamaleksey] [~JoshuaMcKenzie]?

This message was sent by Atlassian JIRA
