cassandra-user mailing list archives

From Peter Schuller
Subject Re: Read Latency Degradation
Date Sun, 19 Dec 2010 13:24:17 GMT
> Rereading through everything again I am starting to wonder if the page cache
> is being affected by compaction. We have been heavily loading data for weeks
> and compaction is basically running non-stop. The manual compaction should
> be done some time tomorrow, so when totally caught up I will try again.

If your 65 ms measurements were taken as an average while
compaction/repair was running, that is a very likely candidate for
the root cause, especially if your compaction is disk bound or close
to it (rather than CPU bound).

What concerned me was that it sounded like you were seeing the
65 ms read latencies with *no* other activity going on. But was
compaction/repair still running at that point?

And yes - definitely make sure to time it again when there's no active
compaction/repair going on.

> What
> changes can be hoped for in 1470 or 1882 in terms of isolating compactions
> (or writes) affects on read requests?

Speaking only for myself and my own expectations here (not making any
official statements on behalf of Cassandra):

Assuming large data sets, with disk I/O and cache effectiveness being
the primary concerns, the negative impact of background bulk I/O
falls into two categories:

(1) Direct impact on latency resulting from the I/O being done at any
given moment.
(2) Indirect impact resulting from eviction of hot data from page cache.

1470 is part of reducing (2). It sounds like 1470 itself will be
closed with fadvise working, but there is more to be done to achieve a
final goal of mitigating (2). Various options are discussed in 1470
itself; I guess the latest is the fadvise+mincore plan provided that
it pans out. It is worth noting though that barring a user-level page
cache, the effect of (2) will likely never be completely eliminated.
Even given fadvise+mincore, there are other concerns such as blowing
away recency information and defeating the LRU (or similar) behavior
of the OS page cache.
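
To make the fadvise part of that concrete, here is a rough sketch in
Python (not Cassandra code, and not what the 1470 patch does). It reads
a file sequentially and hints to the kernel, via posix_fadvise with
POSIX_FADV_DONTNEED, that each chunk can be dropped from the page
cache once read. The function name and chunk size are made up for
illustration. Note that it drops the pages unconditionally, including
any that were hot before the bulk read started; that is the gap the
mincore half of the plan is meant to address.

```python
import os


def read_bulk_without_caching(path, chunk_size=1 << 20):
    """Read a file sequentially, hinting that read pages can be evicted.

    Hypothetical helper for illustration only. Returns the number of
    bytes read. The fadvise call is purely advisory: the kernel may
    ignore it, and platforms without posix_fadvise skip it entirely.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            total += len(data)
            if hasattr(os, "posix_fadvise"):
                try:
                    # Hint: we do not expect to need this byte range again.
                    os.posix_fadvise(fd, offset, len(data),
                                     os.POSIX_FADV_DONTNEED)
                except OSError:
                    # Advisory only; some filesystems may reject the hint.
                    pass
            offset += len(data)
    finally:
        os.close(fd)
    return total
```

Even if the hint is honored, this only limits how much of the cache
the bulk read occupies; it does nothing about the recency information
mentioned above.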

1882 is about controlling (1) and it is considerably easier to get
something "good enough" working for 1882 than 1470. Although certainly
the general problem of I/O scheduling is a difficult one, given the
specific use-case in Cassandra and the low hanging fruit to be picked,
I expect 1882, even in its simplest form, to help significantly with (1)
(but this only matters if (1) is your problem; if you are already CPU
bound to the point that I/O is effectively rate limited anyway, 1882
will make no difference at all).
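
As an illustration of what a "simplest form" could look like, here is
a minimal throughput limiter sketched in Python. This is my own
assumption about the general shape of such a throttle, not the design
in 1882; the class name and parameters are hypothetical. A background
task would call acquire() after each chunk of bulk I/O and get put to
sleep whenever it is ahead of the target rate.

```python
import time


class ThroughputLimiter:
    """Caps average throughput of a bulk task (hypothetical sketch).

    The clock and sleep functions are injectable so the behavior can
    be checked without actually waiting.
    """

    def __init__(self, bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        if bytes_per_sec <= 0:
            raise ValueError("bytes_per_sec must be positive")
        self.bytes_per_sec = bytes_per_sec
        self._clock = clock
        self._sleep = sleep
        self._start = clock()
        self._total = 0

    def acquire(self, nbytes):
        """Account for nbytes of I/O; sleep if ahead of the target rate.

        Returns the number of seconds slept (0.0 if none).
        """
        self._total += nbytes
        # Earliest time at which this much I/O fits within the target rate.
        earliest = self._start + self._total / self.bytes_per_sec
        delay = earliest - self._clock()
        if delay > 0:
            self._sleep(delay)
            return delay
        # Behind schedule: no sleep. Note that this simple version lets
        # the task burst to catch up after a period of being idle.
        return 0.0
```

A fixed bytes-per-second cap like this ignores how busy the disk
actually is, which is part of why the general scheduling problem is
harder than this sketch suggests.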

/ Peter Schuller
