Rereading through everything again, I am starting to wonder whether the page cache is being affected by compaction. We have been loading data heavily for weeks, and compaction is running basically non-stop. The manual compaction should finish sometime tomorrow, so once it has totally caught up I will try again. What changes can be hoped for from 1470 or 1882 in terms of isolating the effects of compaction (or writes) on read requests?

Thanks



On Sat, Dec 18, 2010 at 2:36 PM, Peter Schuller <peter.schuller@infidyne.com> wrote:
> You are absolutely back to my main concern. Initially we were consistently
> seeing < 10ms read latency and now we see 25ms (30GB sstable file), 50ms
> (100GB sstable file) and 65ms (330GB table file) read times for a single
> read with nothing else going on in the cluster. Concurrency is not our
> problem/concern (at this point), our problem is slow reads in total
> isolation. Frankly the concern is that a 2TB node with a 1TB sstable (worst
> case scenario) will result in > 100ms read latency in total isolation.

So if you have a single non-concurrent client, alone, submitting these
reads that take 65 ms - are you disk bound (according to the last
column, %util, of iostat -x 1), and how many reads per second (r/s
column) are you seeing relative to client reads? Is the number of disk reads per
client read consistent with the actual number of sstables at the time?
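
As a rough way to do that comparison, here is a minimal Python sketch. It
assumes you read the r/s figure off iostat by hand and know the client read
rate (a single client at 65 ms per read is roughly 15 reads per second). The
function names, the one-disk-read-per-sstable assumption and the tolerance
are hypothetical, chosen only for illustration; a real read may touch both
the index and the data file of each sstable.

```python
def disk_reads_per_client_read(disk_rps, client_rps):
    """Ratio of disk reads (iostat r/s) to client reads per second."""
    if client_rps <= 0:
        raise ValueError("client_rps must be positive")
    return disk_rps / client_rps


def consistent_with_sstables(disk_rps, client_rps, sstable_count,
                             reads_per_sstable=1.0, tolerance=0.5):
    """Check whether the observed ratio is near the expected one.

    reads_per_sstable and tolerance are illustrative assumptions,
    not measured or documented values.
    """
    expected = sstable_count * reads_per_sstable
    actual = disk_reads_per_client_read(disk_rps, client_rps)
    return abs(actual - expected) <= tolerance * expected


if __name__ == "__main__":
    # Example figures only: 45 disk reads/s for 15 client reads/s, 3 sstables.
    print(disk_reads_per_client_read(45.0, 15.0))
    print(consistent_with_sstables(45.0, 15.0, 3))
```

If the ratio comes out far above what the sstable count would explain,
something other than the sstable spread is generating disk reads.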

The behavior you're describing really does seem indicative of a
problem, unless the bottleneck is legitimately reads from disk
from multiple sstables resulting from rows being spread over said
sstables.

--
/ Peter Schuller