incubator-cassandra-user mailing list archives

From Benjamin Black...@b3k.us>
Subject Re: Node OOM Problems
Date Sun, 22 Aug 2010 20:37:08 GMT
Wayne,

Bulk loading this much data is a very different prospect from needing
to sustain that rate of updates indefinitely.  As was suggested
earlier, you likely need to tune things differently, including
disabling minor compactions during the bulk load, to make this work
efficiently.


b

On Sun, Aug 22, 2010 at 12:40 PM, Wayne <wav100@gmail.com> wrote:
> Has anyone loaded 2+ terabytes of real data in one stretch into a cluster
> without bulk loading and without any problems? How long did it take? What
> kind of nodes were used? How many writes/sec/node can be sustained for 24+
> hours?
>
>
>
> On Sun, Aug 22, 2010 at 8:22 PM, Peter Schuller
> <peter.schuller@infidyne.com> wrote:
>>
>> I only sifted through the recent history of this thread (for time reasons), but:
>>
>> > You have started a major compaction which is now competing with those
>> > near constant minor compactions for far too little I/O (3 SATA drives
>> > in RAID0, perhaps?).  Normally, this would result in a massive
>> > ballooning of your heap use as all sorts of activities (like memtable
>> > flushes) backed up, as well.
>>
>> AFAIK memtable flushing is unrelated to compaction in the sense that
>> they occur concurrently and don't block each other (except to the
>> extent that they truly do compete for e.g. disk or CPU resources).
>>
>> While small memtables do indeed mean more compaction activity in
>> total, the cost of any given compaction should not be severely
>> affected.
>>
>> As far as I can tell, the two primary effects of small memtable sizes are:
>>
>> * An increase in the total amount of compaction work done for a
>> given database size.
>> * An increase in the number of sstables that may accumulate while
>> larger compactions are running.
>> ** That in turn is particularly relevant because it can generate a lot
>> of seek-bound activity; consider for example range queries that end up
>> spanning 10 000 files on disk.
>>
>> If memtable flushes are not able to complete fast enough to cope with
>> write activity, even if that is the case only during concurrent
>> compaction (for whatever reason), that suggests to me that write
>> activity is too high. Increasing memtable sizes may help on average
>> due to decreased compaction work, but I don't see why it would
>> significantly affect the performance once compactions *do* in fact run.
>>
>> With respect to timeouts on writes: I make no claims as to whether it
>> is expected, because I have not yet investigated, but I definitely see
>> sporadic slowness when benchmarking high-throughput writes on a
>> cassandra trunk snapshot somewhere between 0.6 and 0.7. This occurs
>> even when writing to a machine where the commit log and data
>> directories are both on separate RAID volumes that are battery backed
>> and should have no trouble eating write bursts (and the data is such
>> that one is CPU bound rather than disk bound on average; so it only
>> needs to eat bursts).
>>
>> I've had to add retries to the benchmarking tool (or else up the
>> timeout) because the default was not enough.
>>
>> I have not investigated exactly why this happens but it's an
>> interesting effect that as far as I can tell should not be there.
>> Have other people done high-throughput writes (to the point of CPU
>> saturation) over extended periods of time while consistently seeing
>> low latencies (consistently meaning never exceeding hundreds of ms
>> over several days)?
>>
>>
>> --
>> / Peter Schuller
>
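A rough illustration of the quoted point that smaller memtables increase the total compaction work for a given database size. This is a toy model, not Cassandra's actual compaction code: it assumes every flush produces one sstable of exactly the memtable size, and that whenever `fan_in` sstables of the same size exist they are merged into one larger sstable. The function name, the parameters, and the merge threshold of 4 are all assumptions made for the sketch.

```python
def compaction_mb(total_mb, memtable_mb, fan_in=4):
    """Toy model: MB rewritten by compaction while loading total_mb of data.

    Assumes each flush writes one sstable of memtable_mb, and that
    fan_in equally sized sstables are merged into one (hypothetical
    simplification, not Cassandra's real algorithm).
    """
    tiers = [0]      # tiers[i] = sstables of size memtable_mb * fan_in**i
    rewritten = 0    # MB written by compactions (flushes not counted)
    for _ in range(total_mb // memtable_mb):
        tiers[0] += 1                      # one memtable flush
        i = 0
        while tiers[i] >= fan_in:          # merge, possibly cascading upward
            tiers[i] -= fan_in
            rewritten += memtable_mb * fan_in ** (i + 1)
            if i + 1 == len(tiers):
                tiers.append(0)
            tiers[i + 1] += 1
            i += 1
    return rewritten


if __name__ == "__main__":
    for size in (16, 64, 256):
        print(size, "MB memtables ->", compaction_mb(1024, size), "MB rewritten")
```

In this model each byte is rewritten once per size tier it passes through, so shrinking the memtable adds tiers and therefore total work, while the cost of any single merge at a given tier is unchanged.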
>
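The retry that Peter mentions adding to his benchmarking tool might look something like the sketch below. It is only an assumption about the general shape, not his actual code: `write_with_retry` and `do_write` are hypothetical names, Python's built-in `TimeoutError` stands in for whatever timeout exception the client library raises, and the backoff parameters are arbitrary.

```python
import random
import time


def write_with_retry(do_write, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call do_write(), retrying on timeout with jittered exponential backoff.

    do_write is a hypothetical zero-argument callable that performs one
    write and raises TimeoutError when the write times out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return do_write()
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up and surface the timeout to the caller
            # Back off: 1x, 2x, 4x ... base_delay, with +/-50% jitter
            delay = base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5)
            sleep(delay)
```

Retrying hides sporadic slow writes from the benchmark's error count, so a tool doing this would probably also want to record how many retries occurred in order to keep the latency spikes visible.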
