incubator-cassandra-user mailing list archives

From Ran Tavory <ran...@gmail.com>
Subject Re: StackOverflowError on high load
Date Thu, 18 Feb 2010 06:16:51 GMT
I found another interesting graph, attached.
I looked at the write-count and write-latency of the CF I'm writing to and I
see a few interesting things:
1. The host test2 crashed at 18:00.
2. At 16:00, after a few hours of load, the write-count dropped on both hosts.
test1 (which did not crash) started slowing down first and then test2
slowed.
3. At 16:00 I start seeing high write-latency on test2 only. This goes on for
about 2h until it finally crashes at 18:00.

Does this help?

On Thu, Feb 18, 2010 at 7:44 AM, Ran Tavory <rantav@gmail.com> wrote:

> I ran the process again and after a few hours the same node crashed the
> same way. Now I can tell for sure this is indeed what Jonathan proposed:
> the data directory needs to be 2x of what it is. But it looks like a design
> problem; how large do I need to tell my admin to set it, then?
>
> Here's what I see when the server crashes:
>
> $ df -h /outbrain/cassandra/data/
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/mapper/cassandra-data
>                        97G   46G   47G  50% /outbrain/cassandra/data
>
> The directory is 97G and when the host crashes it's at 50% use.
> I'm also monitoring various JMX counters and I see that COMPACTION-POOL
> PendingTasks grows for a while on this host (not on the other host, btw,
> which is fine, just this host) and then stays flat for 3 hours. After 3
> hours of being flat, the host crashes. I'm attaching the graph.
>
> When I restart cassandra on this host (no change to the file allocation
> size, just a restart) it does manage to compact the data files pretty fast,
> so after a minute I get 12% use. So I wonder: what made it crash before
> that doesn't now? (It could be the load, which isn't running now.)
> $ df -h /outbrain/cassandra/data/
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/mapper/cassandra-data
>                        97G   11G   82G  12% /outbrain/cassandra/data
>
> The question is what size does the data directory need to be? It's not 2x
> the size of the data I expect to have (I only have 11G of real data after
> compaction and the dir is 97G, so it should have been enough). If it's 2x of
> something dynamic that keeps growing and isn't bounded then it'll just
> grow infinitely, right? What's the bound?
> Alternatively, what JMX counter thresholds are the best indicators of a
> crash that's about to happen?
>
> Thanks
>
>
> On Wed, Feb 17, 2010 at 9:00 PM, Tatu Saloranta <tsaloranta@gmail.com> wrote:
>
>> On Wed, Feb 17, 2010 at 6:40 AM, Ran Tavory <rantav@gmail.com> wrote:
>> > If it's the data directory, then I have a pretty big one. Maybe it's
>> > something else
>> > $ df -h /outbrain/cassandra/data/
>> > Filesystem            Size  Used Avail Use% Mounted on
>> > /dev/mapper/cassandra-data
>> >                        97G   11G   82G  12% /outbrain/cassandra/data
>>
>> Perhaps a temporary file? JVM defaults to /tmp, which may be on a
>> smaller (root) partition?
>>
>> -+ Tatu +-
>>
>
>
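
For anyone who wants to watch for this condition, here is a minimal sketch of a disk headroom check based on the two df readings quoted in this thread. It assumes the "2x" rule of thumb means free space should stay comfortably above used space in the data directory. The names headroom_ratio, needs_attention and check_path, and the 1.5 threshold, are hypothetical choices for illustration and are not Cassandra settings or JMX attributes.

```python
import shutil

GIB = 1024 ** 3


def headroom_ratio(used_bytes, free_bytes):
    """Return free space divided by used space (inf when nothing is used)."""
    if used_bytes <= 0:
        return float("inf")
    return free_bytes / used_bytes


def needs_attention(used_bytes, free_bytes, min_ratio=1.5):
    """True when free space is less than min_ratio times the used space.

    min_ratio is a hypothetical alert threshold chosen for this sketch.
    """
    return headroom_ratio(used_bytes, free_bytes) < min_ratio


def check_path(path, min_ratio=1.5):
    """Apply the same check to a real mount point or directory."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return needs_attention(usage.used, usage.free, min_ratio)


if __name__ == "__main__":
    # Reading at crash time from the thread: 46G used, 47G available.
    print(needs_attention(46 * GIB, 47 * GIB))
    # Reading after restart and compaction: 11G used, 82G available.
    print(needs_attention(11 * GIB, 82 * GIB))
```

With these numbers the crash-time reading is flagged and the post-restart reading is not. On a live host, check_path("/outbrain/cassandra/data") would run the same comparison against the actual filesystem; whether 1.5 is a sensible threshold depends on the workload and is only a guess here.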
