cassandra-user mailing list archives

From Curt Bererton <c...@zipzapplay.com>
Subject Re: Problems running Cassandra 0.6.1 on large EC2 instances.
Date Fri, 21 May 2010 20:31:23 GMT
We can now get Cassandra to run well for a few hours. Writes to and reads
from Cassandra both work, and the read/write times are good. We also
changed our config to enable row caching (we're hoping to ditch our
memcache server layer entirely).

Unfortunately, running on an EC2 High-Memory Extra Large instance with the
commit log in batch mode led to huge iowait on the CPU at only 20% of our
traffic. We don't have the commit log on a separate disk yet, but the iowait
still seemed much higher than it should have been. On Jonathan's
recommendation we changed to periodic mode in storage-conf.xml. This fixed
the iowait problem, but the machines went down hard after a few million
writes. I don't have any JMX or JVM-level debugging set up (other than
command-line tools), so I don't have much insight yet into why it choked.

The main symptoms are free memory dropping to zero and CPU usage jumping to
100% very suddenly. CPU typically hit 100% at roughly the same time on all
machines.

We have two hypotheses:

   - our PHP client is leaking connections somehow
   - the GC kicks in and has so much memory to clean up (the heap is at
   12 GB) that it takes forever, and while the GC is running and eating CPU
   something else goes wrong.
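
A rough sketch of how both hypotheses could be checked with command-line
tools only. It makes several assumptions that are not confirmed anywhere in
this thread: the nodes run Linux and expose /proc/net/tcp, clients connect
on Cassandra's default Thrift port 9160, and the JVM is started with plain
-verbose:gc so that pause lines look like "[Full GC ...K->...K(...K), N secs]".
The function names are made up for illustration.

```python
import re

THRIFT_PORT = 9160       # assumed: default Thrift port, change if ThriftPort differs
TCP_ESTABLISHED = "01"   # state code for ESTABLISHED in /proc/net/tcp


def count_established(proc_net_tcp, port=THRIFT_PORT):
    """Count ESTABLISHED connections whose local port is `port`.

    `proc_net_tcp` is the text of /proc/net/tcp. A count that keeps
    growing under steady traffic would point at a connection leak.
    """
    count = 0
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # local_address is HEXIP:HEXPORT
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == TCP_ESTABLISHED:
            count += 1
    return count


# Assumed log format: classic HotSpot -verbose:gc output, for example
# "[Full GC 267628K->83769K(776768K), 1.8479984 secs]"
GC_LINE = re.compile(r"\[(Full GC|GC) .*?([0-9.]+) secs\]")


def gc_pause_totals(gc_log):
    """Sum pause seconds per collection kind found in a GC log."""
    totals = {"GC": 0.0, "Full GC": 0.0}
    for kind, secs in GC_LINE.findall(gc_log):
        totals[kind] += float(secs)
    return totals
```

On a node this could be run as
count_established(open("/proc/net/tcp").read()) once a minute; if the JVM
uses IPv6 sockets the entries may be in /proc/net/tcp6 instead, which has
the same column layout. Long "Full GC" totals just before the CPU spike
would support the second hypothesis.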


I'm hooking up jcollectd to Cassandra to see if we can find out more.

If anyone has any other suggestions, please let me know.

C

--
Curt, ZipZapPlay Inc., www.PlayCrafter.com,
http://apps.facebook.com/bakinglife http://apps.facebook.com/happyhabitat


On Fri, May 21, 2010 at 12:53 PM, S Ahmed <sahmed1020@gmail.com> wrote:

> Curious, how did things turn out?
>
>
> On Tue, May 18, 2010 at 1:38 PM, Curt Bererton <curt@zipzapplay.com> wrote:
>
>> We only have a few CFs (6 or 7).  I've increased the MemtableThroughputInMB
>> and MemtableOperationsInMillions as per your suggestions. Do we really
>> need a swap file though? I suppose it can't hurt, but with my problem in
>> particular we weren't maxing out main memory.
>>
>> We'll be running another test today to see if the settings changes
>> proposed so far fix our problem (I hope so).
>>
>> Best,
>> Curt
>>
>>
>> On Tue, May 18, 2010 at 5:59 AM, Lee Parker <lee@socialagency.com> wrote:
>>
>>> How many different CFs do you have?  If you only have a few, I would
>>> highly recommend increasing the MemtableThroughputInMB and
>>> MemtableOperationsInMillions.  We only have two CFs and I have them set
>>> at 256MB and 2.5m. Since most of our columns are relatively small, these
>>> values are practically equivalent to each other.  I would also recommend
>>> dropping your heap space to 6G and adding a swap file.  In our case, the
>>> large EC2 instances didn't have any swap set up by default.
>>>
>>> Lee Parker
>>>
>>>
>>>
>>
>
