cassandra-user mailing list archives

From Vasileios Vlachos <vasileiosvlac...@gmail.com>
Subject Re: Cassandra running out of memory?
Date Sun, 15 Apr 2012 12:36:13 GMT
Thank you, Aaron. 8GB of memory is about the spec we use now for testing.

I observed a couple of other things when I checked the output.log file,
but I think those should go in another post.

Thank you very much for your advice.

Bill


On 13/04/12 02:49, aaron morton wrote:
> It depends on a lot of things: schema size, caches, work load etc.
>
> If you are just starting out I would recommend using a machine with
> 8GB or 16GB total RAM. By default Cassandra will take about 4GB or 8GB
> (respectively) for the JVM.
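>
> (If you want to double check what that auto-sizing picked on a given
> box, a rough sketch, assuming the usual Ubuntu package path:
>
>     ps aux | grep '[C]assandraDaemon' | grep -o '\-Xm[sx][^ ]*'
>     grep -n MAX_HEAP_SIZE /etc/cassandra/cassandra-env.sh
>
> The first command shows the -Xms/-Xmx the running JVM actually got;
> the second shows where the default calculation lives.)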
>
> Once you have a feel for how things work you should be able to 
> estimate the resources your application will need.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 13/04/2012, at 2:19 AM, Vasileios Vlachos wrote:
>
>> Hello Aaron,
>>
>> Thank you for getting back to me.
>>
>> I will change to m1.large first to see how long it takes the
>> Cassandra node to die (if at all). If I am still not happy I will try
>> more memory. I just want to test it step by step and see what the
>> differences are. I will also change the cassandra-env file back to
>> the defaults.
>>
>> Is there an absolute minimum requirement for Cassandra in terms of 
>> memory? I might be wrong, but from my understanding we shouldn't have 
>> any problems given the amount of data we store per day (currently 
>> approximately 2-2.5G / day).
>>
>> Thank you in advance,
>>
>> Bill
>>
>>
>> On Wed, Apr 11, 2012 at 7:33 PM, aaron morton
>> <aaron@thelastpickle.com> wrote:
>>
>>>     'system_memory_in_mb' (3760) and the 'system_cpu_cores' (1)
>>>     according to our nodes' specification. We also changed the
>>>     'MAX_HEAP_SIZE' to 2G and the 'HEAP_NEWSIZE' to 200M (we think
>>>     the second is related to the Garbage Collection).
>>     It's best to leave the default settings unless you know what you
>>     are doing here.
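>>
>>     For example, something along these lines (an untested sketch,
>>     assuming the usual Ubuntu package path and init script) puts the
>>     heap settings back on auto-calculation and restarts the node:
>>
>>         sudo sed -i -e 's/^MAX_HEAP_SIZE=/#MAX_HEAP_SIZE=/' \
>>                     -e 's/^HEAP_NEWSIZE=/#HEAP_NEWSIZE=/' \
>>                     /etc/cassandra/cassandra-env.sh
>>         sudo service cassandra restart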
>>
>>>     In case you find this useful, swap is off and unevictable memory
>>>     seems to be very high on all 3 servers (2.3GB; on other Linux
>>>     servers we usually observe around 0-16KB of unevictable memory)
>>     Cassandra locks the java memory so it cannot be swapped out.
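>>
>>     That locking is why the unevictable figure looks so large. A quick
>>     way to confirm it (a rough sketch; the log path is an assumption):
>>
>>         grep Unevictable /proc/meminfo
>>         grep -i mlockall /var/log/cassandra/*.log
>>
>>     The second grep should show whether the JNA mlockall call
>>     succeeded at startup.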
>>
>>>     The problem is that the node we hit from our Thrift interface
>>>     dies regularly (approximately after we store 2-2.5G of data).
>>>     The error message is 'OutOfMemoryError: Java heap space', and
>>>     according to the log it had in fact used all of the allocated
>>>     memory.
>>     The easiest solution will be to use a larger EC2 instance.
>>
>>     People normally use an m1.xlarge with 16GB of RAM (you could also
>>     try an m1.large).
>>
>>     If you are still experimenting I would suggest using the larger
>>     instances so you can make some progress. Once you have a feel for
>>     how things work you can then try to match the instances to your
>>     budget.
>>
>>     Hope that helps.
>>
>>     -----------------
>>     Aaron Morton
>>     Freelance Developer
>>     @aaronmorton
>>     http://www.thelastpickle.com
>>
>>     On 11/04/2012, at 1:54 AM, Vasileios Vlachos wrote:
>>
>>>     Hello,
>>>
>>>     We have been experimenting a bit with Cassandra lately (version
>>>     1.0.7) and we seem to have some problems with memory. We use EC2
>>>     as our test environment and we have three nodes with 3.7GB of
>>>     memory and 1 core @ 2.4GHz, all running Ubuntu Server 11.10.
>>>
>>>     The problem is that the node we hit from our Thrift interface
>>>     dies regularly (approximately after we store 2-2.5G of data).
>>>     The error message is 'OutOfMemoryError: Java heap space', and
>>>     according to the log it had in fact used all of the allocated
>>>     memory.
>>>
>>>     The nodes are under relatively constant load and store about
>>>     2000-4000 row keys a minute, which are batched through the Thrift
>>>     interface in 10-30 row keys at a time (with about 50 columns
>>>     each). The number of reads is very low, at around 1000-2000 a
>>>     day, each requesting the data of a single row key. There is
>>>     currently only one column family in use.
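>>>
>>>     (In case it is useful, a rough sketch of watching how much of that
>>>     write load is still sitting in memory; the grep context size is
>>>     arbitrary:
>>>
>>>         nodetool -h localhost cfstats | grep -A 20 'Column Family:'
>>>
>>>     The memtable data size reported there grows until the memtable is
>>>     flushed to disk.)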
>>>
>>>     The initial thought was that something was wrong in the
>>>     cassandra-env.sh file. So, we specified the variables
>>>     'system_memory_in_mb' (3760) and the 'system_cpu_cores' (1)
>>>     according to our nodes' specification. We also changed the
>>>     'MAX_HEAP_SIZE' to 2G and the 'HEAP_NEWSIZE' to 200M (we think
>>>     the second is related to garbage collection). Unfortunately,
>>>     that did not solve the issue and the node we hit via Thrift
>>>     keeps on dying regularly.
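>>>
>>>     (For completeness, the relevant lines in cassandra-env.sh now look
>>>     roughly like this; the exact quoting is from memory:
>>>
>>>         MAX_HEAP_SIZE="2G"
>>>         HEAP_NEWSIZE="200M"
>>>
>>>     with system_memory_in_mb=3760 and system_cpu_cores=1 set above
>>>     them.)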
>>>
>>>     In case you find this useful, swap is off and unevictable memory
>>>     seems to be very high on all 3 servers (2.3GB; on other Linux
>>>     servers we usually observe around 0-16KB of unevictable memory).
>>>     We are not quite sure how the unevictable memory ties into
>>>     Cassandra, it's just something we observed while looking into
>>>     the problem. The CPU is pretty much idle the entire time. The
>>>     heap memory is clearly being reduced once in a while according
>>>     to nodetool, but obviously grows over the limit as time goes by.
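>>>
>>>     (A rough sketch of how that heap figure can be watched from the
>>>     shell, if anyone wants to reproduce the observation:
>>>
>>>         while true; do nodetool -h localhost info | grep Heap; sleep 60; done
>>>
>>>     nodetool info reports the heap as "used / total" in MB.)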
>>>
>>>     Any ideas? Thanks in advance.
>>>
>>>     Bill
>>
>>
>


-- 

Kind regards,

Vasileios Vlachos

