incubator-cassandra-user mailing list archives

From Michał Michalski <mich...@opera.com>
Subject Re: Cassandra Heap Size for data more than 1 TB
Date Thu, 03 Oct 2013 13:22:42 GMT
Currently we have 480-520 GB of data per node, so it's not even close to 
1 TB, but I'd bet that reaching 700-800 GB shouldn't be a problem in 
terms of "everyday performance" - heap usage is quite low, no GC issues 
etc. (For comparison: on 1.1, with ~300-400 GB per node, we had a huge 
problem with bloom filters and heap space, so we had to bump the heap to 
12-16 GB; on 1.2 it's not an issue anymore.)

However, our main concern is the time we would need to rebuild a broken 
node, so we are going to extend the cluster soon to avoid such problems 
and keep our nodes about 50% smaller.
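
For anyone wanting to check where their nodes stand, the per-node data 
size shows up in the Load column of nodetool's ring output:

nodetool -h localhost ring   # 'Load' = on-disk data per node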

M.


On 03.10.2013 15:02, srmore wrote:
> Thanks Mohit and Michael,
> That's what I thought. I have tried all the avenues and will give
> ParNew a try. With 1.0.xx I have issues when data sizes go up;
> hopefully that will not be the case with 1.2.
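>
> For reference, ParNew alongside CMS is what the stock cassandra-env.sh
> already enables; the flags in question are:
>
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC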
>
> Just curious, has anyone tried 1.2 with a large data set, around 1 TB?
>
>
> Thanks !
>
>
> On Thu, Oct 3, 2013 at 7:20 AM, Michał Michalski <michalm@opera.com> wrote:
>
>> I was experimenting with 128 vs. 512 some time ago and I was unable to see
>> any difference in terms of performance. I'd probably check 1024 too, but we
>> migrated to 1.2 and heap space was not an issue anymore.
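>>
>> For reference, this is the knob in cassandra.yaml that we were
>> changing (the value shown is just an example):
>>
>> index_interval: 512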
>>
>> M.
>>
>> On 02.10.2013 16:32, srmore wrote:
>>
>>> I changed my index_interval from 128 to 512 - does it make sense to
>>> increase it beyond this?
>>>
>>>
>>> On Wed, Oct 2, 2013 at 9:30 AM, cem <cayiroglu@gmail.com> wrote:
>>>
>>>> Have a look at index_interval.
>>>>
>>>> Cem.
>>>>
>>>>
>>>> On Wed, Oct 2, 2013 at 2:25 PM, srmore <comomore@gmail.com> wrote:
>>>>
>>>>> The version of Cassandra I am using is 1.0.11; we are migrating to
>>>>> 1.2.x though. We had tuned bloom filters (0.1) and AFAIK making it
>>>>> lower than this won't matter.
>>>>>
>>>>> Thanks !
>>>>>
>>>>>
>>>>> On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia
>>>>> <mohitanchlia@gmail.com> wrote:
>>>>>
>>>>>> Which Cassandra version are you on? Essentially heap size is a
>>>>>> function of the number of keys/metadata. In Cassandra 1.2 a lot of
>>>>>> the metadata, like bloom filters, was moved off heap.
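>>>>>>
>>>>>> If bloom filters are still too big for the heap, the false
>>>>>> positive chance can also be raised per table; a sketch in CQL3,
>>>>>> available once you are on 1.2 (keyspace/table names here are just
>>>>>> placeholders):
>>>>>>
>>>>>> ALTER TABLE myks.mytable WITH bloom_filter_fp_chance = 0.1;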
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 1, 2013 at 9:34 PM, srmore <comomore@gmail.com> wrote:
>>>>>>
>>>>>>> Does anyone know what would roughly be the heap size for Cassandra
>>>>>>> with 1 TB of data? We started with about 200 G and now on one of
>>>>>>> the nodes we are already at 1 TB. We were using 8 G of heap and
>>>>>>> that served us well up until we reached 700 G, where we started
>>>>>>> seeing failures and nodes flipping.
>>>>>>>
>>>>>>> With 1 TB of data the node refuses to come back due to lack of
>>>>>>> memory. Needless to say, repairs and compactions take a lot of
>>>>>>> time. We upped the heap from 8 G to 12 G and suddenly everything
>>>>>>> started moving rapidly, i.e. the repair tasks and the compaction
>>>>>>> tasks. But soon (in about 9-10 hrs) we started seeing the same
>>>>>>> symptoms as we were seeing with 8 G.
>>>>>>>
>>>>>>> So my question is: how do I determine the optimal heap size for
>>>>>>> around 1 TB of data?
>>>>>>>
>>>>>>> Following are some of my JVM settings:
>>>>>>>
>>>>>>> -Xms8G
>>>>>>> -Xmx8G
>>>>>>> -Xmn800m
>>>>>>> -XX:NewSize=1200M
>>>>>>> -XX:MaxTenuringThreshold=2
>>>>>>> -XX:SurvivorRatio=4
>>>>>>>
>>>>>>> Thanks !
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

