incubator-cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Cassandra Heap Size for data more than 1 TB
Date Fri, 04 Oct 2013 09:15:30 GMT
Here.

We have 1.5 TB per node running smoothly, with index_interval: 1024, an 8 GB
JVM heap, and default bloom filters.
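For reference, that setting lives in cassandra.yaml; a minimal sketch (1024 is
what we run, 128 is the default):

```yaml
# cassandra.yaml -- a larger index_interval means a smaller on-heap
# index summary, at the cost of slightly slower index lookups.
# Default: 128
index_interval: 1024
```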
The only problem we have is that our SSDs are 2 TB, so they are almost full
and C* starts crashing. It looks like Cassandra considers there is no more
space available even though 500 GB is still free (you're not supposed to use
more than 50% of the disk space).
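That 50% rule of thumb comes from size-tiered compaction: a major compaction
may rewrite all the SSTables it touches at once, so in the worst case you need
free space roughly equal to the live data. A toy sketch of the check (the
function and the numbers are illustrative, not something from Cassandra
itself):

```python
def compaction_headroom_ok(disk_gb: float, live_data_gb: float) -> bool:
    """Size-tiered compaction can temporarily need a full extra copy of
    the data being compacted, so require free space >= live data."""
    free_gb = disk_gb - live_data_gb
    return free_gb >= live_data_gb

# Our situation: 2 TB (2000 GB) SSDs holding 1.5 TB (1500 GB) of data.
print(compaction_headroom_ok(2000, 1500))  # False: only 500 GB free
print(compaction_headroom_ok(2000, 900))   # True: 1100 GB free
```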

All operations are slower of course with these loads (Bootstrap, Repair,
cleanup, ...).

Yet I read on the DataStax website that the recommended max size is around
300 - 500 GB per node for C* < 1.2.x and 3 to 5 TB afterwards (under certain
conditions, mainly by taking advantage of off-heap bloom filters / caches,
etc.). Vnodes should also help reduce the time needed for some operations.
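Enabling vnodes is a one-line change on a fresh 1.2 node (a sketch; 256 is the
commonly suggested value, and it should not be combined with initial_token):

```yaml
# cassandra.yaml -- with vnodes each node owns many small token ranges,
# which lets rebuilds and bootstraps stream from many peers in parallel.
num_tokens: 256
```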

Hope that helps somehow





2013/10/3 Michał Michalski <michalm@opera.com>

> Currently we have 480-520 GB of data per node, so it's not even close to
> 1TB, but I'd bet that reaching 700-800GB shouldn't be a problem in terms of
> "everyday performance" - heap space is quite low, no GC issues etc. (to
> give you a comparison: when working on 1.1 and having ~300-400GB per node
> we had a huge problem with bloom filters and heap space, so we had to bump
> it to 12-16 GB; on 1.2 it's not an issue anymore).
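For completeness: on 1.2 the bloom filter false-positive chance is set per
table, so it can be relaxed without touching heap settings; a hedged CQL
sketch (keyspace and table names are hypothetical):

```sql
-- Raise the allowed false-positive rate to shrink the bloom filters;
-- 0.1 is the value mentioned above (the size-tiered default is ~0.01).
ALTER TABLE my_keyspace.my_cf WITH bloom_filter_fp_chance = 0.1;
```

The change takes effect as SSTables are rewritten (by compaction or
`nodetool upgradesstables`).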
>
> However, our main concern is the time that we'll need to rebuild broken
> node, so we are going to extend the cluster soon to avoid such problems and
> keep our nodes about 50% smaller.
>
> M.
>
>
> On 03.10.2013 15:02, srmore wrote:
>
>> Thanks Mohit and Michael,
>> That's what I thought. I have tried all the avenues, will give ParNew a
>> try. With the 1.0.xx I have issues when data sizes go up, hopefully that
>> will not be the case with 1.2.
>>
>> Just curious, has anyone tried 1.2 with large data set, around 1 TB ?
>>
>>
>> Thanks !
>>
>>
>> On Thu, Oct 3, 2013 at 7:20 AM, Michał Michalski <michalm@opera.com>
>> wrote:
>>
>>> I was experimenting with 128 vs. 512 some time ago and I was unable to see
>>> any difference in terms of performance. I'd probably check 1024 too, but we
>>> migrated to 1.2 and heap space was not an issue anymore.
>>>
>>> M.
>>>
>>> On 02.10.2013 16:32, srmore wrote:
>>>
>>>> I changed my index_interval from 128 to 512; does it make sense to
>>>> increase it more than this?
>>>>
>>>>
>>>> On Wed, Oct 2, 2013 at 9:30 AM, cem <cayiroglu@gmail.com> wrote:
>>>>
>>>>> Have a look at index_interval.
>>>>
>>>>>
>>>>> Cem.
>>>>>
>>>>>
>>>>> On Wed, Oct 2, 2013 at 2:25 PM, srmore <comomore@gmail.com> wrote:
>>>>>
>>>>>> The version of Cassandra I am using is 1.0.11, though we are migrating
>>>>>> to 1.2.X. We had tuned bloom filters (0.1) and AFAIK making it lower
>>>>>> than this won't matter.
>>>>>>
>>>>>> Thanks !
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia <mohitanchlia@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Which Cassandra version are you on? Essentially heap size is a
>>>>>>> function of the number of keys/metadata. In Cassandra 1.2 a lot of the
>>>>>>> metadata, like bloom filters, was moved off heap.
>>>>>>>
>>>>>>> On Tue, Oct 1, 2013 at 9:34 PM, srmore <comomore@gmail.com> wrote:
>>>>>>>
>>>>>>>> Does anyone know what would roughly be the heap size for cassandra
>>>>>>>> with 1 TB of data? We started with about 200 G, and now on one of the
>>>>>>>> nodes we are already at 1 TB. We were using 8 G of heap and that
>>>>>>>> served us well up until we reached 700 G, where we started seeing
>>>>>>>> failures and nodes flipping.
>>>>>>>>
>>>>>>>> With 1 TB of data the node refuses to come back due to lack of
>>>>>>>> memory. Needless to say, repairs and compactions take a lot of time.
>>>>>>>> We upped the heap from 8 G to 12 G and suddenly everything started
>>>>>>>> moving rapidly, i.e. the repair tasks and the compaction tasks. But
>>>>>>>> soon (in about 9-10 hrs) we started seeing the same symptoms as we
>>>>>>>> were seeing with 8 G.
>>>>>>>>
>>>>>>>> So my question is: how do I determine the optimal size of heap for
>>>>>>>> data around 1 TB?
>>>>>>>>
>>>>>>>> Following are some of my JVM settings:
>>>>>>>>
>>>>>>>> -Xms8G
>>>>>>>> -Xmx8G
>>>>>>>> -Xmn800m
>>>>>>>> -XX:NewSize=1200M
>>>>>>>> -XX:MaxTenuringThreshold=2
>>>>>>>> -XX:SurvivorRatio=4
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
