cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aaron morton <aa...@thelastpickle.com>
Subject Re: GCInspector info messages in cassandra log
Date Wed, 15 Aug 2012 21:11:44 GMT
> Is there anything to do before that? like drain or flush?
For a clean shutdown I do

nodetool -h localhost disablethrift 
nodetool -h localhost disablegossip && sleep 10
nodetool -h localhost drain
then kill 

> Would you recommend that? If I do it, how often should I do a full snapshot, and how
often should I backup the backup directory?
Sounds like you could use Priam and be happier...
http://techblog.netflix.com/2012/02/announcing-priam.html

>  I just saw that there is an option global_snapshot, is it still supported?
I cannot find it. 

Try Piram or the instructions here, which are pretty much what you have described http://www.datastax.com/docs/1.0/operations/backup_restore

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/08/2012, at 4:57 PM, Tamar Fraenkel <tamar@tok-media.com> wrote:

> Aaron,
> Thank you very much. I will do as you suggested.
> 
> One last question regarding restart:
> I assume, I should do it node by node.
> Is there anything to do before that? like drain or flush?
> 
> I am also considering enabling incremental backups on my cluster. Currently I take a
daily full snapshot of the cluster, tar it and load it to S3 (size now is 3.1GB). Would you
recommend that? If I do it, how often should I do a full snapshot, and how often should I
backup the backup directory?
> 
> Another snapshot related question, currently I snapshot on each node and use parallel-slurp
to copy the snapshot to one node where I tar them. I just saw that there is an option global_snapshot,
is it still supported? Does that mean that if I run it on one node the snapshot will contain
data from all cluster? How does it work in restore? Is it better than my current backup system?
> 
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> 
> 
> tamar@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> 
> 
> 
> 
> On Tue, Aug 14, 2012 at 11:51 PM, aaron morton <aaron@thelastpickle.com> wrote:
>> According to cfstats there are the some CF with high Comacted row maximum sizes (1131752,
4866323 and 25109160). Others max sizes are < 1000000. Are these considered to be problematic,
what can I do to solve that?
> They are only 1, 4 and 25 MB. Not too big. 
> 
>> What should be the values of  in_memory_compaction_limit_in_mb  and concurrent_compactors
and how do I change them?
> Sounds like you dont have very big CF's, so changing the in_memory_compaction_limit_in_mb
may not make too much difference. 
> 
> Try changing concurrent_compactors to 2 in the yaml file. This change will let you know
if GC and compaction are related.
>   
>>  change yaml file and restart,
> yes
> 
>> What do I do about the long rows? What value is considered too big.
> They churn more memory during compaction. If you have a lot of rows +32 MB I would think
about it, does not look that way. 
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 15/08/2012, at 3:15 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:
> 
>> Hi!
>> It helps, but before I do more actions I want to give you some more info, and ask
some questions:
>> 
>> Related Info
>> According to my yaml file (where do I see these parameters in the jmx? I couldn't
find them):
>> in_memory_compaction_limit_in_mb: 64
>> concurrent_compactors: 1, but it is commented out, so I guess it is the default value
>> multithreaded_compaction: false
>> compaction_throughput_mb_per_sec: 16
>> compaction_preheat_key_cache: true
>> According to cfstats there are the some CF with high Comacted row maximum sizes (1131752,
4866323 and 25109160). Others max sizes are < 1000000. Are these considered to be problematic,
what can I do to solve that?
>> During compactions Cassandra is slower
>> Running Cassandra Version 1.0.8
>> Questions
>> What should be the values of  in_memory_compaction_limit_in_mb  and concurrent_compactors
and how do I change them? change yaml file and restart, or can it be done using jmx without
restarting Cassandra?
>> What do I do about the long rows? What value is considered too big.
>> 
>> I appreciate your help!
>> Thanks,
>> 
>> 
>> 
>> Tamar Fraenkel 
>> Senior Software Engineer, TOK Media 
>> 
>> <tokLogo.png>
>> 
>> tamar@tok-media.com
>> Tel:   +972 2 6409736 
>> Mob:  +972 54 8356490 
>> Fax:   +972 2 5612956 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Aug 14, 2012 at 1:22 PM, aaron morton <aaron@thelastpickle.com> wrote:
>> There are a couple of steps you can take if compaction is causing GC. 
>> 
>> - if you have a lot of wide rows consider reducing the in_memory_compaction_limit_in_mb
yaml setting. This will slow down compaction but will reduce the memory usage. 
>> 
>> - reduce concurrent_compactors 
>> 
>> Both of these may slow down compaction. Once you have GC under control you may want
to play with memory settings.
>> 
>> Hope that helps.  
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 14/08/2012, at 4:45 PM, Tamar Fraenkel <tamar@tok-media.com> wrote:
>> 
>>> Hi!
>>> I have 3 nodes ring running on Amazon EC2.
>>> About once a week I see in the logs compaction messages and around the same time
info messages about GC (see below) that I think means it is taking too long and happening
too often.
>>> 
>>> Does it mean I have to reduce my cache size?
>>> Thanks,
>>> Tamar
>>> 
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line 122)
GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line 122)
GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line 122)
GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line 122)
GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line 122)
GC for ParNew: 241 ms for 1 collections, 983144216 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line 122)
GC for ParNew: 215 ms for 1 collections, 967702720 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line 122)
GC for ParNew: 248 ms for 1 collections, 973803344 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line 122)
GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line 122)
GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line 122)
GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line 122)
GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is 1937768448
>>>  INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line 122)
GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is 1937768448
>>> 
>>> 
>>> Tamar Fraenkel 
>>> Senior Software Engineer, TOK Media 
>>> 
>>> <tokLogo.png>
>>> 
>>> 
>>> tamar@tok-media.com
>>> Tel:   +972 2 6409736 
>>> Mob:  +972 54 8356490 
>>> Fax:   +972 2 5612956 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 


Mime
View raw message